NASA Astrophysics Data System (ADS)
Meng, X. T.; Levin, D. S.; Chapman, J. W.; Zhou, B.
2016-09-01
The ATLAS Muon Spectrometer endcap thin-Resistive Plate Chamber trigger project complements the New Small Wheel endcap Phase-1 upgrade for higher luminosity LHC operation. These new trigger chambers, located in a high rate region of ATLAS, will improve overall trigger acceptance and reduce the fake muon trigger incidence. These chambers must generate a low level muon trigger to be delivered to a remote high level processor within a stringent latency requirement of 43 bunch crossings (1075 ns). To help meet this requirement the High Performance Time to Digital Converter (HPTDC), a multi-channel ASIC designed by the CERN Microelectronics group, has been proposed for the digitization of the fast front end detector signals. This paper investigates the HPTDC performance in the context of the overall muon trigger latency, employing detailed behavioral Verilog simulations in which the latency in triggerless mode is measured for a range of configurations and under realistic hit rate conditions. The simulation results show that various HPTDC operational configurations, including leading edge and pair measurement modes, can provide high efficiency (>98%) to capture and digitize hits within a time interval satisfying the Phase-1 latency tolerance.
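The latency figure quoted above can be checked with a few lines of arithmetic. The sketch below assumes a 25 ns bunch spacing, which is consistent with 43 bunch crossings corresponding to 1075 ns; the stage latencies in the usage lines are hypothetical placeholders, not values from the paper.

```python
# Assumption: 25 ns bunch spacing (43 crossings * 25 ns = 1075 ns, as quoted).
BUNCH_SPACING_NS = 25.0


def bx_to_ns(n_bx: int) -> float:
    """Convert a number of bunch crossings to nanoseconds."""
    return n_bx * BUNCH_SPACING_NS


def fits_budget(stage_latencies_ns, budget_bx: int = 43):
    """Return (fits, total_ns) for a list of per-stage latencies in ns."""
    total = sum(stage_latencies_ns)
    return total <= bx_to_ns(budget_bx), total


if __name__ == "__main__":
    print(bx_to_ns(43))  # 1075.0
    # Hypothetical stage latencies, for illustration only.
    print(fits_budget([300.0, 450.0, 200.0]))  # (True, 950.0)
```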
Graphical processors for HEP trigger systems
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Chiozzi, S.; Cotta Ramusino, A.; Di Lorenzo, S.; Fantechi, R.; Fiorini, M.; Frezza, O.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Piandani, R.; Pontisso, L.; Rossetti, D.; Simula, F.; Sozzi, M.; Vicini, P.
2017-02-01
General-purpose computing on GPUs is emerging as a new paradigm in several fields of science, although so far applications have been tailored to employ GPUs as accelerators in offline computations. With the steady decrease of GPU latencies and the increase in link and memory throughputs, time is ripe for real-time applications using GPUs in high-energy physics data acquisition and trigger systems. We will discuss the use of online parallel computing on GPUs for synchronous low level trigger systems, focusing on tests performed on the trigger of the CERN NA62 experiment. Latencies of all components need analysing, networking being the most critical. To keep it under control, we envisioned NaNet, an FPGA-based PCIe Network Interface Card (NIC) enabling GPUDirect connection. Moreover, we discuss how specific trigger algorithms can be parallelised and thus benefit from a GPU implementation, in terms of increased execution speed. Such improvements are particularly relevant for the foreseen LHC luminosity upgrade where highly selective algorithms will be crucial to maintain sustainable trigger rates with very high pileup.
Burst-mode optical label processor with ultralow power consumption.
Ibrahim, Salah; Nakahara, Tatsushi; Ishikawa, Hiroshi; Takahashi, Ryo
2016-04-04
A novel label processor subsystem for 100-Gbps (25-Gbps × 4λs) burst-mode optical packets is developed, in which a highly energy-efficient method is pursued for extracting and interfacing the ultrafast packet-label to a CMOS-based processor where label recognition takes place. The method involves performing serial-to-parallel conversion for the label bits on a bit-by-bit basis by using an optoelectronic converter that is operated with a set of optical triggers generated in a burst-mode manner upon packet arrival. Here we present three key achievements that enabled a significant reduction in the total power consumption and latency of the whole subsystem; 1) based on a novel operation mechanism for providing amplification with bit-level selectivity, an optical trigger pulse generator, that consumes power for a very short duration upon packet arrival, is proposed and experimentally demonstrated, 2) the energy of optical triggers needed by the optoelectronic serial-to-parallel converter is reduced by utilizing a negative-polarity signal while employing an enhanced conversion scheme entitled the discharge-or-hold scheme, 3) the necessary optical trigger energy is further cut down by half by coupling the triggers through the chip's backside, whereas a novel lens-free packaging method is developed to enable a low-cost alignment process that works with simple visual observation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
C. Cuevas, B. Raydo, H. Dong, A. Gupta, F.J. Barbosa, J. Wilson, W.M. Taylor, E. Jastrzembski, D. Abbott
We will demonstrate a hardware and firmware solution for a complete fully pipelined multi-crate trigger system that takes advantage of the elegant high speed VXS serial extensions for VME. This trigger system includes three sections starting with the front end crate trigger processor (CTP), a global Sub-System Processor (SSP) and a Trigger Supervisor that manages the timing, synchronization and front end event readout. Within a front end crate, trigger information is gathered from each 16 channel, 12 bit Flash ADC module at 4 ns intervals via the VXS backplane, to a Crate Trigger Processor (CTP). Each Crate Trigger Processor receives these 500 MB/s VXS links from the 16 FADC-250 modules, aligns skewed data inherent in the Aurora protocol, and performs real time crate level trigger algorithms. The algorithm results are encoded using a Reed-Solomon technique and this Level 1 trigger data is sent to the SSP using a multi-fiber link. The multi-fiber link achieves an aggregate trigger data transfer rate to the global trigger of 8 Gb/s. The SSP receives and decodes the Reed-Solomon error correcting transmission from each crate, aligns the data, and performs the global level trigger algorithms. The entire trigger system is synchronous and operates at 250 MHz with the Trigger Supervisor managing not only the front end event readout, but also the distribution of the critical timing clocks, synchronization signals, and the global trigger signals to each front end readout crate. These signals are distributed to the front end crates on a separate fiber link and each crate is synchronized using a unique encoding scheme to guarantee that each front end crate is synchronous with a fixed latency, independent of the distance between each crate. The overall trigger signal latency is <3 µs, and the proposed 12 GeV experiments at Jefferson Lab require up to 200 kHz Level 1 trigger rate.
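The clock and latency figures in this abstract are mutually consistent, as a short calculation shows. This is only an illustrative sketch: the function names are hypothetical, and 3000 ns is used as the upper bound implied by the quoted "<3 µs".

```python
CLOCK_MHZ = 250            # system clock quoted in the abstract
LATENCY_BOUND_NS = 3000    # upper bound implied by "< 3 microseconds"


def clock_period_ns(clock_mhz: int) -> float:
    """Period of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def cycles_in(duration_ns: int, clock_mhz: int) -> int:
    """Whole clock cycles that fit in duration_ns."""
    return duration_ns * clock_mhz // 1000


if __name__ == "__main__":
    print(clock_period_ns(CLOCK_MHZ))              # 4.0, matching the 4 ns sampling interval
    print(cycles_in(LATENCY_BOUND_NS, CLOCK_MHZ))  # 750
```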
Use of GPUs in Trigger Systems
NASA Astrophysics Data System (ADS)
Lamanna, Gianluca
In recent years the interest in using graphics processors (GPUs) for general purpose high performance computing has been constantly rising. In this paper we discuss the possible use of GPUs to construct a fast and effective real time trigger system, at both the software and hardware levels. In particular, we study the integration of such a system in the NA62 trigger. The first application of GPUs for ring pattern recognition in the RICH will be presented. The results obtained show that there are no showstoppers in trigger systems with relatively low latency. Thanks to the use of off-the-shelf technology, in continuous development for purposes related to the video game and image processing markets, the architecture described could easily be exported to other experiments, to build a versatile and fully customizable online selection.
Power Aware Distributed Systems
2004-01-01
detection or threshold functions to trigger the main CPU. The main processor can sleep and either wakeup on a schedule or by a positive threshold event...the RTOS must determine if wake-up latency can be tolerated (or, if it could be hidden by pre-wakeup). The prediction accuracy for scheduling...and processor shutdown/wakeup. This analysis can be used to accurately analyze the schedulability of non-concrete periodic task sets, scheduled using
The CMS Level-1 Calorimeter Trigger for LHC Run II
NASA Astrophysics Data System (ADS)
Sinthuprasith, Tutanon
2017-01-01
The phase-1 upgrades of the CMS Level-1 calorimeter trigger have been completed. The Level-1 trigger has been fully commissioned and it will be used by CMS to collect data starting from the 2016 data run. The new trigger has been designed to improve the performance at high luminosity and with a large number of simultaneous inelastic collisions per crossing (pile-up). For this purpose it uses a novel design, the Time Multiplexed Design, which enables the data from an event to be processed by a single trigger processor at full granularity over several bunch crossings. The TMT design is a modular design based on the uTCA standard. The architecture is flexible and the number of trigger processors can be expanded according to the physics needs of CMS. Intelligent, more complex, and innovative algorithms are now the core of the first decision layer of CMS: the upgraded trigger system implements pattern recognition and MVA (Boosted Decision Tree) regression techniques in the trigger processors for pT assignment, pile-up subtraction, and isolation requirements for electrons and taus. The performance of the TMT design, the latency measurements, and the algorithm performance measured using data are also presented here.
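The time-multiplexing idea can be illustrated with a toy routing function. The sketch below assumes a simple round-robin assignment of bunch crossings to processors; the abstract does not state the actual routing scheme or the number of processors, so both are hypothetical here.

```python
def assign_processor(bx: int, n_processors: int) -> int:
    """Round-robin routing (an assumption): all data from bunch crossing
    `bx` goes to one processor, chosen by bx modulo n_processors."""
    if n_processors <= 0:
        raise ValueError("n_processors must be positive")
    return bx % n_processors


def crossings_between_events(n_processors: int) -> int:
    """With round-robin routing each processor receives a new event only
    every n_processors crossings, which is the time it has per event."""
    return n_processors


if __name__ == "__main__":
    # Hypothetical system with 3 processors.
    print([assign_processor(bx, 3) for bx in range(6)])  # [0, 1, 2, 0, 1, 2]
```

Adding processors lengthens the interval each one has per event, which is one way to read the statement that the number of trigger processors can be expanded as needed.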
Low-Latency Embedded Vision Processor (LLEVS)
2016-03-01
26 3.2.3 Task 3 Projected Performance Analysis of FPGA-based Vision Processor ........... 31 3.2.3.1 Algorithms Latency Analysis ...Programmable Gate Array Custom Hardware for Real-Time Multiresolution Analysis. ............................................... 35...conduct data analysis for performance projections. The data acquired through measurements, simulation and estimation provide the requisite platform for
Graphics Processors in HEP Low-Level Trigger Systems
NASA Astrophysics Data System (ADS)
Ammendola, Roberto; Biagioni, Andrea; Chiozzi, Stefano; Cotta Ramusino, Angelo; Cretaro, Paolo; Di Lorenzo, Stefano; Fantechi, Riccardo; Fiorini, Massimiliano; Frezza, Ottorino; Lamanna, Gianluca; Lo Cicero, Francesca; Lonardo, Alessandro; Martinelli, Michele; Neri, Ilaria; Paolucci, Pier Stanislao; Pastorelli, Elena; Piandani, Roberto; Pontisso, Luca; Rossetti, Davide; Simula, Francesco; Sozzi, Marco; Vicini, Piero
2016-11-01
Usage of Graphics Processing Units (GPUs) in the so called general-purpose computing is emerging as an effective approach in several fields of science, although so far applications have been employing GPUs typically for offline computations. Taking into account the steady performance increase of GPU architectures in terms of computing power and I/O capacity, the real-time applications of these devices can thrive in high-energy physics data acquisition and trigger systems. We will examine the use of online parallel computing on GPUs for the synchronous low-level trigger, focusing on tests performed on the trigger system of the CERN NA62 experiment. To successfully integrate GPUs in such an online environment, latencies of all components need analysing, networking being the most critical. To keep it under control, we envisioned NaNet, an FPGA-based PCIe Network Interface Card (NIC) enabling GPUDirect connection. Furthermore, it is assessed how specific trigger algorithms can be parallelized and thus benefit from a GPU implementation, in terms of increased execution speed. Such improvements are particularly relevant for the foreseen Large Hadron Collider (LHC) luminosity upgrade where highly selective algorithms will be essential to maintain sustainable trigger rates with very high pileup.
Low latency messages on distributed memory multiprocessors
NASA Technical Reports Server (NTRS)
Rosing, Matthew; Saltz, Joel
1993-01-01
Many of the issues in developing an efficient interface for communication on distributed memory machines are described and a portable interface is proposed. Although the hardware component of message latency is less than one microsecond on many distributed memory machines, the software latency associated with sending and receiving typed messages is on the order of 50 microseconds. The reason for this imbalance is that the software interface does not match the hardware. By changing the interface to match the hardware more closely, applications with fine grained communication can be put on these machines. Based on several tests that were run on the iPSC/860, an interface that will better match current distributed memory machines is proposed. The model used in the proposed interface consists of a computation processor and a communication processor on each node. Communication between these processors and other nodes in the system is done through a buffered network. Information that is transmitted is either data or procedures to be executed on the remote processor. The dual processor system is better suited for efficiently handling asynchronous communications compared to a single processor system. The ability to send data or procedure is very flexible for minimizing message latency, based on the type of communication being performed. The test performed and the proposed interface are described.
CISN ShakeAlert Earthquake Early Warning System Monitoring Tools
NASA Astrophysics Data System (ADS)
Henson, I. H.; Allen, R. M.; Neuhauser, D. S.
2015-12-01
CISN ShakeAlert is a prototype earthquake early warning system being developed and tested by the California Integrated Seismic Network. The system has recently been expanded to support redundant data processing and communications. It now runs on six machines at three locations with ten Apache ActiveMQ message brokers linking together 18 waveform processors, 12 event association processes and 4 Decision Module alert processes. The system ingests waveform data from about 500 stations and generates many thousands of triggers per day, from which a small portion produce earthquake alerts. We have developed interactive web browser system-monitoring tools that display near real time state-of-health and performance information. This includes station availability, trigger statistics, communication and alert latencies. Connections to regional earthquake catalogs provide a rapid assessment of the Decision Module hypocenter accuracy. Historical performance can be evaluated, including statistics for hypocenter and origin time accuracy and alert time latencies for different time periods, magnitude ranges and geographic regions. For the ElarmS event associator, individual earthquake processing histories can be examined, including details of the transmission and processing latencies associated with individual P-wave triggers. Individual station trigger and latency statistics are available. Detailed information about the ElarmS trigger association process for both alerted events and rejected events is also available. The Google Web Toolkit and Map API have been used to develop interactive web pages that link tabular and geographic information. Statistical analysis is provided by the R-Statistics System linked to a PostgreSQL database.
Common Readout Unit (CRU) - A new readout architecture for the ALICE experiment
NASA Astrophysics Data System (ADS)
Mitra, J.; Khan, S. A.; Mukherjee, S.; Paul, R.
2016-03-01
The ALICE experiment at the CERN Large Hadron Collider (LHC) is presently undergoing a major upgrade in order to fully exploit the scientific potential of the upcoming high luminosity run, scheduled to start in the year 2021. The high interaction rate and the large event size will result in an experimental data flow of about 1 TB/s from the detectors, which needs to be processed before being sent to the online computing system and data storage. This processing is done in a dedicated Common Readout Unit (CRU), proposed for data aggregation, trigger and timing distribution and control moderation. It acts as a common interface between sub-detector electronic systems, the computing system and trigger processors. The interface links include GBT, TTC-PON and PCIe. GBT (Gigabit transceiver) is used for detector data payload transmission and as a fixed latency path for trigger distribution between the CRU and detector readout electronics. TTC-PON (Timing, Trigger and Control via Passive Optical Network) is employed for time multiplexed trigger distribution between the CRU and the Central Trigger Processor (CTP). PCIe (Peripheral Component Interconnect Express) is the high-speed serial computer expansion bus standard used for bulk data transport between CRU boards and processors. In this article, we give an overview of the CRU architecture in ALICE and discuss the different interfaces, along with the firmware design and implementation of the CRU on the LHCb PCIe40 board.
A Full Mesh ATCA-based General Purpose Data Processing Board (Pulsar II)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ajuha, S.
The Pulsar II is a custom ATCA full mesh enabled FPGA-based processor board which has been designed with the goal of creating a scalable architecture abundant in flexible, non-blocking, high bandwidth interconnections. The design has been motivated by silicon-based tracking trigger needs for LHC experiments. In this technical memo we describe the Pulsar II hardware and its performance, such as the performance test results with full mesh backplanes from different vendors, how the backplane is used for the development of low-latency time-multiplexed data transfer schemes and how the inter-shelf and intra-shelf synchronization works.
Design and performance of a high resolution, low latency stripline beam position monitor system
NASA Astrophysics Data System (ADS)
Apsimon, R. J.; Bett, D. R.; Blaskovic Kraljevic, N.; Burrows, P. N.; Christian, G. B.; Clarke, C. I.; Constance, B. D.; Dabiri Khah, H.; Davis, M. R.; Perry, C.; Resta López, J.; Swinson, C. J.
2015-03-01
A high-resolution, low-latency beam position monitor (BPM) system has been developed for use in particle accelerators and beam lines that operate with trains of particle bunches with bunch separations as low as several tens of nanoseconds, such as future linear electron-positron colliders and free-electron lasers. The system was tested with electron beams in the extraction line of the Accelerator Test Facility at the High Energy Accelerator Research Organization (KEK) in Japan. It consists of three stripline BPMs instrumented with analogue signal-processing electronics and a custom digitizer for logging the data. The design of the analogue processor units is presented in detail, along with measurements of the system performance. The processor latency is 15.6 ± 0.1 ns. A single-pass beam position resolution of 291 ± 10 nm has been achieved, using a beam with a bunch charge of approximately 1 nC.
Hardware for dynamic quantum computing experiments: Part I
NASA Astrophysics Data System (ADS)
Johnson, Blake; Ryan, Colm; Riste, Diego; Donovan, Brian; Ohki, Thomas
Static, pre-defined control sequences routinely achieve high-fidelity operation on superconducting quantum processors. Efforts toward dynamic experiments depending on real-time information have mostly proceeded through hardware duplication and triggers, requiring a combinatorial explosion in the number of channels. We provide a hardware efficient solution to dynamic control with a complete platform of specialized FPGA-based control and readout electronics; these components enable arbitrary control flow, low-latency feedback and/or feedforward, and scale far beyond single-qubit control and measurement. We will introduce the BBN Arbitrary Pulse Sequencer 2 (APS2) control system and the X6 QDSP readout platform. The BBN APS2 features: a sequencer built around implementing short quantum gates, a sequence cache to allow long sequences with branching structures, subroutines for code re-use, and a trigger distribution module to capture and distribute steering information. The X6 QDSP features a single-stage DSP pipeline that combines demodulation with arbitrary integration kernels, and multiple taps to inspect data flow for debugging and calibration. We will show system performance when putting it all together, including a latency budget for feedforward operations. This research was funded by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), through the Army Research Office Contract No. W911NF-10-1-0324.
Low latency memory access and synchronization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.
A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
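The single-load locking idea can be modelled in software to make the semantics concrete. The sketch below is a toy simulation, not the disclosed hardware: the class and method names are hypothetical, and the release mechanism is an assumption, since the abstract only describes acquisition.

```python
class LockDevice:
    """Toy model of a hardware locking device (hypothetical names)."""

    def __init__(self, n_locks: int) -> None:
        self._taken = [False] * n_locks

    def load(self, lock_id: int) -> bool:
        """Model of the processor's single read of a lock.

        Returns True if the lock was free, meaning the caller now owns it.
        The write that marks the lock as taken is done here by the device,
        not by the processor.
        """
        was_taken = self._taken[lock_id]
        self._taken[lock_id] = True
        return not was_taken

    def release(self, lock_id: int) -> None:
        """Assumed release operation; not described in the abstract."""
        self._taken[lock_id] = False


if __name__ == "__main__":
    dev = LockDevice(2)
    print(dev.load(0))  # True: first reader acquires the lock
    print(dev.load(0))  # False: lock is already owned
    dev.release(0)
    print(dev.load(0))  # True: lock is free again
```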
The Level 0 Pixel Trigger system for the ALICE experiment
NASA Astrophysics Data System (ADS)
Aglieri Rinella, G.; Kluge, A.; Krivda, M.; ALICE Silicon Pixel Detector project
2007-01-01
The ALICE Silicon Pixel Detector contains 1200 readout chips. Fast-OR signals indicate the presence of at least one hit in the 8192 pixel matrix of each chip. The 1200 bits are transmitted every 100 ns on 120 data readout optical links using the G-Link protocol. The Pixel Trigger System extracts and processes them to deliver an input signal to the Level 0 trigger processor targeting a latency of 800 ns. The system is compact, modular and based on FPGA devices. The architecture allows the user to define and implement various trigger algorithms. The system uses advanced 12-channel parallel optical fiber modules operating at 1310 nm as optical receivers and 12 deserializer chips closely packed in small area receiver boards. Alternative solutions with multi-channel G-Link deserializers implemented directly in programmable hardware devices were investigated. The design of the system and the progress of the ALICE Pixel Trigger project are described in this paper.
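The figures in this abstract imply a fixed payload per link and per sampling period, which the sketch below works out. The multiplicity condition at the end is a hypothetical example of a user-defined algorithm; the abstract does not specify which algorithms are implemented.

```python
N_CHIPS = 1200     # one Fast-OR bit per readout chip
N_LINKS = 120      # optical links carrying the Fast-OR bits
PERIOD_NS = 100    # the bits are transmitted every 100 ns

BITS_PER_LINK = N_CHIPS // N_LINKS         # Fast-OR bits per link per period
PAYLOAD_GBIT_PER_S = N_CHIPS / PERIOD_NS   # bits per ns equals Gbit/s (payload only)


def multiplicity_trigger(fast_or_bits, threshold: int) -> bool:
    """Hypothetical algorithm: fire when at least `threshold` chips report a hit."""
    return sum(fast_or_bits) >= threshold


if __name__ == "__main__":
    print(BITS_PER_LINK)        # 10
    print(PAYLOAD_GBIT_PER_S)   # 12.0
    hits = [1, 1] + [0] * (N_CHIPS - 2)
    print(multiplicity_trigger(hits, threshold=2))  # True
```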
Deri, Robert J.; DeGroot, Anthony J.; Haigh, Ronald E.
2002-01-01
As the performance of individual elements within parallel processing systems increases, increased communication capability between distributed processor and memory elements is required. There is great interest in using fiber optics to improve interconnect communication beyond that attainable using electronic technology. Several groups have considered WDM, star-coupled optical interconnects. The invention uses a fiber optic transceiver to provide low latency, high bandwidth channels for such interconnects using a robust multimode fiber technology. Instruction-level simulation is used to quantify the bandwidth, latency, and concurrency required for such interconnects to scale to 256 nodes, each operating at 1 GFLOPS performance. Performance has been shown to scale to ≈100 GFLOPS for scientific application kernels using a small number of wavelengths (8 to 32), only one wavelength received per node, and achievable optoelectronic bandwidth and latency.
Local wavelet transform: a cost-efficient custom processor for space image compression
NASA Astrophysics Data System (ADS)
Masschelein, Bart; Bormans, Jan G.; Lafruit, Gauthier
2002-11-01
Thanks to its intrinsic scalability features, the wavelet transform has become increasingly popular as a decorrelator in image compression applications. Throughput, memory requirements and complexity are important parameters when developing hardware image compression modules. An implementation of the classical, global wavelet transform requires large memory sizes and implies a large latency between the availability of the input image and the production of minimal data entities for entropy coding. Image tiling methods, as proposed by JPEG2000, reduce the memory sizes and the latency, but inevitably introduce image artefacts. The Local Wavelet Transform (LWT), presented in this paper, is a low-complexity wavelet transform architecture using block-based processing that results in the same transformed images as those obtained by the global wavelet transform. The architecture minimizes the processing latency with a limited amount of memory. Moreover, as the LWT is an instruction-based custom processor, it can be programmed for specific tasks, such as push-broom processing of infinite-length satellite images. The features of the LWT make it appropriate for use in space image compression, where high throughput, low memory sizes, low complexity, low power and push-broom processing are important requirements.
The ATLAS Level-1 Topological Trigger performance in Run 2
NASA Astrophysics Data System (ADS)
Riu, Imma; ATLAS Collaboration
2017-10-01
The Level-1 trigger is the first event rate reducing step in the ATLAS detector trigger system, with an output rate of up to 100 kHz and decision latency smaller than 2.5 μs. During the LHC shutdown after Run 1, the Level-1 trigger system was upgraded at hardware, firmware and software levels. In particular, a new electronics sub-system was introduced in the real-time data processing path: the Level-1 Topological trigger system. It consists of a single electronics shelf equipped with two Level-1 Topological processor blades. They receive real-time information from the Level-1 calorimeter and muon triggers, which is processed to measure angles between trigger objects, invariant masses or other kinematic variables. Complementary to other requirements, these measurements are taken into account in the final Level-1 trigger decision. The system was installed and commissioning started in 2015 and continued during 2016. As part of the commissioning, the decisions from individual algorithms were simulated and compared with the hardware response. An overview of the Level-1 Topological trigger system design, the commissioning process and the impact on several event selections is presented.
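The kind of kinematic quantities mentioned above can be sketched in floating point. The example below uses the standard massless-object approximation m² = 2 pT1 pT2 (cosh Δη − cos Δφ); it is an illustration only and does not reproduce the firmware implementation, whose arithmetic and object format are not described in the abstract. The function names are hypothetical.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal separation wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def invariant_mass(pt1, eta1, phi1, pt2, eta2, phi2) -> float:
    """Invariant mass of two objects treated as massless.

    Uses m^2 = 2 * pt1 * pt2 * (cosh(d_eta) - cos(d_phi)); the result has
    the same units as the transverse momenta.
    """
    d_eta = eta1 - eta2
    d_phi = delta_phi(phi1, phi2)
    m2 = 2.0 * pt1 * pt2 * (math.cosh(d_eta) - math.cos(d_phi))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


if __name__ == "__main__":
    # Two back-to-back objects at eta = 0 with pT = 50 each: mass is about 100.
    print(invariant_mass(50.0, 0.0, 0.0, 50.0, 0.0, math.pi))
```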
Level Zero Trigger Processor for the NA62 experiment
NASA Astrophysics Data System (ADS)
Soldi, D.; Chiozzi, S.
2018-05-01
The NA62 experiment is designed to measure the branching ratio of the ultra-rare decay K+ → π+ ν ν̄ with a precision of ~10% at the CERN Super Proton Synchrotron (SPS). The trigger system of NA62 consists of three different levels designed to select events of physics interest in a high beam rate environment. The L0 Trigger Processor (L0TP) is the lowest level system of the trigger chain. It is implemented in hardware using programmable logic. The architecture of the NA62 L0TP system is a new approach compared to existing systems used in high-energy physics experiments. It is fully digital, based on standard gigabit Ethernet communication between detectors and the L0TP Board. The L0TP Board is a commercial development board, mounting a programmable logic device (FPGA). The primitives generated by sub-detectors are sent asynchronously using the UDP protocol to the L0TP during the entire beam spill period. The L0TP realigns in time the primitives coming from seven different sources and performs a data selection based on the characteristics of the event such as energy, multiplicity and topology of hits in the sub-detectors. It guarantees a maximum latency of 1 ms. The maximum input rate is about 10 MHz for each sub-detector, while the design maximum output trigger rate is 1 MHz. A description of the trigger algorithm is presented here.
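Time realignment of asynchronously arriving primitives can be pictured as grouping them by timestamp before applying a selection. The sketch below uses fixed-width time bins and a required-sources condition; both are assumptions made for illustration, and the source names, bin width and timestamps are hypothetical rather than taken from the L0TP design.

```python
from collections import defaultdict


def align_primitives(primitives, bin_width: int):
    """Group (source, timestamp) pairs into fixed-width time bins.

    Returns a dict mapping bin index to the set of sources seen in that bin.
    """
    bins = defaultdict(set)
    for source, timestamp in primitives:
        bins[timestamp // bin_width].add(source)
    return dict(bins)


def l0_decisions(primitives, bin_width: int, required: set):
    """Return the sorted bin indices in which every required source fired."""
    aligned = align_primitives(primitives, bin_width)
    return sorted(b for b, sources in aligned.items() if required <= sources)


if __name__ == "__main__":
    # Hypothetical primitives arriving out of order from three sources.
    prims = [("det_a", 101), ("det_c", 104), ("det_a", 250), ("det_b", 103)]
    print(l0_decisions(prims, bin_width=25, required={"det_a", "det_b"}))  # [4]
```

Fixed bins miss coincidences that straddle a bin boundary; a sliding window would avoid that at the cost of more logic.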
First low-latency LIGO+Virgo search for binary inspirals and their electromagnetic counterparts
NASA Astrophysics Data System (ADS)
Abadie, J.; Abbott, B. P.; Abbott, R.; Abbott, T. D.; Abernathy, M.; Accadia, T.; Acernese, F.; Adams, C.; Adhikari, R.; Affeldt, C.; Agathos, M.; Agatsuma, K.; Ajith, P.; Allen, B.; Amador Ceron, E.; Amariutei, D.; Anderson, S. B.; Anderson, W. G.; Arai, K.; Arain, M. A.; Araya, M. C.; Aston, S. M.; Astone, P.; Atkinson, D.; Aufmuth, P.; Aulbert, C.; Aylott, B. E.; Babak, S.; Baker, P.; Ballardin, G.; Ballmer, S.; Barayoga, J. C. B.; Barker, D.; Barone, F.; Barr, B.; Barsotti, L.; Barsuglia, M.; Barton, M. A.; Bartos, I.; Bassiri, R.; Bastarrika, M.; Basti, A.; Batch, J.; Bauchrowitz, J.; Bauer, Th. S.; Bebronne, M.; Beck, D.; Behnke, B.; Bejger, M.; Beker, M. G.; Bell, A. S.; Belletoile, A.; Belopolski, I.; Benacquista, M.; Berliner, J. M.; Bertolini, A.; Betzwieser, J.; Beveridge, N.; Beyersdorf, P. T.; Bilenko, I. A.; Billingsley, G.; Birch, J.; Biswas, R.; Bitossi, M.; Bizouard, M. A.; Black, E.; Blackburn, J. K.; Blackburn, L.; Blair, D.; Bland, B.; Blom, M.; Bock, O.; Bodiya, T. P.; Bogan, C.; Bondarescu, R.; Bondu, F.; Bonelli, L.; Bonnand, R.; Bork, R.; Born, M.; Boschi, V.; Bose, S.; Bosi, L.; Bouhou, B.; Braccini, S.; Bradaschia, C.; Brady, P. R.; Braginsky, V. B.; Branchesi, M.; Brau, J. E.; Breyer, J.; Briant, T.; Bridges, D. O.; Brillet, A.; Brinkmann, M.; Brisson, V.; Britzger, M.; Brooks, A. F.; Brown, D. A.; Bulik, T.; Bulten, H. J.; Buonanno, A.; Burguet-Castell, J.; Buskulic, D.; Buy, C.; Byer, R. L.; Cadonati, L.; Cagnoli, G.; Calloni, E.; Camp, J. B.; Campsie, P.; Cannizzo, J.; Cannon, K.; Canuel, B.; Cao, J.; Capano, C. D.; Carbognani, F.; Carbone, L.; Caride, S.; Caudill, S.; Cavaglià, M.; Cavalier, F.; Cavalieri, R.; Cella, G.; Cepeda, C.; Cesarini, E.; Chaibi, O.; Chalermsongsak, T.; Charlton, P.; Chassande-Mottin, E.; Chelkowski, S.; Chen, W.; Chen, X.; Chen, Y.; Chincarini, A.; Chiummo, A.; Cho, H. S.; Chow, J.; Christensen, N.; Chua, S. S. Y.; Chung, C. T. Y.; Chung, S.; Ciani, G.; Clara, F.; Clark, D. E.; Clark, J.; Clayton, J. 
H.; Cleva, F.; Coccia, E.; Cohadon, P.-F.; Colacino, C. N.; Colas, J.; Colla, A.; Colombini, M.; Conte, A.; Conte, R.; Cook, D.; Corbitt, T. R.; Cordier, M.; Cornish, N.; Corsi, A.; Costa, C. A.; Coughlin, M.; Coulon, J.-P.; Couvares, P.; Coward, D. M.; Cowart, M.; Coyne, D. C.; Creighton, J. D. E.; Creighton, T. D.; Cruise, A. M.; Cumming, A.; Cunningham, L.; Cuoco, E.; Cutler, R. M.; Dahl, K.; Danilishin, S. L.; Dannenberg, R.; D'Antonio, S.; Danzmann, K.; Dattilo, V.; Daudert, B.; Daveloza, H.; Davier, M.; Daw, E. J.; Day, R.; Dayanga, T.; De Rosa, R.; DeBra, D.; Debreczeni, G.; Del Pozzo, W.; del Prete, M.; Dent, T.; Dergachev, V.; DeRosa, R.; DeSalvo, R.; Dhurandhar, S.; Di Fiore, L.; Di Lieto, A.; Di Palma, I.; Emilio, M. Di Paolo; Di Virgilio, A.; Díaz, M.; Dietz, A.; Donovan, F.; Dooley, K. L.; Drago, M.; Drever, R. W. P.; Driggers, J. C.; Du, Z.; Dumas, J.-C.; Dwyer, S.; Eberle, T.; Edgar, M.; Edwards, M.; Effler, A.; Ehrens, P.; Endrőczi, G.; Engel, R.; Etzel, T.; Evans, K.; Evans, M.; Evans, T.; Factourovich, M.; Fafone, V.; Fairhurst, S.; Fan, Y.; Farr, B. F.; Fazi, D.; Fehrmann, H.; Feldbaum, D.; Feroz, F.; Ferrante, I.; Fidecaro, F.; Finn, L. S.; Fiori, I.; Fisher, R. P.; Flaminio, R.; Flanigan, M.; Foley, S.; Forsi, E.; Forte, L. A.; Fotopoulos, N.; Fournier, J.-D.; Franc, J.; Frasca, S.; Frasconi, F.; Frede, M.; Frei, M.; Frei, Z.; Freise, A.; Frey, R.; Fricke, T. T.; Friedrich, D.; Fritschel, P.; Frolov, V. V.; Fujimoto, M.-K.; Fulda, P. J.; Fyffe, M.; Gair, J.; Galimberti, M.; Gammaitoni, L.; Garcia, J.; Garufi, F.; Gáspár, M. E.; Gemme, G.; Geng, R.; Genin, E.; Gennai, A.; Gergely, L. Á.; Ghosh, S.; Giaime, J. A.; Giampanis, S.; Giardina, K. D.; Giazotto, A.; Gil-Casanova, S.; Gill, C.; Gleason, J.; Goetz, E.; Goggin, L. M.; González, G.; Gorodetsky, M. L.; Goßler, S.; Gouaty, R.; Graef, C.; Graff, P. B.; Granata, M.; Grant, A.; Gras, S.; Gray, C.; Gray, N.; Greenhalgh, R. J. S.; Gretarsson, A. 
M.; Greverie, C.; Grosso, R.; Grote, H.; Grunewald, S.; Guidi, G. M.; Guido, C.; Gupta, R.; Gustafson, E. K.; Gustafson, R.; Ha, T.; Hallam, J. M.; Hammer, D.; Hammond, G.; Hanks, J.; Hanna, C.; Hanson, J.; Harms, J.; Harry, G. M.; Harry, I. W.; Harstad, E. D.; Hartman, M. T.; Haughian, K.; Hayama, K.; Hayau, J.-F.; Heefner, J.; Heidmann, A.; Heintze, M. C.; Heitmann, H.; Hello, P.; Hendry, M. A.; Heng, I. S.; Heptonstall, A. W.; Herrera, V.; Hewitson, M.; Hild, S.; Hoak, D.; Hodge, K. A.; Holt, K.; Holtrop, M.; Hong, T.; Hooper, S.; Hosken, D. J.; Hough, J.; Howell, E. J.; Hughey, B.; Husa, S.; Huttner, S. H.; Huynh-Dinh, T.; Ingram, D. R.; Inta, R.; Isogai, T.; Ivanov, A.; Izumi, K.; Jacobson, M.; James, E.; Jang, Y. J.; Jaranowski, P.; Jesse, E.; Johnson, W. W.; Jones, D. I.; Jones, G.; Jones, R.; Ju, L.; Kalmus, P.; Kalogera, V.; Kandhasamy, S.; Kang, G.; Kanner, J. B.; Kasturi, R.; Katsavounidis, E.; Katzman, W.; Kaufer, H.; Kawabe, K.; Kawamura, S.; Kawazoe, F.; Kelley, D.; Kells, W.; Keppel, D. G.; Keresztes, Z.; Khalaidovski, A.; Khalili, F. Y.; Khazanov, E. A.; Kim, B. K.; Kim, C.; Kim, H.; Kim, K.; Kim, N.; Kim, Y. M.; King, P. J.; Kinzel, D. L.; Kissel, J. S.; Klimenko, S.; Kokeyama, K.; Kondrashov, V.; Koranda, S.; Korth, W. Z.; Kowalska, I.; Kozak, D.; Kranz, O.; Kringel, V.; Krishnamurthy, S.; Krishnan, B.; Królak, A.; Kuehn, G.; Kumar, R.; Kwee, P.; Lam, P. K.; Landry, M.; Lantz, B.; Lastzka, N.; Lawrie, C.; Lazzarini, A.; Leaci, P.; Lee, C. H.; Lee, H. K.; Lee, H. M.; Leong, J. R.; Leonor, I.; Leroy, N.; Letendre, N.; Li, J.; Li, T. G. F.; Liguori, N.; Lindquist, P. E.; Liu, Y.; Liu, Z.; Lockerbie, N. A.; Lodhia, D.; Lorenzini, M.; Loriette, V.; Lormand, M.; Losurdo, G.; Lough, J.; Luan, J.; Lubinski, M.; Lück, H.; Lundgren, A. P.; Macdonald, E.; Machenschalk, B.; MacInnis, M.; Macleod, D. 
M.; Mageswaran, M.; Mailand, K.; Majorana, E.; Maksimovic, I.; Man, N.; Mandel, I.; Mandic, V.; Mantovani, M.; Marandi, A.; Marchesoni, F.; Marion, F.; Márka, S.; Márka, Z.; Markosyan, A.; Maros, E.; Marque, J.; Martelli, F.; Martin, I. W.; Martin, R. M.; Marx, J. N.; Mason, K.; Masserot, A.; Matichard, F.; Matone, L.; Matzner, R. A.; Mavalvala, N.; Mazzolo, G.; McCarthy, R.; McClelland, D. E.; McGuire, S. C.; McIntyre, G.; McIver, J.; McKechan, D. J. A.; McWilliams, S.; Meadors, G. D.; Mehmet, M.; Meier, T.; Melatos, A.; Melissinos, A. C.; Mendell, G.; Mercer, R. A.; Meshkov, S.; Messenger, C.; Meyer, M. S.; Miao, H.; Michel, C.; Milano, L.; Miller, J.; Minenkov, Y.; Mitrofanov, V. P.; Mitselmakher, G.; Mittleman, R.; Miyakawa, O.; Moe, B.; Mohan, M.; Mohanty, S. D.; Mohapatra, S. R. P.; Moraru, D.; Moreno, G.; Morgado, N.; Morgia, A.; Mori, T.; Morriss, S. R.; Mosca, S.; Mossavi, K.; Mours, B.; Mow-Lowry, C. M.; Mueller, C. L.; Mueller, G.; Mukherjee, S.; Mullavey, A.; Müller-Ebhardt, H.; Munch, J.; Murphy, D.; Murray, P. G.; Mytidis, A.; Nash, T.; Naticchioni, L.; Necula, V.; Nelson, J.; Neri, I.; Newton, G.; Nguyen, T.; Nishizawa, A.; Nitz, A.; Nocera, F.; Nolting, D.; Normandin, M. E.; Nuttall, L.; Ochsner, E.; O'Dell, J.; Oelker, E.; Ogin, G. H.; Oh, J. J.; Oh, S. H.; O'Reilly, B.; O'Shaughnessy, R.; Osthelder, C.; Ott, C. D.; Ottaway, D. J.; Ottens, R. S.; Overmier, H.; Owen, B. J.; Page, A.; Pagliaroli, G.; Palladino, L.; Palomba, C.; Pan, Y.; Pankow, C.; Paoletti, F.; Papa, M. A.; Parisi, M.; Pasqualetti, A.; Passaquieti, R.; Passuello, D.; Patel, P.; Pedraza, M.; Peiris, P.; Pekowsky, L.; Penn, S.; Perreca, A.; Persichetti, G.; Phelps, M.; Pichot, M.; Pickenpack, M.; Piergiovanni, F.; Pietka, M.; Pinard, L.; Pinto, I. M.; Pitkin, M.; Pletsch, H. J.; Plissi, M. V.; Poggiani, R.; Pöld, J.; Postiglione, F.; Prato, M.; Predoi, V.; Prestegard, T.; Price, L. R.; Prijatelj, M.; Principe, M.; Privitera, S.; Prix, R.; Prodi, G. A.; Prokhorov, L. 
G.; Puncken, O.; Punturo, M.; Puppo, P.; Quetschke, V.; Quitzow-James, R.; Raab, F. J.; Rabeling, D. S.; Rácz, I.; Radkins, H.; Raffai, P.; Rakhmanov, M.; Rankins, B.; Rapagnani, P.; Raymond, V.; Re, V.; Redwine, K.; Reed, C. M.; Reed, T.; Regimbau, T.; Reid, S.; Reitze, D. H.; Ricci, F.; Riesen, R.; Riles, K.; Robertson, N. A.; Robinet, F.; Robinson, C.; Robinson, E. L.; Rocchi, A.; Roddy, S.; Rodriguez, C.; Rodruck, M.; Rolland, L.; Rollins, J. G.; Romano, J. D.; Romano, R.; Romie, J. H.; Rosińska, D.; Röver, C.; Rowan, S.; Rüdiger, A.; Ruggi, P.; Ryan, K.; Sainathan, P.; Salemi, F.; Sammut, L.; Sandberg, V.; Sannibale, V.; Santamaría, L.; Santiago-Prieto, I.; Santostasi, G.; Sassolas, B.; Sathyaprakash, B. S.; Sato, S.; Saulson, P. R.; Savage, R. L.; Schilling, R.; Schnabel, R.; Schofield, R. M. S.; Schreiber, E.; Schulz, B.; Schutz, B. F.; Schwinberg, P.; Scott, J.; Scott, S. M.; Seifert, F.; Sellers, D.; Sentenac, D.; Sergeev, A.; Shaddock, D. A.; Shaltev, M.; Shapiro, B.; Shawhan, P.; Shoemaker, D. H.; Sibley, A.; Siemens, X.; Sigg, D.; Singer, A.; Singer, L.; Sintes, A. M.; Skelton, G. R.; Slagmolen, B. J. J.; Slutsky, J.; Smith, J. R.; Smith, M. R.; Smith, R. J. E.; Smith-Lefebvre, N. D.; Somiya, K.; Sorazu, B.; Soto, J.; Speirits, F. C.; Sperandio, L.; Stefszky, M.; Stein, A. J.; Stein, L. C.; Steinert, E.; Steinlechner, J.; Steinlechner, S.; Steplewski, S.; Stochino, A.; Stone, R.; Strain, K. A.; Strigin, S. E.; Stroeer, A. S.; Sturani, R.; Stuver, A. L.; Summerscales, T. Z.; Sung, M.; Susmithan, S.; Sutton, P. J.; Swinkels, B.; Tacca, M.; Taffarello, L.; Talukder, D.; Tanner, D. B.; Tarabrin, S. P.; Taylor, J. R.; Taylor, R.; Thomas, P.; Thorne, K. A.; Thorne, K. S.; Thrane, E.; Thüring, A.; Tokmakov, K. V.; Tomlinson, C.; Toncelli, A.; Tonelli, M.; Torre, O.; Torres, C.; Torrie, C. I.; Tournefier, E.; Travasso, F.; Traylor, G.; Tseng, K.; Ugolini, D.; Vahlbruch, H.; Vajente, G.; van den Brand, J. F. 
J.; Van Den Broeck, C.; van der Putten, S.; van Veggel, A. A.; Vass, S.; Vasuth, M.; Vaulin, R.; Vavoulidis, M.; Vecchio, A.; Vedovato, G.; Veitch, J.; Veitch, P. J.; Veltkamp, C.; Verkindt, D.; Vetrano, F.; Viceré, A.; Villar, A. E.; Vinet, J.-Y.; Vitale, S.; Vocca, H.; Vorvick, C.; Vyatchanin, S. P.; Wade, A.; Wade, L.; Wade, M.; Waldman, S. J.; Wallace, L.; Wan, Y.; Wang, M.; Wang, X.; Wang, Z.; Wanner, A.; Ward, R. L.; Was, M.; Weinert, M.; Weinstein, A. J.; Weiss, R.; Wen, L.; Wessels, P.; West, M.; Westphal, T.; Wette, K.; Whelan, J. T.; Whitcomb, S. E.; White, D. J.; Whiting, B. F.; Wilkinson, C.; Willems, P. A.; Williams, L.; Williams, R.; Willke, B.; Winkelmann, L.; Winkler, W.; Wipf, C. C.; Wiseman, A. G.; Wittel, H.; Woan, G.; Wooley, R.; Worden, J.; Yakushin, I.; Yamamoto, H.; Yamamoto, K.; Yancey, C. C.; Yang, H.; Yeaton-Massey, D.; Yoshida, S.; Yu, P.; Yvert, M.; Zadrożny, A.; Zanolin, M.; Zendri, J.-P.; Zhang, F.; Zhang, L.; Zhang, W.; Zhao, C.; Zotov, N.; Zucker, M. E.; Zweizig, J.
2012-05-01
Aims: The detection and measurement of gravitational-waves from coalescing neutron-star binary systems is an important science goal for ground-based gravitational-wave detectors. In addition to emitting gravitational-waves at frequencies that span the most sensitive bands of the LIGO and Virgo detectors, these sources are also amongst the most likely to produce an electromagnetic counterpart to the gravitational-wave emission. A joint detection of the gravitational-wave and electromagnetic signals would provide a powerful new probe for astronomy. Methods: During the period between September 19 and October 20, 2010, the first low-latency search for gravitational-waves from binary inspirals in LIGO and Virgo data was conducted. The resulting triggers were sent to electromagnetic observatories for followup. We describe the generation and processing of the low-latency gravitational-wave triggers. The results of the electromagnetic image analysis will be described elsewhere. Results: Over the course of the science run, three gravitational-wave triggers passed all of the low-latency selection cuts. Of these, one was followed up by several of our observational partners. Analysis of the gravitational-wave data leads to an estimated false alarm rate of once every 6.4 days, falling far short of the requirement for a detection based solely on gravitational-wave data.
Design of the SLAC RCE Platform: A General Purpose ATCA Based Data Acquisition System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Herbst, R.; Claus, R.; Freytag, M.
2015-01-23
The SLAC RCE platform is a general purpose clustered data acquisition system implemented on a custom ATCA compliant blade, called the Cluster On Board (COB). The core of the system is the Reconfigurable Cluster Element (RCE), which is a system-on-chip design based upon the Xilinx Zynq family of FPGAs, mounted on custom COB daughter-boards. The Zynq architecture couples a dual core ARM Cortex A9 based processor with a high performance 28 nm FPGA. The RCE has 12 external general purpose bi-directional high speed links, each supporting serial rates of up to 12 Gbps. Eight RCE nodes are included on a COB, each with a 10 Gbps connection to an on-board 24-port Ethernet switch integrated circuit. The COB is designed to be used with a standard full-mesh ATCA backplane allowing multiple RCE nodes to be tightly interconnected with minimal interconnect latency. Multiple shelves can be clustered using the front panel 10 Gbps connections. The COB also supports local and inter-blade timing and trigger distribution. An experiment specific Rear Transition Module adapts the 96 high speed serial links to specific experiments and allows an experiment-specific timing and busy feedback connection. This coupling of processors with a high performance FPGA fabric in a low latency, multiple node cluster allows high speed data processing that can be easily adapted to any physics experiment. RTEMS and Linux are both ported to the module. The RCE has been used or is the baseline for several current and proposed experiments (LCLS, HPS, LSST, ATLAS-CSC, LBNE, DarkSide, ILC-SiD, etc).
Level Zero Trigger Processor for the ultra rare kaon decay experiment: NA62
NASA Astrophysics Data System (ADS)
Soldi, Dario; Chiozzi, S.; Gamberini, E.; Gianoli, A.; Mila, G.; Neri, I.; Petrucci, F.
2017-02-01
The NA62 experiment is designed to measure the branching ratio of the (ultra-)rare decay K+ → π+ ν ν̄ with a precision of ∼10% at the CERN Super Proton Synchrotron (SPS). The L0 Trigger Processor (L0TP) is the lowest level system of the trigger chain. It is implemented in hardware using programmable logic. The architecture of the L0TP is completely new for a high energy physics experiment. It is fully digital, based on standard gigabit Ethernet communication between the detectors and the L0TP board. The L0TP board is a commercial development board, the Terasic DE4, mounting an Altera Stratix IV FPGA. The primitives generated by the sub-detectors are sent asynchronously using the UDP protocol to the L0TP during the entire beam spill period (about 5 seconds). The L0TP realigns in time the primitives coming from 7 different sources and manages the time information plus all the characteristics of the event, such as energy, multiplicity and position of hits, in order to select good events by comparison with preset masks. It should guarantee a maximum latency of 1 ms. The maximum input rate is 10 MHz for each sub-detector, while the design maximum output trigger rate is 1 MHz. A complete trigger-less parasitic acquisition of the primitives is possible using mirroring switches to monitor the L0 behavior. A first version of the L0TP was commissioned during the 2014 NA62 pilot run and it is used in the current data taking. A description of the trigger algorithm is presented here.
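The time realignment and mask comparison described above can be illustrated with a minimal sketch. It is not the L0TP firmware: the function name `select_events`, the fixed-width time slots, and the representation of a mask as a set of required source IDs are all assumptions made for illustration. Simple binning like this also misses coincidences that straddle a slot boundary.

```python
from collections import defaultdict

def select_events(primitives, masks, window=1):
    """Group primitives into time slots and keep slots matching any mask.

    primitives: iterable of (source_id, timestamp) pairs, in any arrival order.
    masks: list of sets; a slot passes if it contains every source in some mask.
    window: slot width in timestamp units (hypothetical parameter).
    """
    slots = defaultdict(set)
    for source, timestamp in primitives:
        # Realign: primitives with nearby timestamps land in the same slot.
        slots[timestamp // window].add(source)
    triggers = []
    for slot in sorted(slots):
        # Mask comparison: subset test against each preset mask.
        if any(mask <= slots[slot] for mask in masks):
            triggers.append(slot)
    return triggers
```

For example, with a slot width of 5 and a single mask requiring sources 0 and 1, only the slot that received primitives from both sources is selected.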
Parallel processor for real-time structural control
NASA Astrophysics Data System (ADS)
Tise, Bert L.
1993-07-01
A parallel processor that is optimized for real-time linear control has been developed. This modular system consists of A/D modules, D/A modules, and floating-point processor modules. The scalable processor uses up to 1,000 Motorola DSP96002 floating-point processors for a peak computational rate of 60 GFLOPS. Sampling rates up to 625 kHz are supported by this analog-in to analog-out controller. The high processing rate and parallel architecture make this processor suitable for computing state-space equations and other multiply/accumulate-intensive digital filters. Processor features include 14-bit conversion devices, low input-to-output latency, 240 Mbyte/s synchronous backplane bus, low-skew clock distribution circuit, VME connection to host computer, parallelizing code generator, and look-up tables for actuator linearization. This processor was designed primarily for experiments in structural control. The A/D modules sample sensors mounted on the structure and the floating-point processor modules compute the outputs using the programmed control equations. The outputs are sent through the D/A module to the power amps used to drive the structure's actuators. The host computer is a Sun workstation. An OpenWindows-based control panel is provided to facilitate data transfer to and from the processor, as well as to control the operating mode of the processor. A diagnostic mode is provided to allow stimulation of the structure and acquisition of the structural response via sensor inputs.
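As a rough illustration of why state-space equations are multiply/accumulate-intensive, the sketch below computes one sample of a discrete-time linear controller, x[k+1] = A x[k] + B u[k] and y[k] = C x[k] + D u[k]. The function name and the list-of-lists matrix layout are assumptions for illustration; the abstract does not describe how the DSP96002 code is organized.

```python
def state_space_step(A, B, C, D, x, u):
    """Advance a discrete-time state-space controller by one sample.

    A, B, C, D are matrices given as lists of rows; x is the state vector
    and u the input (sensor) vector. Returns (next_state, output).
    """
    def matvec(M, v):
        # Each output element is one multiply/accumulate chain over a row.
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]

    x_next = [a + b for a, b in zip(matvec(A, x), matvec(B, u))]
    y = [c + d for c, d in zip(matvec(C, x), matvec(D, u))]
    return x_next, y
```

Every row of each matrix-vector product is independent of the others, which is the property a parallel architecture can exploit by assigning rows to different processors.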
New Dimensions in Microarchitecture Harnessing 3D Integration Technologies (BRIEFING CHARTS)
2007-03-06
Quad Core Bandwidth and Latency Boundaries General Purpose Processor Loads Latency limited Bandwidth limited Processor load trade-off between I...delay No= number of ckts at 1V do= ckt delay at 1V From “3D Integration” Special Topic Session, W. Haensch, ISSCC ‘07, 2/07 11 DARPA MTS March 6, 2007
Communication System and Method
NASA Technical Reports Server (NTRS)
Sanders, Adam M. (Inventor); Strawser, Philip A. (Inventor)
2014-01-01
A communication system for communicating over high-latency, low bandwidth networks includes a communications processor configured to receive a collection of data from a local system, and a transceiver in communication with the communications processor. The transceiver is configured to transmit and receive data over a network according to a plurality of communication parameters. The communications processor is configured to divide the collection of data into a plurality of data streams; assign a priority level to each of the respective data streams, where the priority level reflects the criticality of the respective data stream; and modify a communication parameter of at least one of the plurality of data streams according to the priority of the at least one data stream.
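The three steps named in the abstract (divide the data into streams, assign each a priority, and modify a communication parameter according to that priority) can be sketched as follows. Everything concrete here is hypothetical: the `Stream` class, the integer priority scale where larger means more critical, and the choice of a retry count as the modified parameter are illustrative assumptions, not details taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class Stream:
    name: str
    payload: bytes
    priority: int      # hypothetical scale: larger means more critical
    retries: int = 0   # the communication parameter this sketch modifies

def divide_and_prioritize(data, priorities, max_retries=5):
    """Split a collection of data into per-key streams, most critical first.

    data: dict mapping stream name to payload bytes.
    priorities: dict mapping stream name to priority; missing names get 0.
    """
    streams = []
    for name, payload in data.items():
        priority = priorities.get(name, 0)
        # More critical streams are allowed more retransmission attempts.
        streams.append(Stream(name, payload, priority, min(priority, max_retries)))
    return sorted(streams, key=lambda s: s.priority, reverse=True)
```

On a high-latency, low-bandwidth link, ordering by priority means the most critical stream is handed to the transceiver first, while the per-stream parameter lets less critical data be sent with less overhead.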
Lingala, Sajan Goud; Zhu, Yinghua; Lim, Yongwan; Toutios, Asterios; Ji, Yunhua; Lo, Wei-Ching; Seiberlich, Nicole; Narayanan, Shrikanth; Nayak, Krishna S
2017-12-01
To evaluate the feasibility of through-time spiral generalized autocalibrating partial parallel acquisition (GRAPPA) for low-latency accelerated real-time MRI of speech. Through-time spiral GRAPPA (spiral GRAPPA), a fast linear reconstruction method, is applied to spiral (k-t) data acquired from an eight-channel custom upper-airway coil. Fully sampled data were retrospectively down-sampled to evaluate spiral GRAPPA at undersampling factors R = 2 to 6. Pseudo-golden-angle spiral acquisitions were used for prospective studies. Three subjects were imaged while performing a range of speech tasks that involved rapid articulator movements, including fluent speech and beat-boxing. Spiral GRAPPA was compared with view sharing, and a parallel imaging and compressed sensing (PI-CS) method. Spiral GRAPPA captured spatiotemporal dynamics of vocal tract articulators at undersampling factors ≤4. Spiral GRAPPA at 18 ms/frame and 2.4 mm²/pixel outperformed view sharing in depicting rapidly moving articulators. Spiral GRAPPA and PI-CS provided equivalent temporal fidelity. Reconstruction latency per frame was 14 ms for view sharing and 116 ms for spiral GRAPPA, using a single processor. Spiral GRAPPA kept up with the MRI data rate of 18 ms/frame with eight processors. PI-CS required 17 minutes to reconstruct 5 seconds of dynamic data. Spiral GRAPPA enabled 4-fold accelerated real-time MRI of speech with a low reconstruction latency. This approach is applicable to a wide range of speech RT-MRI experiments that benefit from real-time feedback while visualizing rapid articulator movement. Magn Reson Med 78:2275-2282, 2017. © 2017 International Society for Magnetic Resonance in Medicine.
Parallel processor for real-time structural control
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tise, B.L.
1992-01-01
A parallel processor that is optimized for real-time linear control has been developed. This modular system consists of A/D modules, D/A modules, and floating-point processor modules. The scalable processor uses up to 1,000 Motorola DSP96002 floating-point processors for a peak computational rate of 60 GFLOPS. Sampling rates up to 625 kHz are supported by this analog-in to analog-out controller. The high processing rate and parallel architecture make this processor suitable for computing state-space equations and other multiply/accumulate-intensive digital filters. Processor features include 14-bit conversion devices, low input-output latency, 240 Mbyte/s synchronous backplane bus, low-skew clock distribution circuit, VME connection to host computer, parallelizing code generator, and look-up tables for actuator linearization. This processor was designed primarily for experiments in structural control. The A/D modules sample sensors mounted on the structure and the floating-point processor modules compute the outputs using the programmed control equations. The outputs are sent through the D/A module to the power amps used to drive the structure's actuators. The host computer is a Sun workstation. An OpenWindows-based control panel is provided to facilitate data transfer to and from the processor, as well as to control the operating mode of the processor. A diagnostic mode is provided to allow stimulation of the structure and acquisition of the structural response via sensor inputs.
Friedmann, Simon; Frémaux, Nicolas; Schemmel, Johannes; Gerstner, Wulfram; Meier, Karlheinz
2013-01-01
In this study, we propose and analyze in simulations a new, highly flexible method of implementing synaptic plasticity in a wafer-scale, accelerated neuromorphic hardware system. The study focuses on globally modulated STDP, as a special use-case of this method. Flexibility is achieved by embedding a general-purpose processor dedicated to plasticity into the wafer. To evaluate the suitability of the proposed system, we use a reward modulated STDP rule in a spike train learning task. A single layer of neurons is trained to fire at specific points in time with only the reward as feedback. This model is simulated to measure its performance, i.e., the increase in received reward after learning. Using this performance as baseline, we then simulate the model with various constraints imposed by the proposed implementation and compare the performance. The simulated constraints include discretized synaptic weights, a restricted interface between analog synapses and embedded processor, and mismatch of analog circuits. We find that probabilistic updates can increase the performance of low-resolution weights, a simple interface between analog synapses and processor is sufficient for learning, and performance is insensitive to mismatch. Further, we consider communication latency between wafer and the conventional control computer system that is simulating the environment. This latency increases the delay, with which the reward is sent to the embedded processor. Because of the time continuous operation of the analog synapses, delay can cause a deviation of the updates as compared to the not delayed situation. We find that for highly accelerated systems latency has to be kept to a minimum. This study demonstrates the suitability of the proposed implementation to emulate the selected reward modulated STDP learning rule. It is therefore an ideal candidate for implementation in an upgraded version of the wafer-scale system developed within the BrainScaleS project. PMID:24065877
Electronics for CMS Endcap Muon Level-1 Trigger System Phase-1 and HL LHC upgrades
NASA Astrophysics Data System (ADS)
Madorsky, A.
2017-07-01
To accommodate high-luminosity LHC operation at a 13 TeV collision energy, the CMS Endcap Muon Level-1 Trigger system had to be significantly modified. To provide robust track reconstruction, the trigger system must now import all available trigger primitives generated by the Cathode Strip Chambers and by certain other subsystems, such as Resistive Plate Chambers (RPC). In addition to massive input bandwidth, this also required a significant increase in logic and memory resources. To satisfy these requirements, a new Sector Processor unit has been designed. It consists of three modules. The Core Logic module houses the large FPGA that contains the track-finding logic and multi-gigabit serial links for data exchange. The Optical module contains optical receivers and transmitters; it communicates with the Core Logic module via a custom backplane section. The Pt Lookup table (PTLUT) module contains 1 GB of low-latency memory that is used to assign the final Pt to reconstructed muon tracks. The μTCA architecture (adopted by CMS) was used for this design. The talk presents the details of the hardware and firmware design of the production system based on the Xilinx Virtex-7 FPGA family. The next round of LHC and CMS upgrades starts in 2019, followed by a major High-Luminosity (HL) LHC upgrade starting in 2024. In the course of these upgrades, new Gas Electron Multiplier (GEM) detectors and more RPC chambers will be added to the Endcap Muon system. In order to keep up with all these changes, a new Advanced Processor unit is being designed. This device will be based on Xilinx UltraScale+ FPGAs. It will be able to accommodate up to 100 serial links with bit rates of up to 25 Gb/s, and provide up to 2.5 times more logic resources than the device used currently. The amount of PTLUT memory will be significantly increased to provide more flexibility for the Pt assignment algorithm. The talk presents preliminary details of the hardware design program.
Evaluation of GPUs as a level-1 track trigger for the High-Luminosity LHC
NASA Astrophysics Data System (ADS)
Mohr, H.; Dritschler, T.; Ardila, L. E.; Balzer, M.; Caselle, M.; Chilingaryan, S.; Kopmann, A.; Rota, L.; Schuh, T.; Vogelgesang, M.; Weber, M.
2017-04-01
In this work, we investigate the use of GPUs as a way of realizing a low-latency, high-throughput track trigger, using CMS as a showcase example. The CMS detector at the Large Hadron Collider (LHC) will undergo a major upgrade after the long shutdown from 2024 to 2026 when it will enter the high luminosity era. During this upgrade, the silicon tracker will have to be completely replaced. In the High Luminosity operation mode, luminosities of 5-7 × 10^34 cm^-2 s^-1 and pileups averaging at 140 events, with a maximum of up to 200 events, will be reached. These changes will require a major update of the triggering system. The demonstrated systems rely on dedicated hardware such as associative memory ASICs and FPGAs. We investigate the use of GPUs as an alternative way of realizing the requirements of the L1 track trigger. To this end we implemented a Hough transformation track finding step on GPUs and established a low-latency RDMA connection using the PCIe bus. To showcase the benefits of floating point operations, made possible by the use of GPUs, we present a modified algorithm. It uses hexagonal bins for the parameter space and leads to a more truthful representation of the possible track parameters of the individual hits in Hough space. This leads to fewer duplicate candidates and reduces fake track candidates compared to the regular approach. With data-transfer latencies of 2 μs and processing times for the Hough transformation as low as 3.6 μs, we can show that latencies are not as critical as expected. However, computing throughput proves to be challenging due to hardware limitations.
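For readers unfamiliar with the Hough transformation, the following minimal sketch shows the voting idea on a generic straight-line problem. It is an assumption-laden illustration, not the authors' GPU implementation: it uses rectangular bins in a (slope, intercept) space rather than the hexagonal bins and track parameters described above, and the function name `hough_lines` and its arguments are hypothetical.

```python
from collections import Counter

def hough_lines(points, slopes, intercept_step):
    """Vote in (slope, intercept-bin) space for lines y = m*x + b.

    points: iterable of (x, y) hits.
    slopes: candidate slope values m to scan for every hit.
    intercept_step: width of one intercept bin.
    """
    votes = Counter()
    for x, y in points:
        for m in slopes:
            # Each hit is consistent with exactly one intercept per slope.
            b = y - m * x
            votes[(m, round(b / intercept_step))] += 1
    return votes
```

Hits lying on a common line vote for the same bin, so track candidates appear as bins whose count exceeds a threshold. Each hit's votes are independent of every other hit's, which is what makes the step a natural fit for parallel hardware.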
Using MaxCompiler for the high level synthesis of trigger algorithms
NASA Astrophysics Data System (ADS)
Summers, S.; Rose, A.; Sanders, P.
2017-02-01
Firmware for FPGA trigger applications at the CMS experiment is conventionally written using hardware description languages such as Verilog and VHDL. MaxCompiler is an alternative, Java based, tool for developing FPGA applications which uses a higher level of abstraction from the hardware than a hardware description language. An implementation of the jet and energy sum algorithms for the CMS Level-1 calorimeter trigger has been written using MaxCompiler to benchmark against the VHDL implementation in terms of accuracy, latency, resource usage, and code size. A Kalman Filter track fitting algorithm has been developed using MaxCompiler for a proposed CMS Level-1 track trigger for the High-Luminosity LHC upgrade. The design achieves a low resource usage, and has a latency of 187.5 ns per iteration.
NASA Astrophysics Data System (ADS)
Meng, X. T.; Levin, D. S.; Chapman, J. W.; Li, D. C.; Yao, Z. E.; Zhou, B.
2017-02-01
The High Performance Time to Digital Converter (HPTDC), a multi-channel ASIC designed by the CERN Microelectronics group, has been proposed for the digitization of the thin-Resistive Plate Chambers (tRPC) in the ATLAS Muon Spectrometer Phase-1 upgrade project. These chambers, to be staged for higher luminosity LHC operation, will increase trigger acceptance and reduce or eliminate the fake muon trigger rates in the barrel-endcap transition region, corresponding to the pseudo-rapidity range 1<|η|<1.3. Low level trigger candidates must be flagged within a maximum latency of 1075 ns, thus imposing stringent signal processing time performance requirements on the readout system in general, and on the digitization electronics in particular. This paper investigates the HPTDC signal latency performance based on a specially designed evaluation board coupled with an external FPGA evaluation board, when operated in triggerless mode, and under hit rate conditions expected in Phase-1. This hardware based study confirms previous simulations and demonstrates that the HPTDC in triggerless operation satisfies the digitization timing requirements in both leading edge and pair modes.
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Fiorini, M.; Frezza, O.; Lonardo, A.; Lamanna, G.; Lo Cicero, F.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Piandani, R.; Pontisso, L.; Rossetti, D.; Simula, F.; Sozzi, M.; Tosoratto, L.; Vicini, P.
2016-03-01
A GPU-based low level (L0) trigger is currently integrated in the experimental setup of the RICH detector of the NA62 experiment to assess the feasibility of building more refined physics-related trigger primitives and thus improve the trigger discriminating power. To ensure the real-time operation of the system, a dedicated data transport mechanism has been implemented: an FPGA-based Network Interface Card (NaNet-10) receives data from detectors and forwards them with low, predictable latency to the memory of the GPU performing the trigger algorithms. Results of the ring-shaped hit patterns reconstruction will be reported and discussed.
Modeling heterogeneous processor scheduling for real time systems
NASA Technical Reports Server (NTRS)
Leathrum, J. F.; Mielke, R. R.; Stoughton, J. W.
1994-01-01
A new model is presented to describe dataflow algorithms implemented in a multiprocessing system. Called the resource/data flow graph (RDFG), the model explicitly represents cyclo-static processor schedules as circuits of processor arcs which reflect the order that processors execute graph nodes. The model also allows the guarantee of meeting hard real-time deadlines. When unfolded, the model identifies statically the processor schedule. The model therefore is useful for determining the throughput and latency of systems with heterogeneous processors. The applicability of the model is demonstrated using a space surveillance algorithm.
Real-time machine vision system using FPGA and soft-core processor
NASA Astrophysics Data System (ADS)
Malik, Abdul Waheed; Thörnberg, Benny; Meng, Xiaozhou; Imran, Muhammad
2012-06-01
This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27 MHz was used in our experiments to produce a video stream at the rate of 75 frames per second. Image component labeling and feature extraction modules were running in parallel with a total latency of 13 ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through Fast Simplex Link (FSL). The latency for computing distance and angle of the camera from the reference points was measured to be 2 ms on the MicroBlaze, running at a 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption for the designed system. The FPGA based machine vision system that we propose has high frame speed, low latency and a power consumption that is much lower than that of commercially available smart camera solutions.
The ATLAS Level-1 Calorimeter Trigger: PreProcessor implementation and performance
NASA Astrophysics Data System (ADS)
Åsman, B.; Achenbach, R.; Allbrooke, B. M. M.; Anders, G.; Andrei, V.; Büscher, V.; Bansil, H. S.; Barnett, B. M.; Bauss, B.; Bendtz, K.; Bohm, C.; Bracinik, J.; Brawn, I. P.; Brock, R.; Buttinger, W.; Caputo, R.; Caughron, S.; Cerrito, L.; Charlton, D. G.; Childers, J. T.; Curtis, C. J.; Daniells, A. C.; Davis, A. O.; Davygora, Y.; Dorn, M.; Eckweiler, S.; Edmunds, D.; Edwards, J. P.; Eisenhandler, E.; Ellis, K.; Ermoline, Y.; Föhlisch, F.; Faulkner, P. J. W.; Fedorko, W.; Fleckner, J.; French, S. T.; Gee, C. N. P.; Gillman, A. R.; Goeringer, C.; Hülsing, T.; Hadley, D. R.; Hanke, P.; Hauser, R.; Heim, S.; Hellman, S.; Hickling, R. S.; Hidvégi, A.; Hillier, S. J.; Hofmann, J. I.; Hristova, I.; Ji, W.; Johansen, M.; Keller, M.; Khomich, A.; Kluge, E.-E.; Koll, J.; Laier, H.; Landon, M. P. J.; Lang, V. S.; Laurens, P.; Lepold, F.; Lilley, J. N.; Linnemann, J. T.; Müller, F.; Müller, T.; Mahboubi, K.; Martin, T. A.; Mass, A.; Meier, K.; Meyer, C.; Middleton, R. P.; Moa, T.; Moritz, S.; Morris, J. D.; Mudd, R. D.; Narayan, R.; zur Nedden, M.; Neusiedl, A.; Newman, P. R.; Nikiforov, A.; Ohm, C. C.; Perera, V. J. O.; Pfeiffer, U.; Plucinski, P.; Poddar, S.; Prieur, D. P. F.; Qian, W.; Rieck, P.; Rizvi, E.; Sankey, D. P. C.; Schäfer, U.; Scharf, V.; Schmitt, K.; Schröder, C.; Schultz-Coulon, H.-C.; Schumacher, C.; Schwienhorst, R.; Silverstein, S. B.; Simioni, E.; Snidero, G.; Staley, R. J.; Stamen, R.; Stock, P.; Stockton, M. C.; Tan, C. L. A.; Tapprogge, S.; Thomas, J. P.; Thompson, P. D.; Thomson, M.; True, P.; Watkins, P. M.; Watson, A. T.; Watson, M. F.; Weber, P.; Wessels, M.; Wiglesworth, C.; Williams, S. L.
2012-12-01
The PreProcessor system of the ATLAS Level-1 Calorimeter Trigger (L1Calo) receives about 7200 analogue signals from the electromagnetic and hadronic components of the calorimetric detector system. Lateral division results in cells which are pre-summed to so-called Trigger Towers of size 0.1 × 0.1 along azimuth (φ) and pseudorapidity (η). The received calorimeter signals represent deposits of transverse energy. The system consists of 124 individual PreProcessor modules that digitise the input signals for each LHC collision, and provide energy and timing information to the digital processors of the L1Calo system, which identify physics objects forming much of the basis for the full ATLAS first level trigger decision. This paper describes the architecture of the PreProcessor, its hardware realisation, functionality, and performance.
Spacewire on Earth orbiting scatterometers
NASA Technical Reports Server (NTRS)
Bachmann, Alex; Lang, Minh; Lux, James; Steffke, Richard
2002-01-01
The need for a high speed, reliable and easy to implement communication link has led to the development of a space flight oriented version of IEEE 1355 called SpaceWire. SpaceWire is based on high-speed (200 Mbps) serial point-to-point links using Low Voltage Differential Signaling (LVDS). SpaceWire has provisions for routing messages between a large network of processors, using wormhole routing for low overhead and latency. Additionally, space qualified hybrids are available, which provide the Link layer to the user's bus. A test bed of multiple digital signal processor breadboards, demonstrating the ability to meet signal processing requirements for an orbiting scatterometer, has been implemented using three Astrium MCM-DSPs. Each breadboard consists of a Multi Chip Module (MCM) that combines a space qualified Digital Signal Processor and peripherals, including IEEE-1355 links. With the addition of appropriate physical layer interfaces and software on the DSP, the SpaceWire link is used to communicate between processors on the test bed, e.g. sending timing references, commands, status, and science data among the processors. Results are presented on development issues surrounding the use of SpaceWire in this environment, from physical layer implementation (cables, connectors, LVDS drivers) to diagnostic tools, driver firmware, and development methodology. The tools, methods, hardware and software challenges, and preliminary performance are investigated and discussed.
NASA Technical Reports Server (NTRS)
Dongarra, Jack
1998-01-01
This exploratory study initiated our inquiry into algorithms and applications that would benefit from a latency-tolerant approach to algorithm building, including the construction of new algorithms where appropriate. In a multithreaded execution, when a processor reaches a point where remote memory access is necessary, the request is sent out on the network and a context switch occurs to a new thread of computation. This effectively masks a long and unpredictable latency due to remote loads, thereby providing tolerance to remote access latency. We began to develop standards to profile various algorithm and application parameters, such as the degree of parallelism, granularity, precision, instruction set mix, interprocessor communication, latency, etc. These tools will continue to develop and evolve as the Information Power Grid environment matures. To provide a richer context for this research, the project also focused on issues of fault-tolerance and computation migration of numerical algorithms and software. During the initial phase we tried to increase our understanding of the bottlenecks in single processor performance. Our work began by developing an approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units. Based on the results we achieved in this study we are planning to study other architectures of interest, including development of cost models, and developing code generators appropriate to these architectures.
Huang, Kuan-Ju; Shih, Wei-Yeh; Chang, Jui Chung; Feng, Chih Wei; Fang, Wai-Chi
2013-01-01
This paper presents a pipeline VLSI design of a fast singular value decomposition (SVD) processor for a real-time electroencephalography (EEG) system based on on-line recursive independent component analysis (ORICA). Since SVD is used frequently in computations of the real-time EEG system, a low-latency and high-accuracy SVD processor is essential. During the EEG system process, the proposed SVD processor aims to solve the diagonal, inverse and inverse square root matrices of the target matrices in real time. Generally, SVD requires a huge amount of computation in hardware implementation. Therefore, this work proposes a novel design concept for data flow updating to assist the pipeline VLSI implementation. The SVD processor can greatly improve the feasibility of real-time EEG system applications such as brain computer interfaces (BCIs). The proposed architecture is implemented using TSMC 90 nm CMOS technology. The EEG raw data are sampled at 128 Hz. The core size of the SVD processor is 580 × 580 μm², and the operating frequency is 20 MHz. It consumes 0.774 mW of power during the 8-channel EEG system per execution time.
Method for prefetching non-contiguous data structures
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Brewster, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Takken, Todd E [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2009-05-05
A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching method for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
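The pointer-per-line prefetching idea described above can be illustrated with a small software model. This is only a sketch under assumptions that are not in the abstract: the names `MemoryLine`, `PointerPrefetcher` and `run_pattern`, the one-entry prefetch buffer, and the policy of setting each line's pointer to the most recently observed successor are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MemoryLine:
    data: int
    # Pointer naming the line to prefetch after this one (hypothetical field).
    next_hint: Optional[int] = None


class PointerPrefetcher:
    """Toy model: every line carries a pointer to the line to prefetch next."""

    def __init__(self, memory: Dict[int, MemoryLine]) -> None:
        self.memory = memory
        self.prefetched: Optional[int] = None  # one-entry buffer (assumption)
        self.last_addr: Optional[int] = None
        self.hits = 0
        self.misses = 0

    def access(self, addr: int) -> int:
        # Count a hit if the requested line was the one already prefetched.
        if addr == self.prefetched:
            self.hits += 1
        else:
            self.misses += 1
        # Learn: store this address as the successor of the previous line.
        if self.last_addr is not None:
            self.memory[self.last_addr].next_hint = addr
        self.last_addr = addr
        # Prefetch whichever line the stored pointer names (may be None).
        self.prefetched = self.memory[addr].next_hint
        return self.memory[addr].data


def run_pattern(pattern: List[int], repeats: int) -> PointerPrefetcher:
    """Replay a non-contiguous access pattern several times."""
    memory = {addr: MemoryLine(data=addr * 10) for addr in set(pattern)}
    prefetcher = PointerPrefetcher(memory)
    for _ in range(repeats):
        for addr in pattern:
            prefetcher.access(addr)
    return prefetcher
```

Replaying a scattered but repeating pattern such as `run_pattern([7, 2, 19, 4], repeats=3)` misses on the first pass, while the pointers are still being learned, and then hits on nearly every later access. That is the "non-contiguous, but repetitive" case the abstract targets.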
A low-latency pipeline for GRB light curve and spectrum using Fermi/GBM near real-time data
NASA Astrophysics Data System (ADS)
Zhao, Yi; Zhang, Bin-Bin; Xiong, Shao-Lin; Long, Xi; Zhang, Qiang; Song, Li-Ming; Sun, Jian-Chao; Wang, Yuan-Hao; Li, Han-Cheng; Bu, Qing-Cui; Feng, Min-Zi; Li, Zheng-Heng; Wen, Xing; Wu, Bo-Bing; Zhang, Lai-Yu; Zhang, Yong-Jie; Zhang, Shuang-Nan; Shao, Jian-Xiong
2018-05-01
Rapid response and short time latency are very important for Time Domain Astronomy, such as the observations of Gamma-ray Bursts (GRBs) and electromagnetic (EM) counterparts of gravitational waves (GWs). Based on near real-time Fermi/GBM data, we developed a low-latency pipeline to automatically calculate the temporal and spectral properties of GRBs. With this pipeline, some important parameters can be obtained, such as T90 and fluence, within ∼20 min after the GRB trigger. For ∼90% of GRBs, T90 and fluence are consistent with the GBM catalog results within 2σ errors. This pipeline has been used by the Gamma-ray Bursts Polarimeter (POLAR) and the Insight Hard X-ray Modulation Telescope (Insight-HXMT) to follow up the bursts of interest. For GRB 170817A, the first EM counterpart of GW events detected by Fermi/GBM and INTEGRAL/SPI-ACS, the pipeline gave T90 and spectral information 21 min after the GBM trigger, providing important information for POLAR and Insight-HXMT observations.
Method for triggering an action
Hall, David R.; Bartholomew, David B.; Johnson, Monte L.; Moon, Justin; Koehler, Roger O.
2006-10-17
A method for triggering an action of at least one downhole device on a downhole network integrated into a downhole tool string synchronized to an event comprises determining latency, sending a latency adjusted signal, and performing the action. The latency is determined between a control device and the at least one downhole device. The latency adjusted signal for triggering an action is sent to the downhole device. The action is performed downhole synchronized to the event. A preferred method for determining latency comprises the steps: a control device sends a first signal to the downhole device; after receiving the signal, the downhole device sends a response signal to the control device; and the control device analyzes the time from sending the signal to receiving the response signal.
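The latency-determination steps above (send a signal, wait for the response, time the exchange) and the latency-adjusted send can be sketched as follows. The function names, the symmetric-path assumption used to halve the round trip, and the optional `turnaround` term for the device's response delay are illustrative assumptions, not details taken from the patent abstract.

```python
import time
from typing import Callable


def measure_round_trip(send_and_wait: Callable[[], None],
                       clock: Callable[[], float] = time.monotonic) -> float:
    """Time from sending the first signal to receiving the response signal.

    `send_and_wait` is a hypothetical callable that sends the signal and
    blocks until the downhole device's response has been received.
    """
    start = clock()
    send_and_wait()
    return clock() - start


def one_way_latency(round_trip: float, turnaround: float = 0.0) -> float:
    """Estimate one-way latency, assuming a symmetric path (an assumption)."""
    return (round_trip - turnaround) / 2.0


def send_time_for_event(event_time: float, latency: float) -> float:
    """When to send the trigger so that it arrives at `event_time`."""
    return event_time - latency
```

With a measured round trip of 0.5 s and a 0.25 s device turnaround, the estimated one-way latency is 0.125 s, so a trigger meant to take effect at t = 100.0 s would be sent at t = 99.875 s.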
Low latency network and distributed storage for next generation HPC systems: the ExaNeSt project
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Cretaro, P.; Frezza, O.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Paolucci, P. S.; Pastorelli, E.; Pisani, F.; Simula, F.; Vicini, P.; Navaridas, J.; Chaix, F.; Chrysos, N.; Katevenis, M.; Papaeustathiou, V.
2017-10-01
With processor architecture evolution, the HPC market has undergone a paradigm shift. The adoption of low-cost, Linux-based clusters extended the reach of HPC from its roots in modelling and simulation of complex physical systems to a broader range of industries, from biotechnology, cloud computing, computer analytics and big data challenges to manufacturing sectors. In this perspective, the near future HPC systems can be envisioned as composed of millions of low-power computing cores, densely packed — meaning cooling by appropriate technology — with a tightly interconnected, low latency and high performance network and equipped with a distributed storage architecture. Each of these features — dense packing, distributed storage and high performance interconnect — represents a challenge, made all the harder by the need to solve them at the same time. These challenges lie as stumbling blocks along the road towards Exascale-class systems; the ExaNeSt project acknowledges them and tasks itself with investigating ways around them.
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Chiozzi, S.; Cretaro, P.; Cotta Ramusino, A.; Di Lorenzo, S.; Fantechi, R.; Fiorini, M.; Frezza, O.; Gianoli, A.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Piandani, R.; Piccini, M.; Pontisso, L.; Rossetti, D.; Simula, F.; Sozzi, M.; Vicini, P.
2017-03-01
This project aims to exploit the parallel computing power of a commercial Graphics Processing Unit (GPU) to implement fast pattern matching in the Ring Imaging Cherenkov (RICH) detector for the level 0 (L0) trigger of the NA62 experiment. In this approach, the ring-fitting algorithm is seedless, being fed with raw RICH data, with no previous information on the ring position from other detectors. Moreover, since the L0 trigger is provided with more elaborate information than a simple multiplicity number, it achieves a higher selection power. Two methods have been studied in order to reduce the data transfer latency from the readout boards of the detector to the GPU, i.e., the use of a dedicated NIC device driver with very low latency and a direct data transfer protocol from a custom FPGA-based NIC to the GPU. The performance of the system, developed through the FPGA approach, for multi-ring Cherenkov online reconstruction obtained during the NA62 physics runs is presented.
Graphics Processing Units for HEP trigger systems
NASA Astrophysics Data System (ADS)
Ammendola, R.; Bauce, M.; Biagioni, A.; Chiozzi, S.; Cotta Ramusino, A.; Fantechi, R.; Fiorini, M.; Giagu, S.; Gianoli, A.; Lamanna, G.; Lonardo, A.; Messina, A.; Neri, I.; Paolucci, P. S.; Piandani, R.; Pontisso, L.; Rescigno, M.; Simula, F.; Sozzi, M.; Vicini, P.
2016-07-01
General-purpose computing on GPUs (Graphics Processing Units) is emerging as a new paradigm in several fields of science, although so far applications have been tailored to the specific strengths of such devices as accelerators in offline computation. With the steady reduction of GPU latencies, and the increase in link and memory throughput, the use of such devices for real-time applications in high-energy physics data acquisition and trigger systems is becoming feasible. We will discuss the use of online parallel computing on GPUs for synchronous low-level triggers, focusing on the trigger system of the CERN NA62 experiment. The use of GPUs in higher-level trigger systems is also briefly considered.
Electrophysiological evidence of automatic early semantic processing.
Hinojosa, José A; Martín-Loeches, Manuel; Muñoz, Francisco; Casado, Pilar; Pozo, Miguel A
2004-01-01
This study investigates the automatic-controlled nature of early semantic processing by means of the Recognition Potential (RP), an event-related potential response that reflects lexical selection processes. For this purpose tasks differing in their processing requirements were used. Half of the participants performed a physical task involving a lower-upper case discrimination judgement (shallow processing requirements), whereas the other half carried out a semantic task, consisting in detecting animal names (deep processing requirements). Stimuli were identical in the two tasks. Reaction time measures revealed that the physical task was easier to perform than the semantic task. However, RP effects elicited by the physical and semantic tasks did not differ in either latency, amplitude, or topographic distribution. Thus, the results from the present study suggest that early semantic processing is automatically triggered whenever a linguistic stimulus enters the language processor.
Predicting Cost/Performance Trade-Offs for Whitney: A Commodity Computing Cluster
NASA Technical Reports Server (NTRS)
Becker, Jeffrey C.; Nitzberg, Bill; VanderWijngaart, Rob F.; Kutler, Paul (Technical Monitor)
1997-01-01
Recent advances in low-end processor and network technology have made it possible to build a "supercomputer" out of commodity components. We develop simple models of the NAS Parallel Benchmarks version 2 (NPB 2) to explore the cost/performance trade-offs involved in building a balanced parallel computer supporting a scientific workload. We develop closed form expressions detailing the number and size of messages sent by each benchmark. Coupling these with measured single processor performance, network latency, and network bandwidth, our models predict benchmark performance to within 30%. A comparison based on total system cost reveals that current commodity technology (200 MHz Pentium Pros with 100baseT Ethernet) is well balanced for the NPBs up to a total system cost of around $1,000,000.
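The kind of model the abstract describes, which combines message counts and sizes with measured compute time, network latency and bandwidth, can be illustrated with a generic linear communication model. This is a common textbook form and only an assumption here. The abstract does not give the authors' closed-form expressions, and the function and parameter names are hypothetical.

```python
def predicted_runtime(compute_s: float,
                      n_messages: int,
                      total_bytes: float,
                      latency_s: float,
                      bandwidth_bytes_per_s: float) -> float:
    """Generic linear model: compute time plus per-message and per-byte costs.

    Assumes no overlap between communication and computation, which is a
    simplification chosen for this sketch.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return (compute_s
            + n_messages * latency_s
            + total_bytes / bandwidth_bytes_per_s)
```

For example, 10 s of computation plus 1000 messages at 1 ms latency and 1 MB of traffic over a 1 MB/s link gives a predicted 12 s. The numbers are illustrative only and are not taken from the paper.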
NASA Astrophysics Data System (ADS)
Deng, B.; Xiao, L.; Zhao, X.; Baker, E.; Gong, D.; Guo, D.; He, H.; Hou, S.; Liu, C.; Liu, T.; Sun, Q.; Thomas, J.; Wang, J.; Xiang, A. C.; Yang, D.; Ye, J.; Zhou, W.
2018-05-01
Two optical data link data transmission Application Specific Integrated Circuits (ASICs), the baseline and its backup, have been designed for the ATLAS Liquid Argon (LAr) Calorimeter Phase-I trigger upgrade. The latency of each ASIC and that of its corresponding receiver implemented in a back-end Field-Programmable Gate Array (FPGA) are critical specifications. In this paper, we present the latency measurements and simulation of two ASICs. The measurement results indicate that both ASICs achieve their design goals and meet the latency specifications. The consistency between the simulation and measurements validates the ASIC latency characterization.
Point-to-Point Multicast Communications Protocol
NASA Technical Reports Server (NTRS)
Byrd, Gregory T.; Nakano, Russell; Delagi, Bruce A.
1987-01-01
This paper describes a protocol to support point-to-point interprocessor communications with multicast. Dynamic, cut-through routing with local flow control is used to provide a high-throughput, low-latency communications path between processors. In addition multicast transmissions are available, in which copies of a packet are sent to multiple destinations using common resources as much as possible. Special packet terminators and selective buffering are introduced to avoid a deadlock during multicasts. A simulated implementation of the protocol is also described.
The Watchdog Task: Concurrent error detection using assertions
NASA Technical Reports Server (NTRS)
Ersoz, A.; Andrews, D. M.; Mccluskey, E. J.
1985-01-01
The Watchdog Task, a software abstraction of the Watchdog-processor, is shown to be a powerful error detection tool with a great deal of flexibility and the advantages of watchdog techniques. A Watchdog Task system in Ada is presented; issues of recovery, latency, efficiency (communication) and preprocessing are discussed. Different applications, one of which is error detection on a single processor, are examined.
Accelerating list management for MPI.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hemmert, K. Scott; Rodrigues, Arun F.; Underwood, Keith Douglas
2005-07-01
The latency and throughput of MPI messages are critically important to a range of parallel scientific applications. In many modern networks, both of these performance characteristics are largely driven by the performance of a processor on the network interface. Because of the semantics of MPI, this embedded processor is forced to traverse a linked list of posted receives each time a message is received. As this list grows long, the latency of message reception grows and the throughput of MPI messages decreases. This paper presents a novel hardware feature to handle list management functions on a network interface. By moving functions such as list insertion, list traversal, and list deletion to the hardware unit, latencies are decreased by up to 20% in the zero length queue case with dramatic improvements in the presence of long queues. Similarly, the throughput is increased by up to 10% in the zero length queue case and by nearly 100% in the presence of queues of 30 messages.
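The cost of walking the posted-receive list on every incoming message can be seen in a small software model. This sketch is an assumption-laden illustration and not the hardware unit from the paper: the class name `PostedReceiveQueue` is hypothetical, a Python list stands in for the linked list, and the `ANY` wildcard is only loosely modeled on MPI's `MPI_ANY_SOURCE` and `MPI_ANY_TAG`.

```python
from typing import List, Optional, Tuple

ANY = -1  # wildcard, loosely modeled on MPI_ANY_SOURCE / MPI_ANY_TAG


class PostedReceiveQueue:
    """Toy model of the posted-receive list searched on message arrival."""

    def __init__(self) -> None:
        # Each entry is (source, tag, buffer_name), kept in posting order.
        self._entries: List[Tuple[int, int, str]] = []

    def post(self, source: int, tag: int, buffer_name: str) -> None:
        """List insertion: append a newly posted receive."""
        self._entries.append((source, tag, buffer_name))

    def match(self, source: int, tag: int) -> Tuple[Optional[str], int]:
        """List traversal and deletion for one incoming message.

        Returns (matched buffer or None, number of entries examined).
        The first matching entry in posting order wins and is removed.
        """
        for index, (src, tg, buf) in enumerate(self._entries):
            if src in (ANY, source) and tg in (ANY, tag):
                del self._entries[index]
                return buf, index + 1
        return None, len(self._entries)
```

With 30 receives posted, a message that matches only the last entry forces 30 comparisons, while one that matches the head needs a single comparison. This linear growth is the traversal cost the abstract attributes to long queues.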
NASA Astrophysics Data System (ADS)
Rizvi, Syed S.; Shah, Dipali; Riasat, Aasia
The Time Warp algorithm [3] offers a run time recovery mechanism that deals with causality errors. These run time recovery mechanisms consist of rollback, anti-message, and Global Virtual Time (GVT) techniques. For rollback, there is a need to compute GVT, which is used in discrete-event simulation to reclaim the memory, commit the output, detect the termination, and handle the errors. However, the computation of GVT requires dealing with the transient message problem and the simultaneous reporting problem. These problems can be dealt with efficiently by Samadi's algorithm [8], which works fine in the presence of causality errors. However, the performance of both the Time Warp and Samadi's algorithms depends on the latency involved in GVT computation. Both algorithms give poor latency for large simulation systems, especially in the presence of causality errors. To improve the latency and reduce the processor idle time, we implement tree and butterfly barriers with the optimistic algorithm. Our analysis shows that the use of synchronous barriers such as tree and butterfly with the optimistic algorithm not only minimizes the GVT latency but also minimizes the processor idle time.
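A butterfly barrier of the kind mentioned above can double as a minimum reduction, so that every processor ends up holding the smallest local virtual time. The sketch below simulates the exchange pattern sequentially. It assumes a power-of-two processor count and ignores transient (in-flight) messages, so it is an illustration of the communication pattern only and not the authors' GVT algorithm. The function name is hypothetical.

```python
from typing import List


def butterfly_min(local_values: List[float]) -> List[float]:
    """Simulate a butterfly exchange in which every processor keeps the minimum.

    In the round with a given stride, processor i pairs with processor
    i XOR stride. After log2(n) rounds all processors hold the global minimum.
    """
    n = len(local_values)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two processor count")
    values = list(local_values)
    stride = 1
    while stride < n:
        # All pairs exchange simultaneously, so build the new list from the old.
        values = [min(values[i], values[i ^ stride]) for i in range(n)]
        stride *= 2
    return values
```

For eight processors the loop runs three rounds, compared with the seven sequential steps of a naive gather to one processor, which is the motivation for using such barriers to cut synchronization latency.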
Data acquisition using the 168/E. [CERN ISR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carroll, J.T.; Cittolin, S.; Demoulin, M.
1983-03-01
Event sizes and data rates at the CERN antiproton-proton collider compose a formidable environment for a high level trigger. A system using three 168/E processors for experiment UA1 real-time event selection is described. With 168/E data memory expanded to 512K bytes, each processor holds a complete event allowing a FORTRAN trigger algorithm access to data from the entire detector. A smart CAMAC interface reads five Remus branches in parallel transferring one word to the target processor every 0.5 μs. The NORD host computer can simultaneously read an accepted event from another processor.
An "artificial retina" processor for track reconstruction at the full LHC crossing rate
NASA Astrophysics Data System (ADS)
Abba, A.; Bedeschi, F.; Caponio, F.; Cenci, R.; Citterio, M.; Cusimano, A.; Fu, J.; Geraci, A.; Grizzuti, M.; Lusardi, N.; Marino, P.; Morello, M. J.; Neri, N.; Ninci, D.; Petruzzo, M.; Piucci, A.; Punzi, G.; Ristori, L.; Spinella, F.; Stracka, S.; Tonelli, D.; Walsh, J.
2016-07-01
We present the latest results of an R&D study for a specialized processor capable of reconstructing, in a silicon pixel detector, high-quality tracks from high-energy collision events at 40 MHz. The processor applies a highly parallel pattern-recognition algorithm inspired by the quick detection of edges in the mammalian visual cortex. After a detailed study of a real-detector application, demonstrating that online reconstruction of offline-quality tracks is feasible at 40 MHz with sub-microsecond latency, we are implementing a prototype using common high-bandwidth FPGA devices.
An "artificial retina" processor for track reconstruction at the full LHC crossing rate
Abba, A.; F. Bedeschi; Caponio, F.; ...
2015-10-23
Here, we present the latest results of an R&D study for a specialized processor capable of reconstructing, in a silicon pixel detector, high-quality tracks from high-energy collision events at 40 MHz. The processor applies a highly parallel pattern-recognition algorithm inspired by the quick detection of edges in the mammalian visual cortex. After a detailed study of a real-detector application, demonstrating that online reconstruction of offline-quality tracks is feasible at 40 MHz with sub-microsecond latency, we are implementing a prototype using common high-bandwidth FPGA devices.
Low Latency DESDynI Data Products for Disaster Response, Resource Management and Other Applications
NASA Technical Reports Server (NTRS)
Doubleday, Joshua R.; Chien, Steve A.; Lou, Yunling
2011-01-01
We are developing onboard processor technology targeted at the L-band SAR instrument onboard the planned DESDynI mission to enable formation of SAR images onboard, opening possibilities for near-real-time data products to augment full data streams. Several image processing and/or interpretation techniques are being explored as possible direct-broadcast products for use by agencies in need of low-latency data, responsible for disaster mitigation and assessment, resource management, agricultural development, shipping, etc. Data collected through UAVSAR (L-band) serve as a surrogate for the future DESDynI instrument. We have explored surface water extent as a tool for flooding response, and disturbance images on polarimetric backscatter of repeat pass imagery potentially useful for structural collapse (earthquake), mud/land/debris-slides, etc. We have also explored building vegetation and snow/ice classifiers, via support vector machines utilizing quad-pol backscatter, cross-pol phase, and a number of derivatives (radar vegetation index, dielectric estimates, etc.). We share our qualitative and quantitative results thus far.
First Results of an “Artificial Retina” Processor Prototype
Cenci, Riccardo; Bedeschi, Franco; Marino, Pietro; ...
2016-11-15
We report on the performance of a specialized processor capable of reconstructing charged particle tracks in a realistic LHC silicon tracker detector, at the same speed of the readout and with sub-microsecond latency. The processor is based on an innovative pattern-recognition algorithm, called “artificial retina algorithm”, inspired by the vision system of mammals. A prototype of the processor has been designed, simulated, and implemented on Tel62 boards equipped with high-bandwidth Altera Stratix III FPGA devices. Also, the prototype is the first step towards a real-time track reconstruction device aimed at processing complex events of high-luminosity LHC experiments at 40 MHz crossing rate.
First Results of an “Artificial Retina” Processor Prototype
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cenci, Riccardo; Bedeschi, Franco; Marino, Pietro
We report on the performance of a specialized processor capable of reconstructing charged particle tracks in a realistic LHC silicon tracker detector, at the same speed of the readout and with sub-microsecond latency. The processor is based on an innovative pattern-recognition algorithm, called “artificial retina algorithm”, inspired by the vision system of mammals. A prototype of the processor has been designed, simulated, and implemented on Tel62 boards equipped with high-bandwidth Altera Stratix III FPGA devices. Also, the prototype is the first step towards a real-time track reconstruction device aimed at processing complex events of high-luminosity LHC experiments at 40 MHz crossing rate.
2004-07-01
steadily for the past fifteen years, while memory latency and bandwidth have improved much more slowly. For example, Intel processor clock rates have... processor and memory performance) all greatly restrict the ability to achieve high levels of performance for science, engineering, and national...sub-nuclear distances. Guide experiments to identify transition from quantum chromodynamics to quark-gluon plasma. Accelerator Physics Accurate
Avoiding and tolerating latency in large-scale next-generation shared-memory multiprocessors
NASA Technical Reports Server (NTRS)
Probst, David K.
1993-01-01
A scalable solution to the memory-latency problem is necessary to prevent the large latencies of synchronization and memory operations inherent in large-scale shared-memory multiprocessors from reducing high performance. We distinguish latency avoidance and latency tolerance. Latency is avoided when data is brought to nearby locales for future reference. Latency is tolerated when references are overlapped with other computation. Latency-avoiding locales include: processor registers, data caches used temporally, and nearby memory modules. Tolerating communication latency requires parallelism, allowing the overlap of communication and computation. Latency-tolerating techniques include: vector pipelining, data caches used spatially, prefetching in various forms, and multithreading in various forms. Relaxing the consistency model permits increased use of avoidance and tolerance techniques. Each model is a mapping from the program text to sets of partial orders on program operations; it is a convention about which temporal precedences among program operations are necessary. Information about temporal locality and parallelism constrains the use of avoidance and tolerance techniques. Suitable architectural primitives and compiler technology are required to exploit the increased freedom to reorder and overlap operations in relaxed models.
Optoelectronic-cache memory system architecture.
Chiarulli, D M; Levitan, S P
1996-05-10
We present an investigation of the architecture of an optoelectronic cache that can integrate terabit optical memories with the electronic caches associated with high-performance uniprocessors and multiprocessors. The use of optoelectronic-cache memories enables these terabit technologies to provide transparently low-latency secondary memory with frame sizes comparable with disk pages but with latencies that approach those of electronic secondary-cache memories. This enables the implementation of terabit memories with effective access times comparable with the cycle times of current microprocessors. The cache design is based on the use of a smart-pixel array and combines parallel free-space optical input-output to-and-from optical memory with conventional electronic communication to the processor caches. This cache and the optical memory system to which it will interface provide a large random-access memory space that has a lower overall latency than that of magnetic disks and disk arrays. In addition, as a consequence of the high-bandwidth parallel input-output capabilities of optical memories, fault service times for the optoelectronic cache are substantially less than those currently achievable with any rotational media.
A new, ultra-low latency data transmission protocol for Earthquake Early Warning Systems
NASA Astrophysics Data System (ADS)
Hill, P.; Hicks, S. P.; McGowan, M.
2016-12-01
One measure used to assess the performance of Earthquake Early Warning Systems (EEWS) is the delay time between earthquake origin and issued alert. EEWS latency is dependent on a number of sources (e.g. P-wave propagation, digitisation, transmission, receiver processing, triggering, event declaration). Many regional seismic networks use the SEEDlink protocol; however, packet size is fixed to 512-byte miniSEED records, resulting in transmission latencies of >0.5 s. Data packetisation is seen as one of the main sources of delays in EEWS (Brown et al., 2011). Optimising data-logger and telemetry configurations is a cost-effective strategy to improve EEWS alert times (Behr et al., 2015). Digitisers with smaller, selectable packets can result in faster alerts (Sokos et al., 2016). We propose a new seismic protocol for regional seismic networks benefiting low-latency applications such as EEWS. The protocol, based on Güralp's existing GDI-link format is an efficient and flexible method to exchange data between seismic stations and data centers for a range of network configurations. The main principle is to stream data sample-by-sample instead of fixed-length packets to minimise transmission latency. Self-adaptive packetisation with compression maximises available telemetry bandwidth. Highly flexible metadata fields within GDI-link are compatible with existing miniSEED definitions. Data is sent as integers or floats, supporting a wide range of data formats, including discrete parameters such as Pd & τC for on-site earthquake early warning. Other advantages include: streaming station state-of-health information, instrument control, support of backfilling and fail-over strategies during telemetry outages. Based on tests carried out on the Güralp Minimus data-logger, we show our new protocol can reduce transmission latency to as low as 1 ms. The low-latency protocol is currently being implemented with common processing packages. 
The results of these tests will help to highlight latency levels that can be achieved with next-generation EEWS.
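The link between packet size and transmission latency noted above can be made concrete with a one-line calculation: the first sample in a fixed-length packet cannot leave the digitiser until enough later samples have arrived to fill the packet. The function below is a minimal sketch of that fill delay only. Its name and parameters are hypothetical, and it ignores compression, network transit and processing time.

```python
def packet_fill_latency(samples_per_packet: int, sample_rate_hz: float) -> float:
    """Time for one packet's worth of samples to accumulate.

    This is the delay between the first sample being taken and the packet
    being complete and ready to send.
    """
    if samples_per_packet < 1 or sample_rate_hz <= 0:
        raise ValueError("need at least one sample and a positive sample rate")
    return samples_per_packet / sample_rate_hz
```

With illustrative numbers (not taken from the abstract), a packet holding 100 samples at 100 samples per second takes 1 s to fill, whereas sending each sample as it is produced reduces the fill delay to 0.01 s.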
Hiding the Disk and Network Latency of Out-of-Core Visualization
NASA Technical Reports Server (NTRS)
Ellsworth, David
2001-01-01
This paper describes an algorithm that improves the performance of application-controlled demand paging for out-of-core visualization by hiding the latency of reading data from either local disks or disks on remote servers. The performance improvements come from better overlapping the computation with the page reading process, and by performing multiple page reads in parallel. The paper includes measurements that show that the new multithreaded paging algorithm decreases the time needed to compute visualizations by one third when using one processor and reading data from local disk. The time needed when using one processor and reading data from remote disk decreased by two thirds. Visualization runs using data from remote disk actually ran faster than ones using data from local disk because the remote runs were able to make use of the remote server's high performance disk array.
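The two ideas named above, issuing several page reads in parallel and overlapping them with computation, can be sketched with a thread pool. This is a generic illustration and not the paper's paging algorithm. `process_pages`, `read_page` and `compute` are hypothetical names, and the sketch queues all reads up front, whereas a real pager would bound how far ahead it reads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

Page = TypeVar("Page")
Result = TypeVar("Result")


def process_pages(page_ids: Sequence[int],
                  read_page: Callable[[int], Page],
                  compute: Callable[[Page], Result],
                  max_parallel_reads: int = 4) -> List[Result]:
    """Read pages on worker threads while the caller's thread computes."""
    results: List[Result] = []
    with ThreadPoolExecutor(max_workers=max_parallel_reads) as pool:
        # Queue every read; at most `max_parallel_reads` run at a time.
        futures = [pool.submit(read_page, pid) for pid in page_ids]
        for future in futures:
            # Computing on page k overlaps with the reads of later pages.
            results.append(compute(future.result()))
    return results
```

Results come back in the order of `page_ids`, because the futures are consumed in submission order even though the reads themselves may finish out of order.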
Latency Hiding in Dynamic Partitioning and Load Balancing of Grid Computing Applications
NASA Technical Reports Server (NTRS)
Das, Sajal K.; Harvey, Daniel J.; Biswas, Rupak
2001-01-01
The Information Power Grid (IPG) concept developed by NASA aims to provide a metacomputing platform for large-scale distributed computations by hiding the intricacies of a highly heterogeneous environment while maintaining adequate security. In this paper, we propose a latency-tolerant partitioning scheme that dynamically balances processor workloads on the IPG and minimizes data movement and runtime communication. By simulating an unsteady adaptive mesh application on a wide area network, we study the performance of our load balancer under the Globus environment. The number of IPG nodes, the number of processors per node, and the interconnect speeds are parameterized to derive conditions under which the IPG would be suitable for parallel distributed processing of such applications. Experimental results demonstrate that effective solutions are achieved when the IPG nodes are connected by a high-speed asynchronous interconnection network.
Bhanot, Gyan [Princeton, NJ; Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Takken, Todd E [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2009-09-08
Class network routing is implemented in a network such as a computer network comprising a plurality of parallel compute processors at nodes thereof. Class network routing allows a compute processor to broadcast a message to a range (one or more) of other compute processors in the computer network, such as processors in a column or a row. Normally this type of operation requires a separate message to be sent to each processor. With class network routing pursuant to the invention, a single message is sufficient, which generally reduces the total number of messages in the network as well as the latency to do a broadcast. Class network routing is also applied to dense matrix inversion algorithms on distributed memory parallel supercomputers with hardware class function (multicast) capability. This is achieved by exploiting the fact that the communication patterns of dense matrix inversion can be served by hardware class functions, which results in faster execution times.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murphy, Richard C.
2009-09-01
This report details the accomplishments of the 'Building More Powerful Less Expensive Supercomputers Using Processing-In-Memory (PIM)' LDRD ('PIM LDRD', number 105809) for FY07-FY09. Latency dominates all levels of supercomputer design. Within a node, increasing memory latency, relative to processor cycle time, limits CPU performance. Between nodes, the same increase in relative latency impacts scalability. Processing-In-Memory (PIM) is an architecture that directly addresses this problem using enhanced chip fabrication technology and machine organization. PIMs combine high-speed logic and dense, low-latency, high-bandwidth DRAM, and lightweight threads that tolerate latency by performing useful work during memory transactions. This work examines the potential of PIM-based architectures to support mission critical Sandia applications and an emerging class of more data intensive informatics applications. This work has resulted in a stronger architecture/implementation collaboration between 1400 and 1700. Additionally, key technology components have impacted vendor roadmaps, and we are in the process of pursuing these new collaborations. This work has the potential to impact future supercomputer design and construction, reducing power and increasing performance. This final report is organized as follows: this summary chapter discusses the impact of the project (Section 1), provides an enumeration of publications and other public discussion of the work (Section 1), and concludes with a discussion of future work and impact from the project (Section 1). The appendix contains reprints of the refereed publications resulting from this work.
NASA Technical Reports Server (NTRS)
Katz, Randy H.; Anderson, Thomas E.; Ousterhout, John K.; Patterson, David A.
1991-01-01
Rapid advances in high performance computing are making possible more complete and accurate computer-based modeling of complex physical phenomena, such as weather front interactions, dynamics of chemical reactions, numerical aerodynamic analysis of airframes, and ocean-land-atmosphere interactions. Many of these 'grand challenge' applications are as demanding of the underlying storage system, in terms of their capacity and bandwidth requirements, as they are of the computational power of the processor. A global view of the Earth's ocean chlorophyll and land vegetation requires over 2 terabytes of raw satellite image data. In this paper, we describe our planned research program in high capacity, high bandwidth storage systems. The project has four overall goals. First, we will examine new methods for high capacity storage systems, made possible by low cost, small form factor magnetic and optical tape systems. Second, access to the storage system will be low latency and high bandwidth. To achieve this, we must interleave data transfer at all levels of the storage system, including devices, controllers, servers, and communications links. Latency will be reduced by extensive caching throughout the storage hierarchy. Third, we will provide effective management of a storage hierarchy, extending the techniques already developed for the Log Structured File System. Finally, we will construct a prototype high capacity file server, suitable for use on the National Research and Education Network (NREN). Such research must be a cornerstone of any coherent program in high performance computing and communications.
Schultz, Benjamin G; van Vugt, Floris T
2016-12-01
Timing abilities are often measured by having participants tap their finger along with a metronome and presenting tap-triggered auditory feedback. These experiments predominantly use electronic percussion pads combined with software (e.g., FTAP or Max/MSP) that records responses and delivers auditory feedback. However, these setups involve unknown latencies between tap onset and auditory feedback and can sometimes miss responses or record multiple, superfluous responses for a single tap. These issues may distort measurements of tapping performance or affect the performance of the individual. We present an alternative setup using an Arduino microcontroller that addresses these issues and delivers low-latency auditory feedback. We validated our setup by having participants (N = 6) tap on a force-sensitive resistor pad connected to the Arduino and on an electronic percussion pad with various levels of force and tempi. The Arduino delivered auditory feedback through a pulse-width modulation (PWM) pin connected to a headphone jack or a wave shield component. The Arduino's PWM (M = 0.6 ms, SD = 0.3) and wave shield (M = 2.6 ms, SD = 0.3) demonstrated significantly lower auditory feedback latencies than the percussion pad (M = 9.1 ms, SD = 2.0), FTAP (M = 14.6 ms, SD = 2.8), and Max/MSP (M = 15.8 ms, SD = 3.4). The PWM and wave shield latencies were also significantly less variable than those from FTAP and Max/MSP. The Arduino missed significantly fewer taps, and recorded fewer superfluous responses, than the percussion pad. The Arduino captured all responses, whereas at lower tapping forces, the percussion pad missed more taps. Regardless of tapping force, the Arduino outperformed the percussion pad. Overall, the Arduino is a high-precision, low-latency, portable, and affordable tool for auditory experiments.
Error recovery in shared memory multiprocessors using private caches
NASA Technical Reports Server (NTRS)
Wu, Kun-Lung; Fuchs, W. Kent; Patel, Janak H.
1990-01-01
The problem of recovering from processor transient faults in shared memory multiprocessor systems is examined. A user-transparent checkpointing and recovery scheme using private caches is presented. Processes can recover from errors due to faulty processors by restarting from the checkpointed computation state. Implementation techniques using checkpoint identifiers and recovery stacks are examined as a means of reducing performance degradation in processor utilization during normal execution. This cache-based checkpointing technique prevents rollback propagation, provides rapid recovery, and can be integrated into standard cache coherence protocols. An analytical model is used to estimate the relative performance of the scheme during normal execution. Extensions to take error latency into account are presented.
Silicon Nanophotonics for Many-Core On-Chip Networks
NASA Astrophysics Data System (ADS)
Mohamed, Moustafa
The number of cores in many-core architectures is scaling to unprecedented levels, requiring ever-increasing communication capacity. Traditionally, architects have pursued higher throughput at the expense of latency. This trend has become problematic for performance in many-core architectures. Moreover, power consumption is increasing with system scaling, mandating nontraditional solutions. Nanophotonics can address these problems, offering benefits on the three frontiers of many-core processor design: latency, bandwidth, and power. Nanophotonics leverages circuit-switching flow control, allowing low latency; in addition, the power consumption of optical links is significantly lower than that of their electrical counterparts at intermediate and long link lengths. Finally, through wavelength-division multiplexing, we can keep the high bandwidth trends without sacrificing the throughput. This thesis focuses on realizing nanophotonics for communication in many-core architectures at different design levels, considering reliability challenges that our fabrication and measurements reveal. First, we study how to design on-chip networks for low latency, low power, and high bandwidth by exploiting the full potential of nanophotonics. The design process considers device-level limitations and capabilities on one hand, and system-level demands in terms of power and performance on the other. The design involves the choice of devices, the design of the optical link, the topology, the arbitration technique, and the routing mechanism. Next, we address the problem of reliability in on-chip networks. Reliability failures not only degrade performance but can block communication. Hence, we propose a reliability-aware design flow and present a reliability management technique based on this flow to address reliability in the system. In the proposed flow, reliability is modeled and analyzed at the device, architecture, and system levels. Our reliability management technique is superior to existing solutions in terms of power and performance. In fact, our solution can scale to a thousand cores with low overhead.
NASA Astrophysics Data System (ADS)
Salathé, Yves; Kurpiers, Philipp; Karg, Thomas; Lang, Christian; Andersen, Christian Kraglund; Akin, Abdulkadir; Krinner, Sebastian; Eichler, Christopher; Wallraff, Andreas
2018-03-01
Quantum computing architectures rely on classical electronics for control and readout. Employing classical electronics in a feedback loop with the quantum system allows us to stabilize states, correct errors, and realize specific feedforward-based quantum computing and communication schemes such as deterministic quantum teleportation. These feedback and feedforward operations are required to be fast compared to the coherence time of the quantum system to minimize the probability of errors. We present a field-programmable-gate-array-based digital signal processing system capable of real-time quadrature demodulation, a determination of the qubit state, and a generation of state-dependent feedback trigger signals. The feedback trigger is generated with a latency of 110 ns with respect to the timing of the analog input signal. We characterize the performance of the system for an active qubit initialization protocol based on the dispersive readout of a superconducting qubit and discuss potential applications in feedback and feedforward algorithms.
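The processing chain named in this abstract (quadrature demodulation, state determination, state-dependent trigger) can be sketched in software. This is only an illustration of the signal-processing steps, not the FPGA implementation. The 25 MHz intermediate frequency, 100 MHz sample rate, zero threshold and all function names are assumptions made for the example.

```python
import math


def demodulate(samples, f_if, f_s):
    """Return the (I, Q) quadratures of a real signal at intermediate frequency f_if."""
    n = len(samples)
    i_sum = 0.0
    q_sum = 0.0
    for k, s in enumerate(samples):
        phase = 2.0 * math.pi * f_if * k / f_s
        i_sum += s * math.cos(phase)
        q_sum -= s * math.sin(phase)
    # The factor 2 restores the amplitude when n spans whole periods.
    return 2.0 * i_sum / n, 2.0 * q_sum / n


def qubit_state(i_value, threshold=0.0):
    """Assumed discrimination rule: I below the threshold is read as state 1."""
    return 1 if i_value < threshold else 0


def feedback_trigger(samples, f_if=25e6, f_s=100e6):
    """Fire the (hypothetical) feedback trigger only when state 1 is detected."""
    i_value, _ = demodulate(samples, f_if, f_s)
    return qubit_state(i_value) == 1


def tone(phase, n=40, f_if=25e6, f_s=100e6):
    """Synthetic readout signal with unit amplitude and a state-dependent phase."""
    return [math.cos(2.0 * math.pi * f_if * k / f_s + phase) for k in range(n)]


if __name__ == "__main__":
    print(feedback_trigger(tone(0.0)), feedback_trigger(tone(math.pi)))
```

In this toy model the two qubit states are encoded as a phase of 0 or π, so a single threshold on I separates them. A real dispersive readout would need calibrated rotation and threshold values.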
Feasibility of optically interconnected parallel processors using wavelength division multiplexing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deri, R.J.; De Groot, A.J.; Haigh, R.E.
1996-03-01
New national security demands require enhanced computing systems for nearly ab initio simulations of extremely complex systems and analyzing unprecedented quantities of remote sensing data. This computational performance is being sought using parallel processing systems, in which many less powerful processors are ganged together to achieve high aggregate performance. Such systems require increased capability to communicate information between individual processor and memory elements. As it is likely that the limited performance of today's electronic interconnects will prevent the system from achieving its ultimate performance, there is great interest in using fiber optic technology to improve interconnect communication. However, little information is available to quantify the requirements on fiber optical hardware technology for this application. Furthermore, we have sought to explore interconnect architectures that use the complete communication richness of the optical domain rather than using optics as a simple replacement for electronic interconnects. These considerations have led us to study the performance of a moderate size parallel processor with optical interconnects using multiple optical wavelengths. We quantify the bandwidth, latency, and concurrency requirements which allow a bus-type interconnect to achieve scalable computing performance using up to 256 nodes, each operating at GFLOP performance. Our key conclusion is that scalable performance, to approximately 150 GFLOPS, is achievable for several scientific codes using an optical bus with a small number of WDM channels (8 to 32), only one WDM channel received per node, and achievable optoelectronic bandwidth and latency requirements. 21 refs., 10 figs.
FPGA cluster for high-performance AO real-time control system
NASA Astrophysics Data System (ADS)
Geng, Deli; Goodsell, Stephen J.; Basden, Alastair G.; Dipper, Nigel A.; Myers, Richard M.; Saunter, Chris D.
2006-06-01
Whilst the high throughput and low latency requirements for the next generation AO real-time control systems have posed a significant challenge to von Neumann architecture processor systems, the Field Programmable Gate Array (FPGA) has emerged as a long term solution with high performance on throughput and excellent predictability on latency. Moreover, FPGA devices have highly capable programmable interfacing, which leads to a more highly integrated system. Nevertheless, a single FPGA is still not enough: multiple FPGA devices need to be clustered to perform the required subaperture processing and the reconstruction computation. In an AO real-time control system, the memory bandwidth is often the bottleneck of the system, simply because a vast amount of supporting data, e.g. pixel calibration maps and the reconstruction matrix, needs to be accessed within a short period. The cluster, as a general computing architecture, has excellent scalability in processing throughput, memory bandwidth, memory capacity, and communication bandwidth. Problems such as task distribution, node communication, and system verification are discussed.
A multi-port 10GbE PCIe NIC featuring UDP offload and GPUDirect capabilities.
NASA Astrophysics Data System (ADS)
Ammendola, Roberto; Biagioni, Andrea; Frezza, Ottorino; Lamanna, Gianluca; Lo Cicero, Francesca; Lonardo, Alessandro; Martinelli, Michele; Stanislao Paolucci, Pier; Pastorelli, Elena; Pontisso, Luca; Rossetti, Davide; Simula, Francesco; Sozzi, Marco; Tosoratto, Laura; Vicini, Piero
2015-12-01
NaNet-10 is a four-port 10GbE PCIe Network Interface Card designed for low-latency real-time operations with GPU systems. To this end, the design includes a UDP offload module, for fast and clock-cycle deterministic handling of the transport layer protocol, plus a GPUDirect P2P/RDMA engine for low-latency communication with NVIDIA Tesla GPU devices. A dedicated module (Multi-Stream) can optionally process input UDP streams before data is delivered through PCIe DMA to their destination devices, re-organizing data from different streams to guarantee computational optimization. NaNet-10 is going to be integrated in the NA62 CERN experiment in order to assess the suitability of GPGPU systems as real-time triggers; results and lessons learned while performing this activity will be reported herein.
NASA Astrophysics Data System (ADS)
Erez, Mattan; Dally, William J.
Stream processors, like other multi-core architectures, partition their functional units and storage into multiple processing elements. In contrast to typical architectures, which contain symmetric general-purpose cores and a cache hierarchy, stream processors have a significantly leaner design. Stream processors are specifically designed for the stream execution model, in which applications have large amounts of explicit parallel computation, structured and predictable control, and memory accesses that can be performed at a coarse granularity. Applications in the streaming model are expressed in a gather-compute-scatter form, yielding programs with explicit control over transferring data to and from on-chip memory. Relying on these characteristics, which are common to many media processing and scientific computing applications, stream architectures redefine the boundary between software and hardware responsibilities, with software bearing much of the complexity required to manage concurrency, locality, and latency tolerance. Thus, stream processors have minimal control, consisting of fetching medium- and coarse-grained instructions and executing them directly on the many ALUs. Moreover, the on-chip storage hierarchy of stream processors is under explicit software control, as is all communication, eliminating the need for complex reactive hardware mechanisms.
Eight-Channel Digital Signal Processor and Universal Trigger Module
NASA Astrophysics Data System (ADS)
Skulski, Wojtek; Wolfs, Frank
2003-04-01
A 10-bit, 8-channel, 40 megasamples per second digital signal processor and waveform digitizer DDC-8 (nicknamed Universal Trigger Module) is presented. The digitizer features 8 analog inputs, 1 analog output for a reconstructed analog waveform, 16 NIM logic inputs, 8 NIM logic outputs, and a pool of 16 TTL logic lines which can be individually configured as either inputs or outputs. The first application of this device is to enhance the present trigger electronics for PHOBOS at RHIC. The status of the development and the first results are presented. Possible applications of the new device are discussed. Supported by the NSF grant PHY-0072204.
Papera, Massimiliano; Richards, Anne
2016-05-01
Exogenous allocation of attentional resources allows the visual system to encode and maintain representations of stimuli in visual working memory (VWM). However, limits in the processing capacity to allocate resources can prevent unexpected visual stimuli from gaining access to VWM and thereby to consciousness. Using a novel approach to create unbiased stimuli of increasing saliency, we investigated visual processing during a visual search task in individuals who show a high or low propensity to neglect unexpected stimuli. When propensity to inattention is high, ERP recordings show a diminished amplification concomitantly with a decrease in theta band power during the N1 latency, followed by a poor target enhancement during the N2 latency. Furthermore, a later modulation in the P3 latency was also found in individuals showing propensity to visual neglect, suggesting that more effort is required for conscious maintenance of visual information in VWM. Effects during early stages of processing (N80 and P1) were also observed, suggesting that sensitivity to contrasts and medium-to-high spatial frequencies may be modulated by low-level saliency (albeit no statistical group differences were found). In accordance with the Global Workspace Model, our data indicate that a lack of resources in low-level processors and visual attention may be responsible for the failure to "ignite" a state of high-level activity spread across several brain areas that is necessary for stimuli to access awareness. These findings may aid in the development of diagnostic tests and interventions to detect or reduce the propensity to visual neglect of unexpected stimuli. © 2016 Society for Psychophysiological Research.
Parallel Agent-Based Simulations on Clusters of GPUs and Multi-Core Processors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aaby, Brandon G; Perumalla, Kalyan S; Seal, Sudip K
2010-01-01
An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms and present a novel analytical model of the tradeoff. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphical processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, delivering over 100-fold improvement in runtime for certain benchmark ABMS application scenarios with several million agents. This speed improvement is obtained on our system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular simulator in Java. Thus, the overall execution of our current work is over four orders of magnitude faster when executed on multiple GPUs.
1980-08-01
reduce I/O latency. Periodically, the polling processor would hand off the polling task to a different processor which would then become the active...DMA (2 X 1.5 microsecond/word) since both halves of the SKI carry on simultaneous DMA transfers in a looped configuration. The difference between 3.0...intelligent DMA. The only difference between this approach and the intelligent DMA is that the true intelligent DMA (approach (1)) would not use up I
Measurement of fault latency in a digital avionic mini processor, part 2
NASA Technical Reports Server (NTRS)
Mcgough, J.; Swern, F.
1983-01-01
The results of fault injection experiments utilizing a gate-level emulation of the central processor unit of the Bendix BDX-930 digital computer are described. Several earlier programs were reprogrammed, expanding the instruction set to capitalize on the full power of the BDX-930 computer. As a final demonstration of fault coverage, an extensive, 3-axis, high performance flight control computation was added. The stages in the development of a CPU self-test program emphasizing the relationship between fault coverage, speed, and quantity of instructions were demonstrated.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth; Sewell, Christopher; Usher, William
Here, one of the most critical challenges for high-performance computing (HPC) scientific visualization is execution on massively threaded processors. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Our current production scientific visualization software is not designed for these new types of architectures. To address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth; Sewell, Christopher; Usher, William
Execution on massively threaded processors is one of the most critical challenges for high-performance computing (HPC) scientific visualization. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Moreover, our current production scientific visualization software is not designed for these new types of architectures. In order to address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
Cache-based error recovery for shared memory multiprocessor systems
NASA Technical Reports Server (NTRS)
Wu, Kun-Lung; Fuchs, W. Kent; Patel, Janak H.
1989-01-01
A multiprocessor cache-based checkpointing and recovery scheme for recovering from transient processor errors in a shared-memory multiprocessor with private caches is presented. New implementation techniques that use checkpoint identifiers and recovery stacks to reduce performance degradation in processor utilization during normal execution are examined. This cache-based checkpointing technique prevents rollback propagation, provides for rapid recovery, and can be integrated into standard cache coherence protocols. An analytical model is used to estimate the relative performance of the scheme during normal execution. Extensions that take error latency into account are presented.
A high-speed DAQ framework for future high-level trigger and event building clusters
NASA Astrophysics Data System (ADS)
Caselle, M.; Ardila Perez, L. E.; Balzer, M.; Dritschler, T.; Kopmann, A.; Mohr, H.; Rota, L.; Vogelgesang, M.; Weber, M.
2017-03-01
Modern data acquisition and trigger systems require a throughput of several GB/s and latencies of the order of microseconds. To satisfy such requirements, a heterogeneous readout system based on FPGA readout cards and GPU-based computing nodes coupled by InfiniBand has been developed. The incoming data from the back-end electronics is delivered directly into the internal memory of GPUs through a dedicated peer-to-peer PCIe communication. High performance DMA engines have been developed for direct communication between FPGAs and GPUs using "DirectGMA (AMD)" and "GPUDirect (NVIDIA)" technologies. The proposed infrastructure is a candidate for future generations of event building clusters, high-level trigger filter farms and low-level trigger systems. In this paper the heterogeneous FPGA-GPU architecture will be presented and its performance discussed.
Jolij, Jacob; Scholte, H Steven; van Gaal, Simon; Hodgson, Timothy L; Lamme, Victor A F
2011-12-01
Humans largely guide their behavior by their visual representation of the world. Recent studies have shown that visual information can trigger behavior within 150 msec, suggesting that visually guided responses to external events, in fact, precede conscious awareness of those events. However, is such a view correct? By using a texture discrimination task, we show that the brain relies on long-latency visual processing in order to guide perceptual decisions. Decreasing stimulus saliency leads to selective changes in long-latency visually evoked potential components reflecting scene segmentation. These latency changes are accompanied by almost equal changes in simple RTs and points of subjective simultaneity. Furthermore, we find a strong correlation between individual RTs and the latencies of scene segmentation related components in the visually evoked potentials, showing that the processes underlying these late brain potentials are critical in triggering a response. However, using the same texture stimuli in an antisaccade task, we found that reflexive, but erroneous, prosaccades, but not antisaccades, can be triggered by earlier visual processes. In other words: The brain can act quickly, but decides late. Differences between our study and earlier findings suggesting that action precedes conscious awareness can be explained by assuming that task demands determine whether a fast and unconscious, or a slower and conscious, representation is used to initiate a visually guided response.
NASA Astrophysics Data System (ADS)
Suarez, Hernan; Zhang, Yan R.
2015-05-01
New radar applications need to run complex algorithms and process large quantities of data to generate useful information for users. This situation has motivated the search for better processing solutions that include low-power, high-performance processors, efficient algorithms, and high-speed interfaces. In this work, hardware implementations of adaptive pulse compression for real-time transceiver optimization are presented; they are based on a System-on-Chip architecture for Xilinx devices. This study also evaluates the performance of dedicated coprocessors as hardware accelerator units to speed up computing-intensive tasks such as matrix multiplication and matrix inversion, which are essential for solving the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through the high performance AXI buses, to perform floating-point operations, control the processing blocks, and communicate with an external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operations using a Ku-band testbed together with a low-cost channel emulator for different types of waveforms.
Six-port optical switch for cluster-mesh photonic network-on-chip
NASA Astrophysics Data System (ADS)
Jia, Hao; Zhou, Ting; Zhao, Yunchou; Xia, Yuhao; Dai, Jincheng; Zhang, Lei; Ding, Jianfeng; Fu, Xin; Yang, Lin
2018-05-01
Photonic network-on-chip for high-performance multi-core processors has attracted substantial interest in recent years as it offers a systematic method to meet the demand for large bandwidth, low latency and low power dissipation. In this paper we demonstrate a non-blocking six-port optical switch for cluster-mesh photonic network-on-chip. The architecture is constructed by replacing three optical switching units of a typical Spanke-Benes network with optical waveguide crossings. Compared with the Spanke-Benes network, the number of optical switching units is reduced by 20%, while the connectivity of the routing paths is maintained. In this way the footprint and power consumption can be reduced, at the expense of network latency performance in some cases. The device is realized by 12 thermally tuned silicon Mach-Zehnder optical switching units. Its theoretical spectral responses are evaluated by establishing a numerical model. The experimental spectral responses are also characterized, which indicates that the optical signal-to-noise ratios of the optical switch are larger than 13.5 dB in the wavelength range from 1525 nm to 1565 nm. A data transmission experiment with a data rate of 32 Gbps is implemented for each optical link.
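The 20% figure and the count of 12 switching units are consistent with the switch count of the reference topology. A planar Spanke-Benes network with N ports uses N(N-1)/2 two-by-two switching units; this is a standard result that the abstract does not state. The function names below are illustrative.

```python
def spanke_benes_units(ports):
    """Number of 2x2 switching units in an N-port planar Spanke-Benes network."""
    return ports * (ports - 1) // 2


def after_replacement(ports, replaced):
    """Remaining switching units and fractional reduction after swapping some for crossings."""
    total = spanke_benes_units(ports)
    return total - replaced, replaced / total


remaining, fraction = after_replacement(6, 3)
print(f"{spanke_benes_units(6)} units reduced to {remaining} ({fraction:.0%} fewer)")
```

For six ports the reference network has 15 units; replacing three with waveguide crossings leaves the 12 Mach-Zehnder units reported for the device.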
Ibáñez, Jaime; Monge-Pereira, Esther; Molina-Rueda, Francisco; Serrano, J. I.; del Castillo, Maria D.; Cuesta-Gómez, Alicia; Carratalá-Tejada, María; Cano-de-la-Cuerda, Roberto; Alguacil-Diego, Isabel M.; Miangolarra-Page, Juan C.; Pons, Jose L.
2017-01-01
Background: The association between motor-related cortical activity and peripheral stimulation with temporal precision has been proposed as a possible intervention to facilitate cortico-muscular pathways and thereby improve motor rehabilitation after stroke. Previous studies with patients have provided evidence that it is possible to implement brain-machine interface platforms able to decode motor intentions and use this information to trigger afferent stimulation and movement assistance. This study tests the use of a low-latency movement intention detector to drive functional electrical stimulation assisting upper-limb reaching movements of patients with stroke. Methods: An eight-session intervention on the paretic arm was tested on four chronic stroke patients over 1 month. Patients' intentions to initiate reaching movements were decoded from electroencephalographic signals and used to trigger functional electrical stimulation that in turn assisted patients to do the task. The analysis of the patients' ability to interact with the intervention platform, the assessment of changes in patients' clinical scales and of the system usability, and the kinematic analysis of the reaching movements before and after the intervention period were carried out to study the potential impact of the intervention. Results: On average 66.3 ± 15.7% of trials (resting intervals followed by self-initiated movements) were correctly classified with the decoder of motor intentions. The average detection latency (with respect to the movement onsets estimated with gyroscopes) was 112 ± 278 ms. The upper-extremity Fugl-Meyer index increased by 11.5 ± 5.5 points with the intervention. The stroke impact scale also increased. In line with changes in clinical scales, kinematics of reaching movements showed a trend toward lower compensatory mechanisms. Patients' assessment of the therapy reflected their acceptance of the proposed intervention protocol.
Conclusions: According to results obtained here with a small sample of patients, Brain-Machine Interfaces providing low-latency support to upper-limb reaching movements in patients with stroke are a reliable and usable solution for motor rehabilitation interventions with potential functional benefits. PMID:28367109
Fast Inference of Deep Neural Networks in FPGAs for Particle Physics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duarte, Javier; Han, Song; Harris, Philip
Recent results at the Large Hadron Collider (LHC) have pointed to enhanced physics capabilities through the improvement of the real-time event processing techniques. Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of the use of such techniques in low-latency, low-power FPGA hardware has only just begun. FPGA-based trigger and data acquisition (DAQ) systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles and novel measurements of the Higgs boson. While we focus on a specific example, the lessons are far-reaching. We develop a package based on High-Level Synthesis (HLS) called hls4ml to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to identify the problems in particle physics that would benefit from performing neural network inference with FPGAs. For our example jet substructure model, we fit well within the available resources of modern FPGAs with a latency on the scale of 100 ns.
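To make the link between network hyperparameters and FPGA resources concrete, the sketch below counts the multiplications in a small fully connected network and the number of parallel multipliers needed if each one is time-shared. It is a back-of-the-envelope illustration and not the hls4ml API: the layer widths and the `reuse` parameter are assumptions chosen for the example and do not come from the abstract.

```python
import math
from typing import Sequence


def dense_multiplications(widths: Sequence[int]) -> int:
    """Multiplications for one inference through a fully connected network."""
    return sum(n_in * n_out for n_in, n_out in zip(widths, widths[1:]))


def multipliers_needed(multiplications: int, reuse: int) -> int:
    """Parallel multipliers when each one is used `reuse` times per inference."""
    return math.ceil(multiplications / reuse)


# Hypothetical layer widths: inputs, three hidden layers, outputs.
widths = [16, 64, 32, 32, 5]
total = dense_multiplications(widths)
for reuse in (1, 2, 4):
    print(reuse, multipliers_needed(total, reuse))
```

Under these assumptions, a larger reuse value lowers the multiplier count at the cost of more clock cycles per inference, which is the kind of resource-versus-latency trade-off such a scan would explore.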
Ikeda, M; Kurokawa, K; Maruyama, Y
1994-06-01
Ca(2+)-mediated Ca2+ spikes were analysed in fura-2-loaded megakaryocytes. Direct Ca2+ loading using whole-cell dialysis induced an all-or-none Ca2+ spike on top of a tonic increase in cellular Ca2+ concentration ([Ca2+]i) with a latency of 3-7 s. The latency decreased with increasingly higher concentrations of Ca2+ in the dialysing solution. Spike size and its initiation did not correlate with the tonic level of [Ca2+]i. Thapsigargin completely abolished the Ca(2+)-induced spike initiation, suggesting that Ca2+ spikes originate from thapsigargin-sensitive Ca2+ pools. An inhibitor of phosphatidylinositide-specific phospholipase C (PLC), 2-nitro-4-carboxyphenyl-N,N-diphenyl-carbamate prolonged the latency without changes of spike size in most cases (6/9 cells), but abolished the spike initiation in the other cells (3/9). The results suggest that an increase in [Ca2+]i charges up the inositol-1,4,5-trisphosphate-(InsP3)- and thapsigargin-sensitive Ca2+ pools which progressively sensitize to low or slightly elevated levels of InsP3 by the action of Ca(2+)-dependent PLC until a critical Ca2+ content is reached, and then the Ca2+ spike is triggered. Thus, the limiting step of Ca2+ spike triggering is the initial filling process and the level of InsP3 in megakaryocytes.
NASA Astrophysics Data System (ADS)
Meng, Xiangting; Chapman, John; Levin, Daniel; Dai, Tiesheng; Zhu, Junjie; Zhou, Bing; Um Atlas Group Team
2016-03-01
The ATLAS Muon Spectrometer Phase-I (and Phase-II) upgrade includes the BIS78 muon trigger detector project: two sets of eight very thin Resistive Plate Chambers (tRPCs) combined with small Monitored Drift Tube (MDT) chambers in the pseudorapidity region 1 < |η| < 1.3. The tRPCs will comprise a triplet of readout layers in each of the eta and azimuthal phi coordinates, with about 400 readout strips per layer. The anticipated hit rate is 100-200 kHz per strip. Digitization of the strip signals will be done by 32-channel CERN HPTDC chips. The HPTDC is a highly configurable ASIC designed by the CERN Microelectronics group. It can work in both trigger and trigger-less modes and can be read out in parallel or serially. For Phase-I operation, a stringent latency requirement of 43 bunch crossings (1075 ns) is imposed. The latency budget for the front-end digitization must be kept to a minimal value, ideally less than 350 ns. We conducted detailed HPTDC latency simulations using the behavioral Verilog code from the CERN group. We will report the results of these simulations run for the anticipated detector operating environment and for various HPTDC configurations.
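The numbers in this abstract fit together with simple arithmetic, sketched below. The 25 ns bunch spacing follows from 1075 ns / 43; the assumption that all 32 channels of a chip are connected to strips at the quoted rate is a simplification made for this example, and the function names are illustrative.

```python
BUNCH_CROSSING_NS = 25.0  # 1075 ns / 43 bunch crossings


def latency_ns(bunch_crossings: int) -> float:
    """Convert a latency expressed in bunch crossings to nanoseconds."""
    return bunch_crossings * BUNCH_CROSSING_NS


def chip_hit_rate_hz(channels: int, rate_per_strip_hz: float) -> float:
    """Aggregate hit rate on one chip, assuming every channel is connected."""
    return channels * rate_per_strip_hz


total_ns = latency_ns(43)            # overall Phase-I latency limit
frontend_ns = 350.0                  # target for front-end digitization
remaining_ns = total_ns - frontend_ns

low = chip_hit_rate_hz(32, 100e3)    # lower end of the anticipated rate
high = chip_hit_rate_hz(32, 200e3)   # upper end of the anticipated rate

# Mean number of hits arriving on one chip during a 350 ns window.
hits_in_window = high * frontend_ns * 1e-9

print(total_ns, remaining_ns, low, high, round(hits_in_window, 2))
```

With these inputs, a fully connected chip sees between 3.2 and 6.4 MHz of hits, and about 725 ns of the budget remains for everything downstream of the digitization.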
Reconfigurable PCI Express cards for low-latency data transport in HEP experiments
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Cretaro, P.; Frezza, O.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Paolucci, P. S.; Pastorelli, E.; Pontisso, L.; Simula, F.; Vicini, P.
2017-01-01
State-of-the-art technology supports the High Energy Physics community in addressing the problem of managing an overwhelming amount of experimental data. From the point of view of communication between the detectors' readout system and computing nodes, the critical issues are the following: latency, moving data in a short and deterministic amount of time; bandwidth, guaranteeing the maximum capability of the link and communication protocol adopted; endpoint consolidation, tight aggregation of channels on a single board. This contribution describes the status and performance of the NaNet project, whose goal is the design of a family of FPGA-based PCIe network interface cards. The efforts of the team are focused on implementing a low-latency, real-time data transport mechanism between the board's multi-channel network system and the memories of CPU and GPU accelerators on the host. Several opportunities concerning technical solutions and scientific applications have been explored: NaNet-1 with a single GbE I/O interface, and NaNet-10, offering four 10GbE ports, for activities related to the GPU-based real-time trigger of the NA62 experiment at CERN; NaNet^3, with four 2.5 Gbit optical channels, developed for the KM3NeT-ITALIA underwater neutrino telescope.
VTK-m: Accelerating the Visualization Toolkit for Massively Threaded Architectures
Moreland, Kenneth; Sewell, Christopher; Usher, William; ...
2016-05-09
Execution on massively threaded processors is one of the most critical challenges for high-performance computing (HPC) scientific visualization. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Moreover, our current production scientific visualization software is not designed for these new types of architectures. To address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architectures.
Michael H. L. S. Wang; Cancelo, Gustavo; Green, Christopher; ...
2016-06-25
Here, we explore the Micron Automata Processor (AP) as a suitable commodity technology that can address the growing computational needs of pattern recognition in High Energy Physics (HEP) experiments. A toy detector model is developed for which an electron track confirmation trigger based on the Micron AP serves as a test case. Although primarily meant for high speed text-based searches, we demonstrate a proof of concept for the use of the Micron AP in a HEP trigger application.
PixonVision real-time video processor
NASA Astrophysics Data System (ADS)
Puetter, R. C.; Hier, R. G.
2007-09-01
PixonImaging LLC and DigiVision, Inc. have developed a real-time video processor, the PixonVision PV-200, based on the patented Pixon method for image deblurring and denoising, and DigiVision's spatially adaptive contrast enhancement processor, the DV1000. The PV-200 can process NTSC and PAL video in real time with a latency of 1 field (1/60th of a second), remove the effects of aerosol scattering from haze, mist, smoke, and dust, improve spatial resolution by up to 2x, decrease noise by up to 6x, and increase local contrast by up to 8x. A newer version of the processor, the PV-300, is now in prototype form and can handle high definition video. Both the PV-200 and PV-300 are FPGA-based processors, which could be spun into ASICs if desired. Obvious applications of these processors include applications in the DOD (tanks, aircraft, and ships), homeland security, intelligence, surveillance, and law enforcement. If developed into an ASIC, these processors will be suitable for a variety of portable applications, including gun sights, night vision goggles, binoculars, and guided munitions. This paper presents a variety of examples of PV-200 processing, including examples appropriate to border security, battlefield applications, port security, and surveillance from unmanned aerial vehicles.
Software for embedded processors: Problems and solutions
NASA Astrophysics Data System (ADS)
Bogaerts, J. A. C.
1990-08-01
Data acquisition systems in HEP experiments use a wide spectrum of computers to cope with two major problems: high event rates and a large data volume. They do this by using special fast trigger processors at the source to reduce the event rate by several orders of magnitude. The next stage of a data acquisition system consists of a network of fast but conventional microprocessors which are embedded in high-speed bus systems where data is still further reduced, filtered and merged. In the final stage complete events are farmed out to another collection of processors, which reconstruct the events and perhaps achieve a further event rejection by a small factor, prior to recording onto magnetic tape. Detectors are monitored by analyzing a fraction of the data. This may be done for individual detectors at an early stage of the data acquisition or it may be delayed until the complete events are available. A network of workstations is used for monitoring, displays and run control. Software for trigger processors must have a simple structure. Rejection algorithms are carefully optimized, and overheads introduced by system software cannot be tolerated. The embedded microprocessors have to co-operate, and need to be synchronized with the preceding and following stages. Real-time kernels are typically used to solve synchronization and communication problems. Applications are usually coded in C, which is reasonably efficient and allows direct control over low-level hardware functions. Event reconstruction software is very similar or even identical to offline software, predominantly written in FORTRAN. With the advent of powerful RISC processors, and with manufacturers tending to adopt open bus architectures, there is a move towards commercial processors and hence the introduction of the UNIX operating system. Building and controlling such a heterogeneous data acquisition system puts a heavy strain on the software.
Communication is now as important as CPU capacity and I/O bandwidth, the traditional key parameters of a HEP data acquisition system. Software engineering and real-time system simulation tools are becoming indispensable for the design of future data acquisition systems.
Gu, Xiaochun; Chen, Wei; Volkow, Nora D; Koretsky, Alan P; Du, Congwu; Pan, Yingtian
2018-06-26
The role of astrocytes in neurovascular coupling (NVC) is unclear. Here, we applied a multimodality imaging approach to concomitantly measure synchronized neuronal or astrocytic Ca2+ and hemodynamic changes in the mouse somatosensory cortex at rest and during sensory electrical stimulation. Strikingly, we found that low-frequency stimulation (0.3-1 Hz), which consistently evokes fast neuronal Ca2+ transients (6.0 ± 2.7 ms latency) that always precede vascular responses, does not always elicit astrocytic Ca2+ transients (313 ± 65 ms latency). However, the magnitude of the hemodynamic response is increased when astrocytic transients occur, suggesting a facilitatory role of astrocytes in NVC. High-frequency stimulation (5-10 Hz) consistently evokes a large, delayed astrocytic Ca2+ accumulation (3.48 ± 0.09 s latency) that is temporally associated with vasoconstriction, suggesting a role for astrocytes in resetting NVC. At rest, neuronal, but not astrocytic, Ca2+ fluctuations correlate with hemodynamic low-frequency oscillations. Taken together, these results support a role for astrocytes in modulating, but not triggering, NVC.
Optimization on fixed low latency implementation of the GBT core in FPGA
NASA Astrophysics Data System (ADS)
Chen, K.; Chen, H.; Wu, W.; Xu, H.; Yao, L.
2017-07-01
In the upgrade of the ATLAS experiment [1], the front-end electronics components are subjected to a large radiation background. Meanwhile, high-speed optical links are required for data transmission between the on-detector and off-detector electronics. The GBT architecture and the Versatile Link (VL) project were designed by CERN to support bidirectional high-speed data transmission at a 4.8 Gbps line rate, known as the GBT link [2]. In the ATLAS upgrade, besides the links to the on-detector electronics, the GBT link is also used between different off-detector systems. The GBTX ASIC is designed for the on-detector front-end; correspondingly, for the off-detector electronics, the GBT architecture is implemented in Field Programmable Gate Arrays (FPGAs). CERN launched the GBT-FPGA project to provide examples for different types of FPGA [3]. In the ATLAS upgrade framework, the Front-End LInk eXchange (FELIX) system [4, 5] is used to interface the front-end electronics of several ATLAS subsystems. The GBT link is used between them to transfer the detector data and the timing, trigger, control and monitoring information. The trigger signal distributed in the down-link from FELIX to the front-end requires a fixed and low latency. In this paper, several optimizations of the GBT-FPGA IP core are introduced to achieve a lower fixed latency. For FELIX, a common firmware will be used to interface different front-ends with support for both GBT modes: the forward error correction mode and the wide mode. The modified GBT-FPGA core can switch between the GBT modes without FPGA reprogramming. The system clock distribution of the multi-channel FELIX firmware is also discussed in this paper.
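The 4.8 Gbps line rate translates into a fixed number of bits per LHC bunch crossing, which is why the link suits fixed-latency trigger distribution. The sketch below works this out and compares the payload of the two GBT modes. The 40 MHz frame rate and the per-mode field widths are assumptions taken from general GBT descriptions rather than from this abstract, and should be checked against the GBT-FPGA documentation.

```python
LINE_RATE_BPS = 4_800_000_000   # 4.8 Gbps, from the abstract
FRAME_RATE_HZ = 40_000_000      # assumed: one frame per 25 ns bunch crossing

frame_bits = LINE_RATE_BPS // FRAME_RATE_HZ

# Assumed field widths in bits for each mode (not stated in the abstract).
MODES = {
    "forward error correction": {"header": 4, "slow_control": 4, "data": 80, "fec": 32},
    "wide": {"header": 4, "slow_control": 4, "data": 112, "fec": 0},
}


def data_rate_bps(mode: str) -> int:
    """User data bandwidth of a mode, given its assumed data field width."""
    return MODES[mode]["data"] * FRAME_RATE_HZ


for name, fields in MODES.items():
    # Every mode must fill the same fixed-length frame.
    assert sum(fields.values()) == frame_bits
    print(name, data_rate_bps(name) / 1e9, "Gbps")
```

Under these assumptions both modes occupy the same 120-bit frame, so switching modes trades error-correction bits for payload without changing the frame timing.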
Multi-petascale highly efficient parallel supercomputer
Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen -Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng
2015-07-14
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources. This enables adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis and, preferably, adaptive partitioning of functions in accordance with the various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that maximizes the throughput of packet communications between nodes and minimizes latency.
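The five-dimensional torus mentioned above can be illustrated with a short sketch of neighbour lookup and hop counting. The torus extents used here are hypothetical, since the abstract gives no dimensions, and the function names are illustrative.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Nodes one hop away along each dimension, with wrap-around."""
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            moved = list(node)
            moved[axis] = (node[axis] + step) % size
            neighbors.append(tuple(moved))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimum hop count, taking the shorter way round each ring."""
    return sum(min((x - y) % n, (y - x) % n) for x, y, n in zip(a, b, dims))


dims = (4, 4, 4, 4, 4)          # hypothetical extent of each dimension
origin = (0, 0, 0, 0, 0)

print(len(set(torus_neighbors(origin, dims))))
print(torus_hops(origin, (3, 2, 1, 0, 2), dims))
```

Each node has two links per dimension, so ten direct neighbours in five dimensions, and the wrap-around links keep the worst-case hop count at half the ring length per dimension.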
Towards energy-efficient photonic interconnects
NASA Astrophysics Data System (ADS)
Demir, Yigit; Hardavellas, Nikos
2015-03-01
Silicon photonics has emerged as a promising solution to meet the growing demand for high-bandwidth, low-latency, and energy-efficient on-chip and off-chip communication in many-core processors. However, current silicon-photonic interconnect designs for many-core processors waste a significant amount of power because (a) lasers are always on, even during periods of interconnect inactivity, and (b) microring resonators employ heaters which consume a significant amount of power just to overcome thermal variations and maintain communication on the photonic links, especially in a 3D-stacked design. The problem of high laser power consumption is particularly important as lasers typically have very low energy efficiency, and photonic interconnects often remain underutilized both in scientific computing (compute-intensive execution phases underutilize the interconnect) and in server computing (servers in Google-scale datacenters have a typical utilization of less than 30%). We address the high laser power consumption by proposing EcoLaser+, a laser control scheme that saves energy by predicting the interconnect activity and opportunistically turning the on-chip laser off when possible, and also by scaling the width of the communication link based on a runtime prediction of the expected message length. Our laser control scheme can save 62-92% of the laser energy and improve the energy efficiency of a many-core processor with negligible performance penalty. We address the high trimming (heating) power consumption of the microrings by proposing insulation methods that reduce the impact of localized heating induced by highly active components on the 3D-stacked logic die.
Cache write generate for parallel image processing on shared memory architectures.
Wittenbrink, C M; Somani, A K; Chen, C H
1996-01-01
We investigate cache write generate, our cache mode invention. We demonstrate that for parallel image processing applications, the new mode improves main memory bandwidth, CPU efficiency, cache hits, and cache latency. We use register level simulations validated by the UW-Proteus system. Many memory, cache, and processor configurations are evaluated.
NASA Astrophysics Data System (ADS)
Fang, Juan; Hao, Xiaoting; Fan, Qingwen; Chang, Zeqing; Song, Shuying
2017-05-01
In a heterogeneous multi-core architecture, CPU and GPU processors are integrated on the same chip, which poses a new challenge for last-level cache management. In this architecture, CPU and GPU applications execute concurrently and both access the last-level cache (LLC). CPUs and GPUs have different memory access characteristics, so they differ in their sensitivity to LLC capacity. For many CPU applications, a reduced share of the LLC can lead to significant performance degradation. In contrast, GPU applications can tolerate increased memory access latency when there is sufficient thread-level parallelism. Taking into account the memory latency tolerance of GPU programs, this paper presents a method that lets GPU applications access memory directly, leaving more LLC space for CPU applications; this improves the performance of CPU applications without affecting the performance of GPU applications. When the CPU application is cache-sensitive and the GPU application is cache-insensitive, the overall performance of the system is improved significantly.
Sub-nanosecond clock synchronization and trigger management in the nuclear physics experiment AGATA
NASA Astrophysics Data System (ADS)
Bellato, M.; Bortolato, D.; Chavas, J.; Isocrate, R.; Rampazzo, G.; Triossi, A.; Bazzacco, D.; Mengoni, D.; Recchia, F.
2013-07-01
The new-generation spectrometer AGATA, the Advanced GAmma Tracking Array, requires sub-nanosecond clock synchronization among readout and front-end electronics modules that may lie hundreds of meters apart. We call GTS (Global Trigger and Synchronization System) the infrastructure responsible for precise clock synchronization and for the trigger management of AGATA. It is made of a central trigger processor and nodes, connected in a tree structure by means of optical fibers operated at 2 Gb/s. The GTS tree handles the synchronization and the trigger data flow, whereas the trigger processor analyses and eventually validates the trigger primitives centrally. Sub-nanosecond synchronization is achieved by measuring two different types of round-trip times and by automatically correcting for phase-shift differences. For a tree of depth two, the peak-to-peak clock jitter at each leaf is 70 ps; the mean phase difference is 180 ps, while the standard deviation of this phase difference, namely the phase equalization repeatability, is 20 ps. The GTS system has run flawlessly for the two-year-long AGATA campaign, held at the INFN Legnaro National Laboratories, Italy, where five triple clusters of the AGATA sub-array were coupled with a variety of ancillary detectors.
The New Feedback Control System of RFX-mod Based on the MARTe Real-Time Framework
NASA Astrophysics Data System (ADS)
Manduchi, G.; Luchetta, A.; Soppelsa, A.; Taliercio, C.
2014-06-01
A real-time system has been successfully used since 2004 in the RFX-mod nuclear fusion experiment to control the position of the plasma and its Magneto Hydrodynamic (MHD) modes. However, its latency and the limited computation power of the processors used prevented the use of more aggressive control algorithms. Therefore a new hardware and software architecture has been designed to overcome such limitations and to provide a shorter latency and much increased computation power. The new system is based on a Linux multi-core server and uses MARTe, a framework for real-time control which is gaining interest in the fusion community.
Optimizing latency in Xilinx FPGA implementations of the GBT
NASA Astrophysics Data System (ADS)
Muschter, S.; Baron, S.; Bohm, C.; Cachemiche, J.-P.; Soos, C.
2010-12-01
The GigaBit Transceiver (GBT) [1] system has been developed to replace the Timing, Trigger and Control (TTC) system [2], currently used by LHC, as well as to provide data transmission between on-detector and off-detector components in future sLHC detectors. A VHDL version of the GBT-SERDES, designed for FPGAs, was released in March 2010 as a GBT-FPGA Starter Kit for future GBT users and for off-detector GBT implementation [3]. This code was optimized for resource utilization [4], as the GBT protocol is very demanding. It was not, however, optimized for latency — which will be a critical parameter when used in the trigger path. The GBT-FPGA Starter Kit firmware was first analyzed in terms of latency by looking at the separate components of the VHDL version. Once the parts which contribute most to the latency were identified and modified, two possible optimizations were chosen, resulting in a latency reduced by a factor of three. The modifications were also analyzed in terms of logic utilization. The latency optimization results were compared with measurement results from a Virtex 6 ML605 development board [5] equipped with a XC6VLX240T with speedgrade-1 and the package FF1156. Bit error rate tests were also performed to ensure an error free operation. The two final optimizations were analyzed for utilization and compared with the original code, distributed in the Starter Kit.
A VME-based software trigger system using UNIX processors
NASA Astrophysics Data System (ADS)
Atmur, Robert; Connor, David F.; Molzon, William
1997-02-01
We have constructed a distributed computing platform with eight processors to assemble and filter data from digitization crates. The filtered data were transported to a tape-writing UNIX computer via ethernet. Each processor ran a UNIX operating system and was installed in its own VME crate. Each VME crate contained dual-port memories which interfaced with the digitizers. Using standard hardware and software (VME and UNIX) allows us to select from a wide variety of non-proprietary products and makes upgrades simpler, if they are necessary.
NASA Astrophysics Data System (ADS)
Heckman, S.
2015-12-01
Modern lightning locating systems (LLS) provide real-time monitoring and early warning of lightning activities. In addition, LLS provide valuable data for statistical analysis in lightning research. It is important to know the performance of such LLS. In the present study, the performance of the Earth Networks Total Lightning Network (ENTLN) is studied using rocket-triggered lightning data acquired at the International Center for Lightning Research and Testing (ICLRT), Camp Blanding, Florida. In the present study, 18 flashes triggered at ICLRT in 2014 were analyzed; they comprise 78 negative cloud-to-ground return strokes. The geometric mean, median, minimum, and maximum for the peak currents of the 78 return strokes are 13.4 kA, 13.6 kA, 3.7 kA, and 38.4 kA, respectively. The peak currents represent typical subsequent return strokes in natural cloud-to-ground lightning. Earth Networks has developed a new data processor to improve the performance of their network. In this study, results are presented for the ENTLN data using the old processor (originally reported in 2014) and the ENTLN data simulated using the new processor. The flash detection efficiency, stroke detection efficiency, percentage of misclassification, median location error, median peak current estimation error, and median absolute peak current estimation error for the originally reported data from the old processor are 100%, 94%, 49%, 271 m, 5%, and 13%, respectively, and those for the simulated data using the new processor are 100%, 99%, 9%, 280 m, 11%, and 15%, respectively. The use of the new processor resulted in higher stroke detection efficiency and a lower percentage of misclassification. It is worth noting that the slight differences in median location error, median peak current estimation error, and median absolute peak current estimation error for the two processors are due to the fact that the new processor detected more return strokes than the old processor.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kirkham, R.; Siddons, D.; Dunn, P.A.
2010-06-23
The Maia detector system is engineered for energy dispersive x-ray fluorescence spectroscopy and elemental imaging at photon rates exceeding 10^7/s, integrated scanning of samples for pixel transit times as small as 50 μs and high definition images of 10^8 pixels and real-time processing of detected events for spectral deconvolution and online display of pure elemental images. The system developed by CSIRO and BNL combines a planar silicon 384 detector array, application-specific integrated circuits for pulse shaping and peak detection and sampling and optical data transmission to an FPGA-based pipelined, parallel processor. This paper describes the system and the underpinning engineering solutions.
NaNet: a configurable NIC bridging the gap between HPC and real-time HEP GPU computing
NASA Astrophysics Data System (ADS)
Lonardo, A.; Ameli, F.; Ammendola, R.; Biagioni, A.; Cotta Ramusino, A.; Fiorini, M.; Frezza, O.; Lamanna, G.; Lo Cicero, F.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Pontisso, L.; Rossetti, D.; Simeone, F.; Simula, F.; Sozzi, M.; Tosoratto, L.; Vicini, P.
2015-04-01
NaNet is an FPGA-based PCIe Network Interface Card (NIC) design with GPUDirect and Remote Direct Memory Access (RDMA) capabilities featuring a configurable and extensible set of network channels. The design currently supports both standard channels, GbE (1000BASE-T) and 10GbE (10GBASE-R), and custom channels, 34 Gbps APElink and 2.5 Gbps deterministic-latency KM3link, but its modularity allows for straightforward inclusion of other link technologies. The GPUDirect feature combined with a transport layer offload module and a data stream processing stage makes NaNet a low-latency NIC suitable for real-time GPU processing. In this paper we describe the NaNet architecture and its performance, presenting two of its use cases: the GPU-based low-level trigger for the RICH detector in the NA62 experiment at CERN and the on-/off-shore data transport system for the KM3NeT-IT underwater neutrino telescope.
Connecting Restricted, High-Availability, or Low-Latency Resources to a Seamless Global Pool for CMS
NASA Astrophysics Data System (ADS)
Balcas, J.; Bockelman, B.; Hufnagel, D.; Hurtado Anampa, K.; Jayatilaka, B.; Khan, F.; Larson, K.; Letts, J.; Mascheroni, M.; Mohapatra, A.; Marra Da Silva, J.; Mason, D.; Perez-Calero Yzquierdo, A.; Piperov, S.; Tiradani, A.; Verguilov, V.; CMS Collaboration
2017-10-01
The connection of diverse and sometimes non-Grid enabled resource types to the CMS Global Pool, which is based on HTCondor and glideinWMS, has been a major goal of CMS. These resources range in type from a high-availability, low latency facility at CERN for urgent calibration studies, called the CAF, to a local user facility at the Fermilab LPC, allocation-based computing resources at NERSC and SDSC, opportunistic resources provided through the Open Science Grid, commercial clouds, and others, as well as access to opportunistic cycles on the CMS High Level Trigger farm. In addition, we have provided the capability to give priority to local users of beyond WLCG pledged resources at CMS sites. Many of the solutions employed to bring these diverse resource types into the Global Pool have common elements, while some are very specific to a particular project. This paper details some of the strategies and solutions used to access these resources through the Global Pool in a seamless manner.
A TTC upgrade proposal using bidirectional 10G-PON FTTH technology
NASA Astrophysics Data System (ADS)
Kolotouros, D. M.; Baron, S.; Soos, C.; Vasey, F.
2015-04-01
A new generation FPGA-based Timing-Trigger and Control (TTC) system based on emerging Passive Optical Network (PON) technology is being proposed to replace the existing off-detector TTC system used by the LHC experiments. High split ratio, dynamic software partitioning, low and deterministic latency, as well as low jitter are required. Exploiting the latest available technologies allows delivering higher capacity together with bidirectionality, a feature absent from the legacy TTC system. This article focuses on the features and capabilities of the latest TTC-PON prototype based on 10G-PON FTTH components along with some metrics characterizing its performance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chapline, G.
1998-03-01
The engineering problems of constructing autonomous networks of sensors and data processors that can provide alerts for dangerous situations provide a new context for debating the question whether man-made systems can emulate the cognitive capabilities of the mammalian brain. In this paper we consider the question whether a distributed network of sensors and data processors can form "perceptions" based on sensory data. Because sensory data can have exponentially many explanations, the use of a central data processor to analyze the outputs from a large ensemble of sensors will in general introduce unacceptable latencies for responding to dangerous situations. A better idea is to use a distributed "Helmholtz machine" architecture in which the sensors are connected to a network of simple processors, and the collective state of the network as a whole provides an explanation for the sensory data. In general communication within such a network will require time division multiplexing, which opens the door to the possibility that with certain refinements to the Helmholtz machine architecture it may be possible to build sensor networks that exhibit a form of artificial consciousness.
The Fine Tuning of Pain Thresholds: A Sophisticated Double Alarm System
Plaghki, Léon; Decruynaere, Céline; Van Dooren, Paul; Le Bars, Daniel
2010-01-01
Two distinctive features characterize the way in which sensations including pain, are evoked by heat: (1) a thermal stimulus is always progressive; (2) a painful stimulus activates two different types of nociceptors, connected to peripheral afferent fibers with medium and slow conduction velocities, namely Aδ- and C-fibers. In the light of a recent study in the rat, our objective was to develop an experimental paradigm in humans, based on the joint analysis of the stimulus and the response of the subject, to measure the thermal thresholds and latencies of pain elicited by Aδ- and C-fibers. For comparison, the same approach was applied to the sensation of warmth elicited by thermoreceptors. A CO2 laser beam raised the temperature of the skin filmed by an infrared camera. The subject stopped the beam when he/she perceived pain. The thermal images were analyzed to provide four variables: true thresholds and latencies of pain triggered by heat via Aδ- and C-fibers. The psychophysical threshold of pain triggered by Aδ-fibers was always higher (2.5–3°C) than that triggered by C-fibers. The initial skin temperature did not influence these thresholds. The mean conduction velocities of the corresponding fibers were 13 and 0.8 m/s, respectively. The triggering of pain either by C- or by Aδ-fibers was piloted by several factors including the low/high rate of stimulation, the low/high base temperature of the skin, the short/long peripheral nerve path and some pharmacological manipulations (e.g. Capsaicin). Warming a large skin area increased the pain thresholds. Considering the warmth detection gave a different picture: the threshold was strongly influenced by the initial skin temperature and the subjects detected an average variation of 2.7°C, whatever the initial temperature. This is the first time that thresholds and latencies for pain elicited by both Aδ- and C-fibers from a given body region have been measured in the same experimental run. 
Such an approach illustrates the role of nociception as a “double level” and “double release” alarm system based on level detectors. By contrast, warmth detection was found to be based on difference detectors. It is hypothesized that pain results from a CNS build-up process resulting from population coding and strongly influenced by the background temperatures surrounding at large the stimulation site. We propose an alternative solution to the conventional methods that only measure a single “threshold of pain”, without knowing which of the two systems is involved. PMID:20428245
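As a rough worked illustration of why the two fiber classes behave like two separate alarms, the conduction time over a nerve path can be estimated from the mean velocities quoted above. The 1 m path length is an assumption chosen for round numbers, not a value from the study, and the estimate ignores receptor activation and central processing delays.

```latex
% Assumed path length d = 1 m (illustrative only)
\begin{align*}
t_{A\delta} &= \frac{d}{v_{A\delta}} = \frac{1\ \text{m}}{13\ \text{m/s}} \approx 0.077\ \text{s} \\
t_{C}       &= \frac{d}{v_{C}}       = \frac{1\ \text{m}}{0.8\ \text{m/s}} = 1.25\ \text{s} \\
\Delta t    &= t_{C} - t_{A\delta} \approx 1.17\ \text{s}
\end{align*}
```

Under this assumption the C-fiber signal would arrive more than a second after the Aδ-fiber signal, so the peripheral path length alone can separate the two responses in time.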
Pani, Danilo; Barabino, Gianluca; Citi, Luca; Meloni, Paolo; Raspopovic, Stanisa; Micera, Silvestro; Raffo, Luigi
2016-09-01
The control of upper limb neuroprostheses through the peripheral nervous system (PNS) can allow restoring motor functions in amputees. At present, the important aspect of the real-time implementation of neural decoding algorithms on embedded systems has often been overlooked, notwithstanding the impact that limited hardware resources have on the efficiency and effectiveness of any given algorithm. The present study addresses the optimization of a template-matching-based algorithm for PNS signal decoding, a milestone toward its full real-time implementation on a floating-point digital signal processor (DSP). The proposed optimized real-time algorithm achieves up to 96% correct classification on real PNS signals acquired through LIFE electrodes on animals, and can correctly sort spikes of a synthetic cortical dataset with sufficiently uncorrelated spike morphologies (93% average correct classification), comparable to the results obtained with a top spike sorter (94% on average on the same dataset). The power consumption enables more than 24 h of processing at the maximum load, and a latency model has been derived to enable a fair performance assessment. The final embodiment demonstrates real-time performance on a low-power off-the-shelf DSP, opening the way to experiments exploiting the efferent signals to control a motor neuroprosthesis.
Signal processor for processing ultrasonic receiver signals
Fasching, George E.
1980-01-01
A signal processor is provided which uses an analog integrating circuit in conjunction with a set of digital counters controlled by a precision clock for sampling timing to provide an improved presentation of an ultrasonic transmitter/receiver signal. The signal is sampled relative to the transmitter trigger signal timing at precise times, the selected number of samples are integrated and the integrated samples are transferred and held for recording on a strip chart recorder or converted to digital form for storage. By integrating multiple samples taken at precisely the same time with respect to the trigger for the ultrasonic transmitter, random noise, which is contained in the ultrasonic receiver signal, is reduced relative to the desired useful signal.
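The noise-reduction principle described above can be sketched numerically. This is a minimal simulation, not the patented circuit: it replaces the analog integrator with a plain arithmetic mean, and the signal level, noise level, sample counts, and function names are all assumptions chosen for illustration. Averaging N readings taken at the same delay after the trigger leaves the signal unchanged while uncorrelated noise shrinks roughly as 1/sqrt(N).

```python
import random
import statistics


def averaged_reading(signal, noise_sigma, n_samples, rng):
    """Average n_samples noisy readings taken at the same delay after the trigger."""
    readings = [signal + rng.gauss(0.0, noise_sigma) for _ in range(n_samples)]
    return sum(readings) / n_samples


def residual_noise(n_samples, trials=2000, seed=1):
    """Standard deviation of the averaged reading over many repeated trials."""
    rng = random.Random(seed)
    values = [averaged_reading(1.0, 0.5, n_samples, rng) for _ in range(trials)]
    return statistics.pstdev(values)


noise_single = residual_noise(1)     # no averaging
noise_averaged = residual_noise(64)  # 64 samples combined per output point
print(f"1 sample: {noise_single:.3f}, 64 samples: {noise_averaged:.3f}")
```

With 64 samples per output point the residual noise should be close to one eighth of the single-sample value, which is why the approach depends on the sampling instant being precisely repeatable relative to the trigger.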
A Parallel Pipelined Renderer for the Time-Varying Volume Data
NASA Technical Reports Server (NTRS)
Chiueh, Tzi-Cker; Ma, Kwan-Liu
1997-01-01
This paper presents a strategy for efficiently rendering time-varying volume data sets on a distributed-memory parallel computer. Time-varying volume data take large storage space and visualizing them requires reading large files continuously or periodically throughout the course of the visualization process. Instead of using all the processors to collectively render one volume at a time, a pipelined rendering process is formed by partitioning processors into groups to render multiple volumes concurrently. In this way, the overall rendering time may be greatly reduced because the pipelined rendering tasks are overlapped with the I/O required to load each volume into a group of processors; moreover, parallelization overhead may be reduced as a result of partitioning the processors. We modify an existing parallel volume renderer to exploit various levels of rendering parallelism and to study how the partitioning of processors may lead to optimal rendering performance. Two factors which are important to the overall execution time are resource utilization efficiency and pipeline startup latency. The optimal partitioning configuration is the one that balances these two factors. Tests on Intel Paragon computers show that in general optimal partitionings do exist for a given rendering task and result in 40-50% saving in overall rendering time.
Trigger and Readout System for the Ashra-1 Detector
NASA Astrophysics Data System (ADS)
Aita, Y.; Aoki, T.; Asaoka, Y.; Morimoto, Y.; Motz, H. M.; Sasaki, M.; Abiko, C.; Kanokohata, C.; Ogawa, S.; Shibuya, H.; Takada, T.; Kimura, T.; Learned, J. G.; Matsuno, S.; Kuze, S.; Binder, P. M.; Goldman, J.; Sugiyama, N.; Watanabe, Y.
A highly sophisticated trigger and readout system has been developed for the All-sky Survey High Resolution Air-shower (Ashra) detector. The Ashra-1 detector has a 42-degree-diameter field of view. Detecting Cherenkov and fluorescence light against the large background in such a wide field of view requires a finely segmented, high-speed trigger and readout system. The system is composed of an optical fiber image transmission system, a 64 × 64 channel trigger sensor, and an FPGA-based trigger logic processor. The system typically processes the image within 10 to 30 ns and opens the shutter on the fine CMOS sensor. A 64 × 64 coarse split image is transferred via a 64 × 64 precisely aligned optical fiber bundle to a photon sensor. Current signals from the photon sensor are discriminated by custom-made trigger amplifiers. The FPGA-based processor processes the 64 × 64 hit pattern, and the corresponding partial area of the fine image is acquired. A commissioning observational search for earth-skimming tau neutrinos was carried out with this trigger system. In addition to the geometrical advantage of the Ashra observational site, the excellent tau shower axis measurement based on fine imaging and the night-sky background rejection based on fine and fast imaging allow a zero-background tau shower search. Adopting the optical fiber bundle and a trigger LSI made a 4k-channel trigger system possible at low cost. The detectability of tau showers is also confirmed by simultaneously observed Cherenkov air showers. Reducing the trigger threshold appears to enhance the effective area, especially in the PeV tau neutrino energy region. A new two-dimensional trigger LSI was introduced and the trigger threshold was lowered. A new calibration system for the trigger system was recently developed and introduced to the Ashra detector.
Merlin - Massively parallel heterogeneous computing
NASA Technical Reports Server (NTRS)
Wittie, Larry; Maples, Creve
1989-01-01
Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.
NASA Astrophysics Data System (ADS)
Männer, R.
1989-12-01
This paper describes a systolic array processor for a ring image Cherenkov counter which is capable of identifying pairs of electron circles with a known radius and a certain minimum distance within 15 μs. The processor is a very flexible and fast device. It consists of 128 × 128 processing elements (PEs), where one PE is assigned to each pixel of the image. All PEs run synchronously at 40 MHz. The identification of electron circles is done by correlating the detector image with the proper circle circumference. Circle centers are found by peak detection in the correlation result. A second correlation with a circle disc allows circles of closed electron pairs to be rejected. The trigger decision is generated if a pseudo adder detects at least two remaining circles. The device is controlled by a freely programmable sequencer. A VLSI chip containing 8 × 8 PEs is being developed using a VENUS design system and will be produced in 2 μm CMOS technology.
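The first two steps described above, correlation with a circle circumference followed by peak detection, can be sketched in software. This is a sequential toy version under stated assumptions: a 32 × 32 binary image instead of 128 × 128, a radius of 5 pixels, fully efficient noise-free rings, and hypothetical function names. It does not model the systolic hardware, the second correlation with a disc, or the pseudo adder.

```python
import math


def ring_offsets(radius):
    """Pixel offsets that lie on a circle circumference of the given radius."""
    span = range(-radius - 1, radius + 2)
    return [(dx, dy) for dx in span for dy in span
            if round(math.hypot(dx, dy)) == radius]


def correlate(image, offsets):
    """For every candidate center, count the hit pixels under the ring template."""
    size = len(image)
    scores = [[0] * size for _ in range(size)]
    for cy in range(size):
        for cx in range(size):
            for dx, dy in offsets:
                x, y = cx + dx, cy + dy
                if 0 <= x < size and 0 <= y < size:
                    scores[cy][cx] += image[y][x]
    return scores


def find_centers(scores, threshold):
    """Peak detection reduced to a threshold cut on the correlation result."""
    return [(x, y) for y, row in enumerate(scores)
            for x, score in enumerate(row) if score >= threshold]


# Build a test image containing two rings (assumed centers, illustrative only).
SIZE, RADIUS = 32, 5
offsets = ring_offsets(RADIUS)
image = [[0] * SIZE for _ in range(SIZE)]
for cx, cy in [(8, 8), (23, 22)]:
    for dx, dy in offsets:
        image[cy + dy][cx + dx] = 1

# Require a complete ring; a real detector would need a lower threshold.
centers = find_centers(correlate(image, offsets), threshold=len(offsets))
trigger = len(centers) >= 2
print(centers, trigger)
```

In the hardware described in the abstract each pixel has its own processing element, so the correlation is evaluated for all centers in parallel rather than in nested loops as here.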
TTEthernet for Integrated Spacecraft Networks
NASA Technical Reports Server (NTRS)
Loveless, Andrew
2015-01-01
Aerospace projects have traditionally employed federated avionics architectures, in which each computer system is designed to perform one specific function (e.g. navigation). There are obvious downsides to this approach, including excessive weight (from so much computing hardware), and inefficient processor utilization (since modern processors are capable of performing multiple tasks). There has therefore been a push for integrated modular avionics (IMA), in which common computing platforms can be leveraged for different purposes. This consolidation of multiple vehicle functions to shared computing platforms can significantly reduce spacecraft cost, weight, and design complexity. However, the application of IMA principles introduces significant challenges, as the data network must accommodate traffic of mixed criticality and performance levels - potentially all related to the same shared computer hardware. Because individual network technologies are rarely so competent, the development of truly integrated network architectures often proves unreasonable. Several different types of networks are utilized - each suited to support a specific vehicle function. Critical functions are typically driven by precise timing loops, requiring networks with strict guarantees regarding message latency (i.e. determinism) and fault-tolerance. Alternatively, non-critical systems generally employ data networks prioritizing flexibility and high performance over reliable operation. Switched Ethernet has seen widespread success filling this role in terrestrial applications. Its high speed, flexibility, and the availability of inexpensive commercial off-the-shelf (COTS) components make it desirable for inclusion in spacecraft platforms. Basic Ethernet configurations have been incorporated into several preexisting aerospace projects, including both the Space Shuttle and International Space Station (ISS). 
However, classical switched Ethernet cannot provide the high level of network determinism required by real-time spacecraft applications. Even with modern advancements, the uncoordinated (i.e. event-driven) nature of Ethernet communication unavoidably leads to message contention within network switches. The arbitration process used to resolve such conflicts introduces variation in the time it takes for messages to be forwarded. TTEthernet introduces decentralized clock synchronization to switched Ethernet, enabling message transmission according to a time-triggered (TT) paradigm. A network planning tool is used to allocate each device a finite amount of time in which it may transmit a frame. Each time slot is repeated sequentially to form a periodic communication schedule that is then loaded onto each TTEthernet device (e.g. switches and end systems). Each network participant references the synchronized time in order to dispatch messages at predetermined instances. This schedule guarantees that no contention exists between time-triggered Ethernet frames in the network switches, therefore eliminating the need for arbitration (and the timing variation it causes). Besides time-triggered messaging, TTEthernet networks may provide two additional traffic classes to support communication of different criticality levels. In the rate-constrained (RC) traffic class, the frame payload size and rate of transmission along each communication channel are limited to predetermined maximums. The network switches can therefore be configured to accommodate the known worst-case traffic pattern, and buffer overflows can be eliminated. The best-effort (BE) traffic class behaves akin to classical Ethernet. No guarantees are provided regarding transmission latency or successful message delivery. 
TTEthernet coordinates transmission of all three traffic classes over the same physical connections, therefore accommodating the full spectrum of traffic criticality levels required in IMA architectures. Common computing platforms (e.g. LRUs) can share networking resources in such a way that failures in non-critical systems (using BE or RC communication modes) cannot impact flight-critical functions (using TT communication). Furthermore, TTEthernet hardware (e.g. switches, cabling) can be shared by both TTEthernet and classical Ethernet traffic.
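The time-triggered scheduling idea described above (each device gets a fixed transmit window, and the windows repeat as a periodic cycle) can be illustrated with a toy planner. This is a heavily simplified sketch for a single shared link: the device names, frame durations, guard time, and all function names are hypothetical, and it is not the planning tool or any real TTEthernet API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    device: str
    offset_us: int   # start of the transmit window within the cycle
    length_us: int   # duration of the transmit window


def build_schedule(frame_lengths_us, guard_us=0):
    """Assign back-to-back transmit windows; return (slots, cycle length)."""
    slots, cursor = [], 0
    for device, length in frame_lengths_us.items():
        slots.append(Slot(device, cursor, length))
        cursor += length + guard_us
    return slots, cursor


def dispatch_times(slot, cycle_us, cycles):
    """Instants (on the synchronized clock) at which a device may transmit."""
    return [slot.offset_us + k * cycle_us for k in range(cycles)]


def has_contention(slots, cycle_us):
    """True if any two windows overlap or a window runs past the cycle end."""
    windows = sorted((s.offset_us, s.offset_us + s.length_us) for s in slots)
    for (_, end), (start, _) in zip(windows, windows[1:]):
        if start < end:
            return True
    return bool(windows) and windows[-1][1] > cycle_us


# Hypothetical devices and frame durations in microseconds.
frames = {"flight_computer": 120, "imu": 40, "actuator": 60}
slots, cycle_us = build_schedule(frames, guard_us=10)
imu_times = dispatch_times(slots[1], cycle_us, cycles=3)
print(cycle_us, imu_times, has_contention(slots, cycle_us))
```

Because every window is fixed in advance against a shared clock, the absence of contention can be checked offline, which is the property that removes the need for arbitration in the switches. Rate-constrained and best-effort traffic are not modeled here.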
A hardware fast tracker for the ATLAS trigger
NASA Astrophysics Data System (ADS)
Asbah, Nedaa
2016-09-01
The trigger system of the ATLAS experiment is designed to reduce the event rate from the LHC nominal bunch crossing at 40 MHz to about 1 kHz, at the design luminosity of 10^34 cm^-2 s^-1. After a successful period of data taking from 2010 to early 2013, the LHC already started with much higher instantaneous luminosity. This will increase the load on the High Level Trigger system, the second stage of the selection based on software algorithms. More sophisticated algorithms will be needed to achieve higher background rejection while maintaining good efficiency for interesting physics signals. The Fast TracKer (FTK) is part of the ATLAS trigger upgrade project. It is a hardware processor that will provide, at every Level-1 accepted event (100 kHz) and within 100 microseconds, full tracking information for tracks with momentum as low as 1 GeV. Providing fast, extensive access to tracking information, with resolution comparable to the offline reconstruction, FTK will help in precise detection of the primary and secondary vertices to ensure robust selections and improve the trigger performance. FTK exploits hardware technologies with massive parallelism, combining Associative Memory ASICs, FPGAs and high-speed communication links.
Messamore, William G.; Van Acker, Gustaf M.; Hudson, Heather M.; Zhang, Hongyu Y.; Kovac, Anthony; Nazzaro, Jules; Cheney, Paul D.
2016-01-01
While a large body of evidence supports the view that ipsilateral motor cortex may make an important contribution to normal movements and to recovery of function following cortical injury (Chollet et al. 1991; Fisher 1992; Caramia et al. 2000; Feydy et al. 2002), relatively little is known about the properties of output from motor cortex to ipsilateral muscles. Our aim in this study was to characterize the organization of output effects on hindlimb muscles from ipsilateral motor cortex using stimulus-triggered averaging of EMG activity. Stimulus-triggered averages of EMG activity were computed from microstimuli applied at 60–120 μA to sites in both contralateral and ipsilateral M1 of macaque monkeys during the performance of a hindlimb push–pull task. Although the poststimulus effects (PStEs) from ipsilateral M1 were fewer in number and substantially weaker, clear and consistent effects were obtained at an intensity of 120 μA. The mean onset latency of ipsilateral poststimulus facilitation was longer than contralateral effects by an average of 0.7 ms. However, the shortest latency effects in ipsilateral muscles were as short as the shortest latency effects in the corresponding contralateral muscles suggesting a minimal synaptic linkage that is equally direct in both cases. PMID:26088970
HTMT-class Latency Tolerant Parallel Architecture for Petaflops Scale Computation
NASA Technical Reports Server (NTRS)
Sterling, Thomas; Bergman, Larry
2000-01-01
Computational Aero Sciences and other numeric intensive computation disciplines demand computing throughputs substantially greater than the Teraflops scale systems only now becoming available. The related fields of fluids, structures, thermal, combustion, and dynamic controls are among the interdisciplinary areas that in combination with sufficient resolution and advanced adaptive techniques may force performance requirements towards Petaflops. This will be especially true for compute intensive models such as Navier-Stokes, or when such system models are only part of a larger design optimization computation involving many design points. Yet recent experience with conventional MPP configurations comprising commodity processing and memory components has shown that larger scale frequently results in higher programming difficulty and lower system efficiency. While important advances in system software and algorithm techniques have had some impact on efficiency and programmability for certain classes of problems, in general it is unlikely that software alone will resolve the challenges to higher scalability. As in the past, future generations of high-end computers may require a combination of hardware architecture and system software advances to enable efficient operation at a Petaflops level. The NASA-led HTMT project has engaged the talents of a broad interdisciplinary team to develop a new strategy in high-end system architecture to deliver petaflops scale computing in the 2004/5 timeframe. The Hybrid-Technology, MultiThreaded parallel computer architecture incorporates several advanced technologies in combination with an innovative dynamic adaptive scheduling mechanism to provide unprecedented performance and efficiency within practical constraints of cost, complexity, and power consumption. 
The emerging superconductor Rapid Single Flux Quantum electronics can operate at 100 GHz (the record is 770 GHz) and one percent of the power required by conventional semiconductor logic. Wave Division Multiplexing optical communications can approach a peak per fiber bandwidth of 1 Tbps and the new Data Vortex network topology employing this technology can connect tens of thousands of ports providing a bi-section bandwidth on the order of a Petabyte per second with latencies well below 100 nanoseconds, even under heavy loads. Processor-in-Memory (PIM) technology combines logic and memory on the same chip exposing the internal bandwidth of the memory row buffers at low latency. And holographic photorefractive storage technologies provide high-density memory with access a thousand times faster than conventional disk technologies. Together these technologies enable a new class of shared memory system architecture with a peak performance in the range of a Petaflops but size and power requirements comparable to today's largest Teraflops scale systems. To achieve high sustained performance, HTMT combines an advanced multithreading processor architecture with a memory-driven coarse-grained latency management strategy called "percolation", yielding high efficiency while reducing much of the parallel programming burden. This paper will present the basic system architecture characteristics made possible through this series of advanced technologies and then give a detailed description of the new percolation approach to runtime latency management.
Labanca, Ludimila; Dornas de Oliveira, Leonardo; Vaz de Melo Trindade, Guilherme; de Almeida Pereira, Thiago; Diniz Cunha, Pedro Henrique; Santos Falci Mourão, Marina; Lambertucci, José Roberto
2016-01-01
Background: Schistosomal myeloradiculopathy (SMR), the most severe and disabling ectopic form of Schistosoma mansoni infection, is caused by embolized ova eliciting local inflammation in the spinal cord and nerve roots. The treatment involves the use of praziquantel and long-term corticotherapy. The assessment of therapeutic response relies on neurological examination. Supplementary electrophysiological exams may improve prediction and monitoring of functional outcome. Vestibular evoked myogenic potential (VEMP) triggered by galvanic vestibular stimulation (GVS) is a simple, safe, low-cost and noninvasive electrophysiological technique that has been used to test the vestibulospinal tract in motor myelopathies. This paper reports the results of VEMP with GVS in patients with SMR. Methods: A cross-sectional comparative study enrolled 22 patients with definite SMR and 22 healthy controls that were submitted to clinical, neurological examination and GVS. Galvanic stimulus was applied in the mastoid bones in a transcranial configuration for testing VEMP, which was recorded by electromyography (EMG) in the gastrocnemii muscles. The VEMP variables of interest were blindly measured by two independent examiners. They were the short-latency (SL) and the medium-latency (ML) components of the biphasic EMG wave. Results: VEMP showed the components SL (p = 0.001) and ML (p<0.001) delayed in SMR compared to controls. The delay of SL (p = 0.010) and of ML (p = 0.020) was associated with gait dysfunction. Conclusion: VEMP triggered by GVS identified alterations in patients with SMR and provided additional functional information that justifies its use as a supplementary test in motor myelopathies. PMID:27128806
Caporali, Júlia Fonseca de Morais; Utsch Gonçalves, Denise; Labanca, Ludimila; Dornas de Oliveira, Leonardo; Vaz de Melo Trindade, Guilherme; de Almeida Pereira, Thiago; Diniz Cunha, Pedro Henrique; Santos Falci Mourão, Marina; Lambertucci, José Roberto
2016-04-01
Schistosomal myeloradiculopathy (SMR), the most severe and disabling ectopic form of Schistosoma mansoni infection, is caused by embolized ova eliciting local inflammation in the spinal cord and nerve roots. The treatment involves the use of praziquantel and long-term corticotherapy. The assessment of therapeutic response relies on neurological examination. Supplementary electrophysiological exams may improve prediction and monitoring of functional outcome. Vestibular evoked myogenic potential (VEMP) triggered by galvanic vestibular stimulation (GVS) is a simple, safe, low-cost and noninvasive electrophysiological technique that has been used to test the vestibulospinal tract in motor myelopathies. This paper reports the results of VEMP with GVS in patients with SMR. A cross-sectional comparative study enrolled 22 patients with definite SMR and 22 healthy controls that were submitted to clinical, neurological examination and GVS. Galvanic stimulus was applied in the mastoid bones in a transcranial configuration for testing VEMP, which was recorded by electromyography (EMG) in the gastrocnemii muscles. The VEMP variables of interest were blindly measured by two independent examiners. They were the short-latency (SL) and the medium-latency (ML) components of the biphasic EMG wave. VEMP showed the components SL (p = 0.001) and ML (p<0.001) delayed in SMR compared to controls. The delay of SL (p = 0.010) and of ML (p = 0.020) was associated with gait dysfunction. VEMP triggered by GVS identified alterations in patients with SMR and provided additional functional information that justifies its use as a supplementary test in motor myelopathies.
NASA Technical Reports Server (NTRS)
2008-01-01
As Global Positioning Satellite (GPS) applications become more prevalent for land- and air-based vehicles, GPS applications for space vehicles will also increase. The Applied Technology Directorate of Kennedy Space Center (KSC) has developed a lightweight, low-cost GPS Metric Tracking Unit (GMTU), the first of two steps in developing a lightweight, low-cost Space-Based Tracking and Command Subsystem (STACS) designed to meet Range Safety's link margin and latency requirements for vehicle command and telemetry data. The goals of STACS are to improve Range Safety operations and expand tracking capabilities for space vehicles. STACS will track the vehicle, receive commands, and send telemetry data through the space-based asset, which will dramatically reduce dependence on ground-based assets. The other step was the Low-Cost Tracking and Data Relay Satellite System (TDRSS) Transceiver (LCT2), developed by the Wallops Flight Facility (WFF), which allows the vehicle to communicate with a geosynchronous relay satellite. Although the GMTU and LCT2 were independently implemented and tested, the design collaboration of KSC and WFF engineers allowed GMTU and LCT2 to be integrated into one enclosure, leading to the final STACS. In operation, GMTU needs only a radio frequency (RF) input from a GPS antenna and outputs position and velocity data to the vehicle through a serial or pulse code modulation (PCM) interface. GMTU includes one commercial GPS receiver board and a custom board, the Command and Telemetry Processor (CTP) developed by KSC. The CTP design is based on a field-programmable gate array (FPGA) with embedded processors to support GPS functions.
Parallel computing on Unix workstation arrays
NASA Astrophysics Data System (ADS)
Reale, F.; Bocchino, F.; Sciortino, S.
1994-12-01
We have tested arrays of general-purpose Unix workstations used as MIMD systems for massive parallel computations. In particular we have solved numerically a demanding test problem with a 2D hydrodynamic code, generally developed to study astrophysical flows, by executing it on arrays either of DECstations 5000/200 on Ethernet LAN, or of DECstations 3000/400, equipped with powerful Alpha processors, on FDDI LAN. The code is appropriate for data-domain decomposition, and we have used a library for parallelization previously developed in our Institute, and easily extended to work on Unix workstation arrays by using the PVM software toolset. We have compared the parallel efficiencies obtained on arrays of several processors to those obtained on a dedicated MIMD parallel system, namely a Meiko Computing Surface (CS-1), equipped with Intel i860 processors. We discuss the feasibility of using non-dedicated parallel systems and conclude that the convenience depends essentially on the size of the computational domain as compared to the relative processor power and network bandwidth. We point out that for future perspectives a parallel development of processor and network technology is important, and that the software still offers great opportunities of improvement, especially in terms of latency times in the message-passing protocols. In conditions of significant gain in terms of speedup, such workstation arrays represent a cost-effective approach to massive parallel computations.
NASA Astrophysics Data System (ADS)
Leggett, C.; Binet, S.; Jackson, K.; Levinthal, D.; Tatarkhanov, M.; Yao, Y.
2011-12-01
Thermal limitations have forced CPU manufacturers to shift from simply increasing clock speeds to improve processor performance, to producing chip designs with multi- and many-core architectures. Further, the cores themselves can run multiple threads as a zero overhead context switch allowing low level resource sharing (Intel Hyperthreading). To maximize bandwidth and minimize memory latency, memory access has become non uniform (NUMA). As manufacturers add more cores to each chip, a careful understanding of the underlying architecture is required in order to fully utilize the available resources. We present AthenaMP and the Atlas event loop manager, the driver of the simulation and reconstruction engines, which have been rewritten to make use of multiple cores, by means of event based parallelism, and final stage I/O synchronization. However, initial studies on 8 and 16 core Intel architectures have shown marked non-linearities as parallel process counts increase, with as much as 30% reductions in event throughput in some scenarios. Since the Intel Nehalem architecture (both Gainestown and Westmere) will be the most common choice for the next round of hardware procurements, an understanding of these scaling issues is essential. Using hardware based event counters and Intel's Performance Tuning Utility, we have studied the performance bottlenecks at the hardware level, and discovered optimization schemes to maximize processor throughput. We have also produced optimization mechanisms, common to all large experiments, that address the extreme nature of today's HEP code, which due to its size, places huge burdens on the memory infrastructure of today's processors.
Modeling a Million-Node Slim Fly Network Using Parallel Discrete-Event Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolfe, Noah; Carothers, Christopher; Mubarak, Misbah
As supercomputers close in on exascale performance, the increased number of processors and processing power translates to an increased demand on the underlying network interconnect. The Slim Fly network topology, a new low-diameter and low-latency interconnection network, is gaining interest as one possible solution for next-generation supercomputing interconnect systems. In this paper, we present a high-fidelity Slim Fly flit-level model leveraging the Rensselaer Optimistic Simulation System (ROSS) and Co-Design of Exascale Storage (CODES) frameworks. We validate our Slim Fly model with the Kathareios et al. Slim Fly model results provided at moderately sized network scales. We further scale the model size up to an unprecedented 1 million compute nodes; and through visualization of network simulation metrics such as link bandwidth, packet latency, and port occupancy, we get an insight into the network behavior at the million-node scale. We also show linear strong scaling of the Slim Fly model on an Intel cluster achieving a peak event rate of 36 million events per second using 128 MPI tasks to process 7 billion events. Detailed analysis of the underlying discrete-event simulation performance shows that a million-node Slim Fly model simulation can execute in 198 seconds on the Intel cluster.
A simple modern correctness condition for a space-based high-performance multiprocessor
NASA Technical Reports Server (NTRS)
Probst, David K.; Li, Hon F.
1992-01-01
A number of U.S. national programs, including space-based detection of ballistic missile launches, envisage putting significant computing power into space. Given sufficient progress in low-power VLSI, multichip-module packaging and liquid-cooling technologies, we will see design of high-performance multiprocessors for individual satellites. In very high speed implementations, performance depends critically on tolerating large latencies in interprocessor communication; without latency tolerance, performance is limited by the vastly differing time scales in processor and data-memory modules, including interconnect times. The modern approach to tolerating remote-communication cost in scalable, shared-memory multiprocessors is to use a multithreaded architecture, and alter the semantics of shared memory slightly, at the price of forcing the programmer either to reason about program correctness in a relaxed consistency model or to agree to program in a constrained style. The literature on multiprocessor correctness conditions has become increasingly complex, and sometimes confusing, which may hinder its practical application. We propose a simple modern correctness condition for a high-performance, shared-memory multiprocessor; the correctness condition is based on a simple interface between the multiprocessor architecture and the parallel programming system.
Parallel processing architecture for H.264 deblocking filter on multi-core platforms
NASA Astrophysics Data System (ADS)
Prasad, Durga P.; Sonachalam, Sekar; Kunchamwar, Mangesh K.; Gunupudi, Nageswara Rao
2012-03-01
Massively parallel computing (multi-core) chips offer outstanding new solutions that satisfy the increasing demand for high resolution and high quality video compression technologies such as H.264. Such solutions not only provide exceptional quality but also efficiency, low power, and low latency, previously unattainable in software based designs. While custom hardware and Application Specific Integrated Circuit (ASIC) technologies may achieve low latency, low power, and real-time performance in some consumer devices, many applications require a flexible and scalable software-defined solution. The deblocking filter in H.264 encoder/decoder poses difficult implementation challenges because of heavy data dependencies and the conditional nature of the computations. Deblocking filter implementations tend to be fixed and difficult to reconfigure for different needs. The ability to scale up for higher quality requirements such as 10-bit pixel depth or a 4:2:2 chroma format often reduces the throughput of a parallel architecture designed for lower feature set. A scalable architecture for deblocking filtering, created with a massively parallel processor based solution, means that the same encoder or decoder will be deployed in a variety of applications, at different video resolutions, for different power requirements, and at higher bit-depths and better color sub sampling patterns like YUV, 4:2:2, or 4:4:4 formats. Low power, software-defined encoders/decoders may be implemented using a massively parallel processor array, like that found in HyperX technology, with 100 or more cores and distributed memory. The large number of processor elements allows the silicon device to operate more efficiently than conventional DSP or CPU technology. This software programming model for massively parallel processors offers a flexible implementation and a power efficiency close to that of ASIC solutions.
This work describes a scalable parallel architecture for an H.264 compliant deblocking filter for multi core platforms such as HyperX technology. Parallel techniques such as parallel processing of independent macroblocks, sub blocks, and pixel row level are examined in this work. The deblocking architecture consists of a basic cell called deblocking filter unit (DFU) and dependent data buffer manager (DFM). The DFU can be used in several instances, catering to different performance needs; the DFM serves the data required for the different number of DFUs, and also manages all the neighboring data required for future data processing of DFUs. This approach achieves the scalability, flexibility, and performance excellence required in deblocking filters.
NASA Technical Reports Server (NTRS)
1991-01-01
Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)
Event and Pulse Node Hardware Design for Nuclear Fusion Experiments
NASA Astrophysics Data System (ADS)
Fortunato, J. C.; Batista, A.; Sousa, J.; Fernandes, H.; Varandas, C. A. F.
2008-04-01
This article presents an event and pulse node hardware module (EPN) developed for use in control and data acquisition (CODAC) in current and upcoming long-discharge nuclear fusion experiments. Its purpose is to allow real time event management and trigger distribution. The use of a mixture of digital signal processing and field programmable gate arrays, with fiber optic channels for event broadcast between CODAC nodes, and short length paths between the EPN and CODAC hardware, allows an effective and low latency communication path. This hardware will be integrated in the ISTTOK CODAC to allow long AC plasma discharges.
Large-N in Volcano Settings: Volcanosri
NASA Astrophysics Data System (ADS)
Lees, J. M.; Song, W.; Xing, G.; Vick, S.; Phillips, D.
2014-12-01
We seek a paradigm shift in the approach we take on volcano monitoring where the compromise from high fidelity to large numbers of sensors is used to increase coverage and resolution. Accessibility, danger and the risk of equipment loss require that we develop systems that are independent and inexpensive. Furthermore, rather than simply record data on hard disk for later analysis we desire a system that will work autonomously, capitalizing on wireless technology and in field network analysis. To this end we are currently producing a low cost seismic array which will incorporate, at the very basic level, seismological tools for first cut analysis of a volcano in crisis mode. At the advanced end we expect to perform tomographic inversions in the network in near real time. Geophone (4 Hz) sensors connected to a low cost recording system will be installed on an active volcano where triggering earthquake location and velocity analysis will take place independent of human interaction. Stations are designed to be inexpensive and possibly disposable. In one of the first implementations the seismic nodes consist of an Arduino Due processor board with an attached Seismic Shield. The Arduino Due processor board contains an Atmel SAM3X8E ARM Cortex-M3 CPU. This 32 bit 84 MHz processor can filter and perform coarse seismic event detection on a 1600 sample signal in fewer than 200 milliseconds. The Seismic Shield contains a GPS module, 900 MHz high power mesh network radio, SD card, seismic amplifier, and 24 bit ADC. External sensors can be attached to either this 24-bit ADC or to the internal multichannel 12 bit ADC contained on the Arduino Due processor board. This allows the node to support attachment of multiple sensors. By utilizing a high-speed 32 bit processor complex signal processing tasks can be performed simultaneously on multiple sensors.
Using a 10 W solar panel, a second system being developed can run autonomously and collect data on 3 channels at 100 Hz for 6 months with the installed 16Gb SD card. Initial designs and test results will be presented and discussed.
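The storage claim in the abstract above (3 channels at 100 Hz for 6 months on the installed card) can be checked with a rough budget. The sketch below is illustrative only and rests on assumptions not stated in the abstract: samples are stored as raw 24-bit (3-byte) values with no file-system or timestamp overhead, "6 months" is taken as 182.5 days, and "16Gb" is read as 16 gigabytes. The function name `storage_bytes` is hypothetical.

```python
# Back-of-envelope storage budget for one seismic node.
# Assumption (not stated in the abstract): raw 3-byte samples, no overhead.

def storage_bytes(channels: int, rate_hz: float, bytes_per_sample: int, days: float) -> float:
    """Return the raw data volume in bytes for a continuous recording."""
    seconds = days * 86_400
    return channels * rate_hz * bytes_per_sample * seconds

# 3 channels at 100 Hz for roughly 6 months (taken here as 182.5 days).
total = storage_bytes(channels=3, rate_hz=100, bytes_per_sample=3, days=182.5)
total_gb = total / 1e9
print(f"{total_gb:.1f} GB")  # about 14.2 GB under these assumptions
```

Under these assumptions the raw volume stays below 16 GB, which is consistent with the stated six-month autonomy; any per-record overhead would reduce the margin.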
GPU real-time processing in NA62 trigger system
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Chiozzi, S.; Cretaro, P.; Di Lorenzo, S.; Fantechi, R.; Fiorini, M.; Frezza, O.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Piandani, R.; Piccini, M.; Pontisso, L.; Rossetti, D.; Simula, F.; Sozzi, M.; Vicini, P.
2017-01-01
A commercial Graphics Processing Unit (GPU) is used to build a fast Level 0 (L0) trigger system tested parasitically with the TDAQ (Trigger and Data Acquisition systems) of the NA62 experiment at CERN. In particular, the parallel computing power of the GPU is exploited to perform real-time fitting in the Ring Imaging CHerenkov (RICH) detector. Direct GPU communication using an FPGA-based board has been used to reduce the data transmission latency. The performance of the system for multi-ring reconstruction obtained during the NA62 physics run will be presented.
Characterization of the faulted behavior of digital computers and fault tolerant systems
NASA Technical Reports Server (NTRS)
Bavuso, Salvatore J.; Miner, Paul S.
1989-01-01
A development status evaluation is presented for efforts conducted at NASA-Langley since 1977, toward the characterization of the latent fault in digital fault-tolerant systems. Attention is given to the practical, high speed, generalized gate-level logic system simulator developed, as well as to the validation methodology used for the simulator, on the basis of faultable software and hardware simulations employing a prototype MIL-STD-1750A processor. After validation, latency tests will be performed.
1992-09-01
to acquire or develop effective simulation tools to observe the behavior of a RISC implementation as it executes different types of programs. We choose... Performance Computer performance is measured by the amount of the time required to execute a program. Performance encompasses two types of time, elapsed time... and CPU time. Elapsed time is the time required to execute a program from start to finish. It includes latency of input/output activities such as
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, P.Y.; Hao, E.; Patt, Y.
Conditional branches incur a severe performance penalty in wide-issue, deeply pipelined processors. Speculative execution and predicated execution are two mechanisms that have been proposed for reducing this penalty. Speculative execution can completely eliminate the penalty associated with a particular branch, but requires accurate branch prediction to be effective. Predicated execution does not require accurate branch prediction to eliminate the branch penalty, but is not applicable to all branches and can increase the latencies within the program. This paper examines the performance benefit of using both mechanisms to reduce the branch execution penalty. Predicated execution is used to handle the hard-to-predict branches and speculative execution is used to handle the remaining branches. The hard-to-predict branches within the program are determined by profiling. We show that this approach can significantly reduce the branch execution penalty suffered by wide-issue processors.
Gilgamesh: A Multithreaded Processor-In-Memory Architecture for Petaflops Computing
NASA Technical Reports Server (NTRS)
Sterling, T. L.; Zima, H. P.
2002-01-01
Processor-in-Memory (PIM) architectures avoid the von Neumann bottleneck in conventional machines by integrating high-density DRAM and CMOS logic on the same chip. Parallel systems based on this new technology are expected to provide higher scalability, adaptability, robustness, fault tolerance and lower power consumption than current MPPs or commodity clusters. In this paper we describe the design of Gilgamesh, a PIM-based massively parallel architecture, and elements of its execution model. Gilgamesh extends existing PIM capabilities by incorporating advanced mechanisms for virtualizing tasks and data and providing adaptive resource management for load balancing and latency tolerance. The Gilgamesh execution model is based on macroservers, a middleware layer which supports object-based runtime management of data and threads allowing explicit and dynamic control of locality and load balancing. The paper concludes with a discussion of related research activities and an outlook to future work.
FPGA based control system for space instrumentation
NASA Astrophysics Data System (ADS)
Di Giorgio, Anna M.; Cerulli Irelli, Pasquale; Nuzzolo, Francesco; Orfei, Renato; Spinoglio, Luigi; Liu, Giovanni S.; Saraceno, Paolo
2008-07-01
The prototype for a general purpose FPGA based control system for space instrumentation is presented, with particular attention to the instrument control application software. The system HW is based on the LEON3FT processor, which gives the flexibility to configure the chip with only the necessary HW functionalities, from simple logic up to small dedicated processors. The instrument control SW is developed in ANSI C and for time critical (<10μs) commanding sequences implements an internal instructions sequencer, triggered via an interrupt service routine based on a HW high priority interrupt.
NASA Astrophysics Data System (ADS)
Parker, Steve C. J.; Hickman, Duncan L.; Smith, Moira I.
2015-05-01
Effective reconnaissance, surveillance and situational awareness, using dual band sensor systems, require the extraction, enhancement and fusion of salient features, with the processed video being presented to the user in an ergonomic and interpretable manner. HALO™ is designed to meet these requirements and provides an affordable, real-time, and low-latency image fusion solution on a low size, weight and power (SWAP) platform. The system has been progressively refined through field trials to increase its operating envelope and robustness. The result is a video processor that improves detection, recognition and identification (DRI) performance, whilst lowering operator fatigue and reaction times in complex and highly dynamic situations. This paper compares the performance of HALO™, both qualitatively and quantitatively, with conventional blended fusion for operation in degraded visual environments (DVEs), such as those experienced during ground and air-based operations. Although image blending provides a simple fusion solution, which explains its common adoption, the results presented demonstrate that its performance is poor compared to the HALO™ fusion scheme in DVE scenarios.
Smart trigger logic for focal plane arrays
Levy, James E; Campbell, David V; Holmes, Michael L; Lovejoy, Robert; Wojciechowski, Kenneth; Kay, Randolph R; Cavanaugh, William S; Gurrieri, Thomas M
2014-03-25
An electronic device includes a memory configured to receive data representing light intensity values from pixels in a focal plane array and a processor that analyzes the received data to determine which light values correspond to triggered pixels, where the triggered pixels are those pixels that meet a predefined set of criteria, and determines, for each triggered pixel, a set of neighbor pixels for which light intensity values are to be stored. The electronic device also includes a buffer that temporarily stores light intensity values for at least one previously processed row of pixels, so that when a triggered pixel is identified in a current row, light intensity values for the neighbor pixels in the previously processed row and for the triggered pixel are persistently stored, as well as a data transmitter that transmits the persistently stored light intensity values for the triggered and neighbor pixels to a data receiver.
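The row-buffer scheme described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the patented logic: the trigger criterion (a simple intensity threshold), the choice of the three adjacent columns in the previous row as neighbors, and the names `select_pixels` and `threshold` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

def select_pixels(frame: List[List[int]], threshold: int) -> Dict[Tuple[int, int], int]:
    """Return {(row, col): value} for triggered pixels and previous-row neighbors."""
    stored: Dict[Tuple[int, int], int] = {}
    prev_row: List[int] = []          # buffer holding the last processed row
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if value > threshold:     # hypothetical trigger criterion
                stored[(r, c)] = value
                if prev_row:
                    # Persist the neighbors from the buffered previous row.
                    for nc in (c - 1, c, c + 1):
                        if 0 <= nc < len(prev_row):
                            stored[(r - 1, nc)] = prev_row[nc]
        prev_row = row                # current row becomes the buffered row
    return stored

frame = [
    [1, 2, 3, 4],
    [5, 90, 6, 7],
    [8, 9, 10, 11],
]
result = select_pixels(frame, threshold=50)
print(sorted(result.items()))
```

Only one row needs to be buffered at a time, which is the point of the design: values that are neither triggered nor adjacent to a triggered pixel are discarded as soon as the next row has been processed.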
Moradi, Saber; Qiao, Ning; Stefanini, Fabio; Indiveri, Giacomo
2018-02-01
Neuromorphic computing systems comprise networks of neurons that use asynchronous events for both computation and communication. This type of representation offers several advantages in terms of bandwidth and power consumption in neuromorphic electronic systems. However, managing the traffic of asynchronous events in large scale systems is a daunting task, both in terms of circuit complexity and memory requirements. Here, we present a novel routing methodology that employs both hierarchical and mesh routing strategies and combines heterogeneous memory structures for minimizing both memory requirements and latency, while maximizing programming flexibility to support a wide range of event-based neural network architectures, through parameter configuration. We validated the proposed scheme in a prototype multicore neuromorphic processor chip that employs hybrid analog/digital circuits for emulating synapse and neuron dynamics together with asynchronous digital circuits for managing the address-event traffic. We present a theoretical analysis of the proposed connectivity scheme, describe the methods and circuits used to implement such scheme, and characterize the prototype chip. Finally, we demonstrate the use of the neuromorphic processor with a convolutional neural network for the real-time classification of visual symbols being flashed to a dynamic vision sensor (DVS) at high speed.
Reducing adaptive optics latency using Xeon Phi many-core processors
NASA Astrophysics Data System (ADS)
Barr, David; Basden, Alastair; Dipper, Nigel; Schwartz, Noah
2015-11-01
The next generation of Extremely Large Telescopes (ELTs) for astronomy will rely heavily on the performance of their adaptive optics (AO) systems. Real-time control is at the heart of the critical technologies that will enable telescopes to deliver the best possible science and will require a very significant extrapolation from current AO hardware existing for 4-10 m telescopes. Investigating novel real-time computing architectures and testing their eligibility against anticipated challenges is one of the main priorities of technology development for the ELTs. This paper investigates the suitability of the Intel Xeon Phi, which is a commercial off-the-shelf hardware accelerator. We focus on wavefront reconstruction performance, implementing a straightforward matrix-vector multiplication (MVM) algorithm. We present benchmarking results of the Xeon Phi on a real-time Linux platform, both as a standalone processor and integrated into an existing real-time controller (RTC). Performance of single and multiple Xeon Phis is investigated. We show that this technology has the potential of greatly reducing the mean latency and variations in execution time (jitter) of large AO systems. We present both a detailed performance analysis of the Xeon Phi for a typical E-ELT first-light instrument along with a more general approach that enables us to extend to any AO system size. We show that systematic and detailed performance analysis is an essential part of testing novel real-time control hardware to guarantee optimal science results.
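The MVM reconstruction and the two quantities benchmarked in the abstract above (mean latency and jitter) can be illustrated with a toy sketch. It is pure Python on a tiny matrix, so the timings say nothing about Xeon Phi performance; taking jitter as the standard deviation of execution time is an assumption, and the names `mvm` and `benchmark` are hypothetical.

```python
import statistics
import time
from typing import List, Tuple

def mvm(matrix: List[List[float]], vector: List[float]) -> List[float]:
    """Multiply a (commands x slopes) control matrix by a slope vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def benchmark(matrix: List[List[float]], vector: List[float],
              iterations: int = 200) -> Tuple[float, float]:
    """Return (mean latency, jitter) in seconds over repeated MVM calls."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        mvm(matrix, vector)
        samples.append(time.perf_counter() - start)
    # Jitter is taken here as the population standard deviation (assumption).
    return statistics.mean(samples), statistics.pstdev(samples)

# Toy sizes only; real AO control matrices are far larger.
control_matrix = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]
slopes = [2.0, 4.0, 1.0]
commands = mvm(control_matrix, slopes)
mean_latency, jitter = benchmark(control_matrix, slopes)
print(commands)  # [4.0, 5.0]
```

Reporting the spread alongside the mean matters for real-time control because a loop with a low average latency can still miss its deadline on an occasional slow iteration.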
Using triggered operations to offload collective communication operations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barrett, Brian W.; Hemmert, K. Scott; Underwood, Keith Douglas
2010-04-01
Efficient collective operations are a major component of application scalability. Offload of collective operations onto the network interface reduces many of the latencies that are inherent in network communications and, consequently, reduces the time to perform the collective operation. To support offload, it is desirable to expose semantic building blocks that are simple to offload and yet powerful enough to implement a variety of collective algorithms. This paper presents the implementation of barrier and broadcast leveraging triggered operations - a semantic building block for collective offload. Triggered operations are shown to be both semantically powerful and capable of improving performance.
ATLAS level-1 calorimeter trigger: Run-2 performance and Phase-1 upgrades
NASA Astrophysics Data System (ADS)
Carlson, Ben; Hong, Tae Min; Atlas Collaboration
2017-01-01
The Run-2 performance and Phase-1 upgrade are presented for the hardware-based level-1 calorimeter trigger (L1Calo) for the ATLAS Experiment. This trigger has a latency of about 2.2 microseconds to make a decision to help ATLAS select about 100 kHz of the most interesting collisions from the nominal LHC rate of 40 MHz. We summarize the upgrade after Run-1 (2009-2012) and discuss its performance in Run-2 (2015-current). We also outline the on-going Phase-1 upgrade for the next run (2021-2024) and its expected performance.
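The figures quoted in the abstract above imply a rejection factor and a number of bunch crossings that elapse while a decision is pending. The short calculation below uses only the quoted values (40 MHz, about 100 kHz, about 2.2 microseconds); the variable names are illustrative.

```python
collision_rate_hz = 40e6   # nominal LHC rate quoted in the abstract
accept_rate_hz = 100e3     # approximate rate of selected collisions
latency_s = 2.2e-6         # approximate decision latency

# Fraction of collisions that must be rejected, expressed as a ratio.
rejection_factor = collision_rate_hz / accept_rate_hz

# Time between consecutive bunch crossings at the nominal rate.
bunch_spacing_s = 1.0 / collision_rate_hz

# Number of crossings that occur while one decision is being made.
crossings_in_flight = latency_s / bunch_spacing_s

print(round(rejection_factor), round(bunch_spacing_s * 1e9), round(crossings_in_flight))
# 400 25 88
```

In other words, roughly one collision in 400 is kept, and data from about 88 consecutive crossings are in flight during each decision.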
Verwey, Willem B; Lammens, Robin; van Honk, Jack
2002-01-01
Participants practiced two discrete six-key sequences for a total of 420 trials. The 1 x 6 sequence had a unique order of key presses while the 2 x 3 sequence involved repetition of a three-key segment. Both sequences showed a long interkey interval halfway through the sequence, indicating hierarchical sequence control in that not only the 2 x 3 but also the 1 x 6 sequence was executed as two successive motor chunks. Besides, the second part of both sequences was executed faster than the first part. This supports the earlier notion of a motor processor executing the elements of familiar motor chunks and a cognitive processor triggering either these motor chunks or individual sequence elements. Low-frequency, off-line transcranial magnetic stimulation (TMS) of the supplementary motor area (SMA) counteracted normal improvement with practice of key presses at all sequence positions. Together, these results are in line with the notion that with moderate practice, the SMA executes short sequence fragments that are concatenated by other brain structures.
Real-time implementing wavefront reconstruction for adaptive optics
NASA Astrophysics Data System (ADS)
Wang, Caixia; Li, Mei; Wang, Chunhong; Zhou, Luchun; Jiang, Wenhan
2004-12-01
The capability for real-time wavefront reconstruction is important for an adaptive optics (AO) system. The system bandwidth and the real-time processing ability of the wavefront processor are mainly limited by the speed of calculation. The system requires a sufficient number of subapertures and a high sampling frequency to compensate for atmospheric turbulence, and the number of reconstruction operations increases accordingly. Since the performance of an AO system improves as calculation latency decreases, it is necessary to study how to increase the speed of wavefront reconstruction. There are two ways to speed up the reconstruction. One is to transform the wavefront reconstruction matrix, for example by wavelet or FFT. The other is to enhance the performance of the processing element. Analysis shows that the former method cuts latency at the cost of reconstruction precision. In this article, the latter method is adopted. Based on the characteristics of the wavefront reconstruction algorithm, a systolic array is designed in an FPGA to implement real-time wavefront reconstruction. The system delay is reduced greatly by the use of pipelining and parallel processing. The minimum reconstruction latency is the reconstruction calculation time of one subaperture.
Multi-Threaded Algorithms for GPGPU in the ATLAS High Level Trigger
NASA Astrophysics Data System (ADS)
Conde Muíño, P.; ATLAS Collaboration
2017-10-01
General purpose Graphics Processor Units (GPGPU) are being evaluated for possible future inclusion in an upgraded ATLAS High Level Trigger farm. We have developed a demonstrator including GPGPU implementations of Inner Detector and Muon tracking and Calorimeter clustering within the ATLAS software framework. ATLAS is a general purpose particle physics experiment located on the LHC collider at CERN. The ATLAS Trigger system consists of two levels, with Level-1 implemented in hardware and the High Level Trigger implemented in software running on a farm of commodity CPU. The High Level Trigger reduces the trigger rate from the 100 kHz Level-1 acceptance rate to 1.5 kHz for recording, requiring an average per-event processing time of ∼ 250 ms for this task. The selection in the high level trigger is based on reconstructing tracks in the Inner Detector and Muon Spectrometer and clusters of energy deposited in the Calorimeter. Performing this reconstruction within the available farm resources presents a significant challenge that will increase significantly with future LHC upgrades. During the LHC data taking period starting in 2021, luminosity will reach up to three times the original design value. Luminosity will increase further to 7.5 times the design value in 2026 following LHC and ATLAS upgrades. Corresponding improvements in the speed of the reconstruction code will be needed to provide the required trigger selection power within affordable computing resources. Key factors determining the potential benefit of including GPGPU as part of the HLT processor farm are: the relative speed of the CPU and GPGPU algorithm implementations; the relative execution times of the GPGPU algorithms and serial code remaining on the CPU; the number of GPGPU required, and the relative financial cost of the selected GPGPU. We give a brief overview of the algorithms implemented and present new measurements that compare the performance of various configurations exploiting GPGPU cards.
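The rates quoted in the abstract above give a feel for the size of the farm. Applying Little's law (events in flight = arrival rate x mean processing time) to the 100 kHz input and the roughly 250 ms average processing time yields the number of events being processed concurrently. The second part of the sketch is purely hypothetical: the offloaded fraction and the GPGPU speedup are made-up illustration values, not measurements from the paper, and `accelerated_time` is an invented helper that applies an Amdahl-style estimate.

```python
l1_rate_hz = 100e3      # Level-1 acceptance rate quoted in the abstract
output_rate_hz = 1.5e3  # recording rate quoted in the abstract
mean_time_s = 0.250     # approximate average per-event processing time

# Little's law: events being processed at any instant.
concurrent_events = l1_rate_hz * mean_time_s

# Rate reduction achieved by the High Level Trigger.
reduction = l1_rate_hz / output_rate_hz

def accelerated_time(total_s: float, offload_fraction: float, speedup: float) -> float:
    """Amdahl-style estimate: only the offloaded fraction gets faster."""
    return total_s * ((1.0 - offload_fraction) + offload_fraction / speedup)

# Hypothetical values, for illustration only.
example_time_s = accelerated_time(mean_time_s, offload_fraction=0.5, speedup=10.0)

print(round(concurrent_events), round(reduction, 1), round(example_time_s, 4))
# 25000 66.7 0.1375
```

If each event occupies one CPU core, about 25,000 cores are busy at once. The Amdahl-style term shows why the abstract lists the serial code remaining on the CPU as a key factor: with half of the work left on the CPU, even a tenfold accelerator speedup does not halve the per-event time.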
L1 track trigger for the CMS HL-LHC upgrade using AM chips and FPGAs
NASA Astrophysics Data System (ADS)
Fedi, Giacomo
2017-08-01
The increase of luminosity at the HL-LHC will require the introduction of tracker information in CMS's Level-1 trigger system to maintain an acceptable trigger rate when selecting interesting events, despite the order of magnitude increase in minimum bias interactions. To meet the latency requirements, dedicated hardware has to be used. This paper presents the results of tests of a prototype system (the Pattern Recognition Mezzanine) as the core of pattern recognition and track fitting for the CMS experiment, combining the power of both associative memory custom ASICs and modern Field Programmable Gate Array (FPGA) devices. The mezzanine uses the latest available associative memory devices (AM06) and the most modern Xilinx Ultrascale FPGAs. The results of the test for a complete tower comprising about 0.5 million patterns are presented, using as input simulated events traversing the upgraded CMS detector. The paper shows the performance of the pattern matching, track finding and track fitting, along with the latency and processing time needed. The pT resolution over pT of the muons measured using the reconstruction algorithm is of the order of 1% in the range 3-100 GeV/c.
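The pattern-matching step described above can be illustrated in software. This is a generic sketch of associative-memory style matching, not the AM06 design: a bank of stored patterns (one coarse hit address per detector layer) is compared with the hits of an event, and a pattern is reported as a candidate "road" when enough layers match. The names `match_roads` and `min_layers` and the tuple encoding are assumptions made for illustration; real associative memory hardware compares all patterns in parallel, whereas this loop does so sequentially.

```python
from typing import List, Sequence, Set, Tuple

# One coarse-resolution hit address per detector layer (hypothetical encoding).
Pattern = Tuple[int, ...]


def match_roads(bank: Sequence[Pattern],
                hits_per_layer: Sequence[Set[int]],
                min_layers: int) -> List[Pattern]:
    """Return the stored patterns matched by the event in at least min_layers layers."""
    roads = []
    for pattern in bank:
        matched = sum(
            1 for layer, address in enumerate(pattern)
            if address in hits_per_layer[layer]
        )
        if matched >= min_layers:
            roads.append(pattern)
    return roads


if __name__ == "__main__":
    bank = [(1, 2, 3), (1, 5, 3), (7, 8, 9)]
    event = [{1}, {2, 5}, {3}]  # hits found in layers 0, 1 and 2
    print(match_roads(bank, event, min_layers=3))
```

Matched roads would then be handed to a track-fitting stage, which in the system described above runs on the FPGA.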
NASA Technical Reports Server (NTRS)
Thronson, Harley; Valinia, Azita; Bleacher, Jacob; Eigenbrode, Jennifer; Garvin, Jim; Petro, Noah
2014-01-01
We suggest that the International Space Station be used to examine the application and validation of low-latency telepresence for surface exploration from space as an alternative, precursor, or potentially as an adjunct to astronaut "boots on the ground." To this end, controlled experiments that build upon and complement ground-based analog field studies will be critical for assessing the effects of different latencies (0 to 500 milliseconds), task complexity, and alternate forms of feedback to the operator. These experiments serve as an example of a pathfinder for NASA's roadmap of missions to Mars with low-latency telerobotic exploration as a precursor to astronauts landing on the surface to conduct geological tasks.
A digital retina-like low-level vision processor.
Mertoguno, S; Bourbakis, N G
2003-01-01
This correspondence presents the basic design and the simulation of a low level multilayer vision processor that emulates to some degree the functional behavior of a human retina. This retina-like multilayer processor is the lower part of an autonomous self-organized vision system, called Kydon, that could be used on visually impaired people with a damaged visual cerebral cortex. The Kydon vision system, however, is not presented in this paper. The retina-like processor consists of four major layers, where each of them is an array processor based on hexagonal, autonomous processing elements that perform a certain set of low level vision tasks, such as smoothing and light adaptation, edge detection, segmentation, line recognition and region-graph generation. At each layer, the array processor is a 2D array of k × m hexagonal identical autonomous cells that simultaneously execute certain low level vision tasks. Thus, the hardware design and the simulation at the transistor level of the processing elements (PEs) of the retina-like processor and its simulated functionality with illustrative examples are provided in this paper.
Voltage scheduling for low power/energy
NASA Astrophysics Data System (ADS)
Manzak, Ali
2001-07-01
Power considerations have become an increasingly dominant factor in the design of both portable and desktop systems. An effective way to reduce power consumption is to lower the supply voltage, since power depends quadratically on voltage. This dissertation considers the problem of lowering the supply voltage at (i) the system level and at (ii) the behavioral level. At the system level, the voltage of the variable voltage processor is dynamically changed with the work load. Processors with limited sized buffers as well as those with very large buffers are considered. Given the task arrival times, deadline times, execution times, periods and switching activities, task scheduling algorithms that minimize energy or peak power are developed for the processors equipped with very large buffers. A relation between the operating voltages of the tasks for minimum energy/power is determined using the Lagrange multiplier method, and an iterative algorithm that utilizes this relation is developed. Experimental results show that the voltage assignment obtained by the proposed algorithm is very close (0.1% error) to that of the optimal energy assignment and the optimal peak power (1% error) assignment. Next, on-line and off-line minimum energy task scheduling algorithms are developed for processors with limited sized buffers. These algorithms have polynomial time complexity and present optimal (off-line) and close-to-optimal (on-line) solutions. A procedure to calculate the minimum buffer size given information about the size of the task (maximum, minimum), execution time (best case, worst case) and deadlines is also presented. At the behavioral level, resources operating at multiple voltages are used to minimize power while maintaining the throughput. 
Such a scheme has the advantage of allowing modules on the critical paths to be assigned to the highest voltage levels (thus meeting the required timing constraints) while allowing modules on non-critical paths to be assigned to lower voltage levels (thus reducing the power consumption). A polynomial time resource and latency constrained scheduling algorithm is developed to distribute the available slack among the nodes such that power consumption is minimum. The algorithm is iterative and utilizes the slack based on the Lagrange multiplier method.
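The quadratic relation between supply voltage and power mentioned above can be illustrated with the standard CMOS dynamic-power model P = C·V²·f. This is a minimal sketch only: the capacitance, voltage and cycle count are hypothetical values chosen for illustration, and the dissertation's Lagrange-multiplier scheduling algorithm is not reproduced here.

```python
def dynamic_power(c_eff: float, v_dd: float, freq_hz: float) -> float:
    """Dynamic switching power of a CMOS circuit: P = C * V^2 * f."""
    return c_eff * v_dd ** 2 * freq_hz


def task_energy(c_eff: float, v_dd: float, cycles: float) -> float:
    """Switching energy of a task that needs a fixed number of clock cycles."""
    return c_eff * v_dd ** 2 * cycles


if __name__ == "__main__":
    c_eff = 1e-9    # effective switched capacitance in farads (hypothetical)
    cycles = 1e6    # clock cycles required by the task (hypothetical)
    v_high = 3.3    # nominal supply voltage in volts (hypothetical)
    v_low = 0.7 * v_high
    ratio = task_energy(c_eff, v_low, cycles) / task_energy(c_eff, v_high, cycles)
    print(f"energy at 70% voltage relative to nominal: {ratio:.2f}")
```

In this model a task run at 70% of the nominal voltage uses 49% of the switching energy. The price is a slower clock, which is why the schedulers described above assign lower voltages only where deadlines or slack on non-critical paths allow it.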
Advanced LIGO low-latency searches
NASA Astrophysics Data System (ADS)
Kanner, Jonah; LIGO Scientific Collaboration, Virgo Collaboration
2016-06-01
Advanced LIGO recently made the first detection of gravitational waves from merging binary black holes. The signal was first identified by a low-latency analysis, which identifies gravitational-wave transients within a few minutes of data collection. More generally, Advanced LIGO transients are sought with a suite of automated tools, which collectively identify events, evaluate statistical significance, estimate source position, and attempt to characterize source properties. This low-latency effort is enabling a broad multi-messenger approach to the science of compact object mergers and other transients. This talk will give an overview of the low-latency methodology and recent results.
Can High Bandwidth and Latency Justify Large Cache Blocks in Scalable Multiprocessors?
1994-01-01
400 MB/second. Dubnicki’s work used trace-driven simulation, with traces collected on an 8-processor machine. We would expect such small-scale... [Figure 17: Miss rate of Ind Blocked LU. Figure 18: MCPR of Ind Blocked LU.] ...overall miss rate of TGauss is a factor of... easily. This approach assumes that the model parameters we collect from simulations with infinite bandwidth (such as the miss rate and the
NASA Astrophysics Data System (ADS)
Covarelli, R.
2009-12-01
At the startup of the LHC, the CMS data acquisition is expected to be able to sustain an event readout rate of up to 100 kHz from the Level-1 trigger. These events will be read into a large processor farm which will run the "High-Level Trigger" (HLT) selection algorithms and will output a rate of about 150 Hz for permanent data storage. In this report HLT performance is shown for selections based on muons, electrons, photons, jets, missing transverse energy, τ leptons and b quarks: expected efficiencies, background rates and CPU time consumption are reported as well as relaxation criteria foreseen for an LHC startup instantaneous luminosity.
VLITE-Fast: A Real-time, 350 MHz Commensal VLA Survey for Fast Transients
NASA Astrophysics Data System (ADS)
Kerr, Matthew; Ray, Paul S.; Kassim, Namir E.; Clarke, Tracy; Deneva, Julia; Polisensky, Emil
2018-01-01
The VLITE (VLA Low Band Ionosphere and Transient Experiment; http://vlite.nrao.edu) program operates commensally during all Very Large Array observations, collecting data from 320 to 384 MHz. Recently expanded to include 16 antennas, the large field of view and huge time on sky offer good coverage of the transient, low-frequency sky. We describe the VLITE-Fast system, a GPU-based signal processor capable of detecting short (<1s) transients in real time and triggering recording of baseband voltage for offline imaging. In the case of Fast Radio Bursts, this offers the opportunity for discovering host galaxies of non-repeating FRBs, and in the case of single pulses, the identification of pulsar positions for dedicated follow-up. We describe the observing system, techniques for mitigating interference, and initial results from searches for FRBs.
NASA Technical Reports Server (NTRS)
Davies, Diane K.; Brown, Molly E.; Green, David S.; Michael, Karen A.; Murray, John J.; Justice, Christopher O.; Soja, Amber J.
2016-01-01
It is widely accepted that time-sensitive remote sensing data serve the needs of decision makers in the applications communities and yet to date, a comprehensive portfolio of NASA low latency datasets has not been available. This paper will describe the NASA low latency, or Near-Real Time (NRT), portfolio, how it was developed and plans to make it available online through a portal that leverages the existing EOSDIS capabilities such as the Earthdata Search Client (https://search.earthdata.nasa.gov), the Common Metadata Repository (CMR) and the Global Imagery Browse Service (GIBS). This paper will report on the outcomes of a NASA Workshop to Develop a Portfolio of Low Latency Datasets for Time-Sensitive Applications (27-29 September 2016 at NASA Langley Research Center, Hampton VA). The paper will also summarize findings and recommendations from the meeting outlining perceived shortfalls and opportunities for low latency research and application science.
Research on low-latency MAC protocols for wireless sensor networks
NASA Astrophysics Data System (ADS)
He, Chenguang; Sha, Xuejun; Lee, Chankil
2007-11-01
Energy efficiency should not be the only design goal in MAC protocols for wireless sensor networks, which involve the use of battery-operated computing and sensing devices. Low-latency operation becomes as important as energy efficiency when the traffic load is very heavy or when real-time constraints apply, as in tracking or locating applications. This paper introduces some causes of the traditional time delays inherent in a multi-hop network using existing WSN MAC protocols, illuminates the importance of low-latency MAC design for wireless sensor networks, and presents three MACs as examples of low-latency protocols designed specifically to address sleep delay, wait delay and wakeup delay in wireless sensor networks, respectively. The paper also discusses design trade-offs with emphasis on low latency and points out their advantages and disadvantages, together with some design considerations and suggestions for MAC protocols for future applications and research.
Low-power, transparent optical network interface for high bandwidth off-chip interconnects.
Liboiron-Ladouceur, Odile; Wang, Howard; Garg, Ajay S; Bergman, Keren
2009-04-13
The recent emergence of multicore architectures and chip multiprocessors (CMPs) has accelerated the bandwidth requirements in high-performance processors for both on-chip and off-chip interconnects. For next generation computing clusters, the delivery of scalable power efficient off-chip communications to each compute node has emerged as a key bottleneck to realizing the full computational performance of these systems. The power dissipation is dominated by the off-chip interface and the necessity to drive high-speed signals over long distances. We present a scalable photonic network interface approach that fully exploits the bandwidth capacity offered by optical interconnects while offering significant power savings over traditional E/O and O/E approaches. The power-efficient interface optically aggregates electronic serial data streams into a multiple WDM channel packet structure at time-of-flight latencies. We demonstrate a scalable optical network interface with 70% improvement in power efficiency for a complete end-to-end PCI Express data transfer.
Low Latency Audio Video: Potentials for Collaborative Music Making through Distance Learning
ERIC Educational Resources Information Center
Riley, Holly; MacLeod, Rebecca B.; Libera, Matthew
2016-01-01
The primary purpose of this study was to examine the potential of LOw LAtency (LOLA), a low latency audio visual technology designed to allow simultaneous music performance, as a distance learning tool for musical styles in which synchronous playing is an integral aspect of the learning process (e.g., jazz, folk styles). The secondary purpose was…
Kim, Hyoji; Choi, Hoyun
2015-01-01
ABSTRACT Epstein-Barr virus (EBV) is a human gammaherpesvirus associated with a variety of tumor types. EBV can establish latency or undergo lytic replication in host cells. In general, EBV remains latent in tumors and expresses a limited repertoire of latent proteins to avoid host immune surveillance. When the lytic cycle is triggered by some as-yet-unknown form of stimulation, lytic gene expression and progeny virus production commence. Thus far, the exact mechanism of EBV latency maintenance and the in vivo triggering signal for lytic induction have yet to be elucidated. Previously, we have shown that the EBV microRNA miR-BART20-5p directly targets the immediate early genes BRLF1 and BZLF1 as well as Bcl-2-associated death promoter (BAD) in EBV-associated gastric carcinoma. In this study, we found that both mRNA and protein levels of BRLF1 and BZLF1 were suppressed in cells following BAD knockdown and increased after BAD overexpression. Progeny virus production was also downregulated by specific knockdown of BAD. Our results demonstrated that caspase-3-dependent apoptosis is a prerequisite for BAD-mediated EBV lytic cycle induction. Therefore, our data suggest that miR-BART20-5p plays an important role in latency maintenance and tumor persistence of EBV-associated gastric carcinoma by inhibiting BAD-mediated caspase-3-dependent apoptosis, which would trigger immediate early gene expression. IMPORTANCE EBV has an ability to remain latent in host cells, including EBV-associated tumor cells hiding from immune surveillance. However, the exact molecular mechanisms of EBV latency maintenance remain poorly understood. Here, we demonstrated that miR-BART20-5p inhibited the expression of EBV immediate early genes indirectly, by suppressing BAD-induced caspase-3-dependent apoptosis, in addition to directly, as we previously reported. 
Our study suggests that EBV-associated tumor cells might endure apoptotic stress to some extent and remain latent with the aid of miR-BART20-5p. Blocking the expression or function of BART20-5p may expedite EBV-associated tumor cell death via immune attack and apoptosis. PMID:26581978
Warabi, Tateo; Furuyama, Hiroyasu; Sugai, Eri; Kato, Masamichi; Yanagisawa, Nobuo
2018-01-01
This study examined how gait bradykinesia is changed by the motor programming in Parkinson's disease. Thirty-five idiopathic Parkinson's disease patients and nine age-matched healthy subjects participated in this study. After the patients fixated on a visual-fixation target (conditioning-stimulus), the voluntary-gait was triggered by a visual on-stimulus. While the subject walked on a level floor, soleus, tibialis anterior EMG latencies, and the y-axis-vector of the sole-floor reaction force were examined. Three paradigms were used to distinguish between the off-/on-latencies. The gap-task: the visual-fixation target was turned off 200 ms before the on-stimulus was engaged (resulting in a 200 ms-gap). EMG latency was not influenced by the visual-fixation target. The overlap-task: the on-stimulus was turned on during the visual-fixation target presentation (200 ms-overlap). The no-gap-task: the fixation target was turned off and the on-stimulus was turned on simultaneously. The onset of EMG pause following the tonic soleus EMG was defined as the off-latency of posture (termination). The onset of the tibialis anterior EMG burst was defined as the on-latency of gait (initiation). In the gap-task, the on-latency was unchanged in all of the subjects. In Parkinson's disease, the visual-fixation target prolonged both the off-/on-latencies in the overlap-task. In all tasks, the off-latency was prolonged and the off-/on-latencies were unsynchronized, which changed the synergic movement to a slow, short-step-gait. The synergy of gait was regulated by two independent sensory-motor programs of the off- and on-latency levels. In Parkinson's disease, the delayed gait initiation was due to the difficulty in terminating the sensory-motor program which controls the subject's fixation. The dynamic gait bradykinesia was involved in the difficulty (long off-latency) in terminating the motor program of the prior posture/movement.
Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Secchi, Simone; Tumeo, Antonino; Villa, Oreste
Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has been proved to be a promising approach to achieve good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation is performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.
Tests with beam setup of the TileCal phase-II upgrade electronics
NASA Astrophysics Data System (ADS)
Reward Hlaluku, Dingane
2017-09-01
The LHC has planned a series of upgrades culminating in the High Luminosity LHC which will have an average luminosity 5-7 times larger than the nominal Run-2 value. The ATLAS Tile calorimeter plans to introduce a new readout architecture by completely replacing the back-end and front-end electronics for the High Luminosity LHC. The photomultiplier signals will be fully digitized and transferred for every bunch crossing to the off-detector Tile PreProcessor. The Tile PreProcessor will further provide preprocessed digital data to the first level of trigger with improved spatial granularity and energy resolution in contrast to the current analog trigger signals. A single super-drawer module commissioned with the phase-II upgrade electronics is to be inserted into the real detector to evaluate and qualify the new readout and trigger concepts in the overall ATLAS data acquisition system. This new super-drawer, so-called hybrid Demonstrator, must provide analog trigger signals for backward compatibility with the current system. This Demonstrator drawer has been inserted into a Tile calorimeter module prototype to evaluate the performance in the lab. In parallel, one more module has been instrumented with two other front-end electronics options based on custom ASICs (QIE and FATALIC) which are under evaluation. These two modules together with three other modules composed of the current system electronics were exposed to different particles and energies in three test-beam campaigns during 2015 and 2016.
Herpesvirus Entry into Host Cells Mediated by Endosomal Low pH.
Nicola, Anthony V
2016-09-01
Herpesviral pathogenesis stems from infection of multiple cell types including the site of latency and cells that support lytic replication. Herpesviruses utilize distinct cellular pathways, including low pH endocytic pathways, to enter different pathophysiologically relevant target cells. This review details the impact of the mildly acidic milieu of endosomes on the entry of herpesviruses, with particular emphasis on herpes simplex virus 1 (HSV-1). Epithelial cells, the portal of primary HSV-1 infection, support entry via low pH endocytosis mechanisms. Mildly acidic pH triggers reversible conformational changes in the HSV-1 class III fusion protein glycoprotein B (gB). In vitro treatment of herpes simplex virions with a similar pH range inactivates infectivity, likely by prematurely activating the viral entry machinery in the absence of a target membrane. How a given herpesvirus mediates both low pH and pH-independent entry events is a key unresolved question. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Scalable NIC-based reduction on large-scale clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moody, A.; Fernández, J. C.; Petrini, F.
2003-01-01
Many parallel algorithms require efficient support for reduction collectives. Over the years, researchers have developed optimal reduction algorithms by taking into account system size, data size, and complexities of reduction operations. However, all of these algorithms have assumed the fact that the reduction processing takes place on the host CPU. Modern Network Interface Cards (NICs) sport programmable processors with substantial memory and thus introduce a fresh variable into the equation. This raises the following interesting challenge: Can we take advantage of modern NICs to implement fast reduction operations? In this paper, we take on this challenge in the context of large-scale clusters. Through experiments on the 960-node, 1920-processor ASCI Linux Cluster (ALC) located at the Lawrence Livermore National Laboratory, we show that NIC-based reductions indeed perform with reduced latency and improved consistency over host-based algorithms for the common case and that these benefits scale as the system grows. In the largest configuration tested (1812 processors), our NIC-based algorithm can sum a single element vector in 73 μs with 32-bit integers and in 118 μs with 64-bit floating-point numbers. These results represent an improvement, respectively, of 121% and 39% with respect to the production-level MPI library.
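Reduction collectives of the kind discussed above are commonly organised as a binomial tree, which finishes in ⌈log2 P⌉ communication rounds for P participants. The sketch below simulates that communication pattern in plain Python on a single machine. It is a generic illustration and not the NIC-based algorithm of the paper; `tree_reduce` is a hypothetical name.

```python
import math
from typing import Callable, List, Tuple


def tree_reduce(values: List[int], op: Callable[[int, int], int]) -> Tuple[int, int]:
    """Combine one value per rank onto rank 0; return (result, number of rounds)."""
    if not values:
        raise ValueError("need at least one value")
    vals = list(values)
    n = len(vals)
    rounds = 0
    stride = 1
    while stride < n:
        # In each round, every rank that is a multiple of 2*stride
        # receives from the rank `stride` positions above it.
        for receiver in range(0, n, 2 * stride):
            sender = receiver + stride
            if sender < n:
                vals[receiver] = op(vals[receiver], vals[sender])
        stride *= 2
        rounds += 1
    return vals[0], rounds


if __name__ == "__main__":
    ranks = 1812  # largest configuration mentioned in the abstract
    total, rounds = tree_reduce(list(range(ranks)), lambda a, b: a + b)
    print(total == sum(range(ranks)), rounds, math.ceil(math.log2(ranks)))
```

For 1812 participants the tree needs 11 rounds, so the per-round cost of each combine-and-forward step dominates the total latency. That is the step a programmable NIC could take over from the host CPU.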
Oweiss, Karim G
2006-07-01
This paper suggests a new approach for data compression during extracutaneous transmission of neural signals recorded by high-density microelectrode array in the cortex. The approach is based on exploiting the temporal and spatial characteristics of the neural recordings in order to strip the redundancy and infer the useful information early in the data stream. The proposed signal processing algorithms augment current filtering and amplification capability and may be a viable replacement to on chip spike detection and sorting currently employed to remedy the bandwidth limitations. Temporal processing is devised by exploiting the sparseness capabilities of the discrete wavelet transform, while spatial processing exploits the reduction in the number of physical channels through quasi-periodic eigendecomposition of the data covariance matrix. Our results demonstrate that substantial improvements are obtained in terms of lower transmission bandwidth, reduced latency and optimized processor utilization. We also demonstrate the improvements qualitatively in terms of superior denoising capabilities and higher fidelity of the obtained signals.
Measurement of fault latency in a digital avionic miniprocessor
NASA Technical Reports Server (NTRS)
Mcgough, J. G.; Swern, F. L.
1981-01-01
The results of fault injection experiments utilizing a gate-level emulation of the central processor unit of the Bendix BDX-930 digital computer are presented. The failure detection coverage of comparison-monitoring and a typical avionics CPU self-test program was determined. The specific tasks and experiments included: (1) inject randomly selected gate-level and pin-level faults and emulate six software programs using comparison-monitoring to detect the faults; (2) based upon the derived empirical data develop and validate a model of fault latency that will forecast a software program's detecting ability; (3) given a typical avionics self-test program, inject randomly selected faults at both the gate-level and pin-level and determine the proportion of faults detected; (4) determine why faults were undetected; (5) recommend how the emulation can be extended to multiprocessor systems such as SIFT; and (6) determine the proportion of faults detected by a uniprocessor BIT (built-in-test) irrespective of self-test.
Methods for compressible fluid simulation on GPUs using high-order finite differences
NASA Astrophysics Data System (ADS)
Pekkilä, Johannes; Väisälä, Miikka S.; Käpylä, Maarit J.; Käpylä, Petri J.; Anjum, Omer
2017-08-01
We focus on implementing and optimizing a sixth-order finite-difference solver for simulating compressible fluids on a GPU using third-order Runge-Kutta integration. Since graphics processing units perform well in data-parallel tasks, this makes them an attractive platform for fluid simulation. However, high-order stencil computation is memory-intensive with respect to both main memory and the caches of the GPU. We present two approaches for simulating compressible fluids using 55-point and 19-point stencils. We seek to reduce the requirements for memory bandwidth and cache size in our methods by using cache blocking and decomposing a latency-bound kernel into several bandwidth-bound kernels. Our fastest implementation is bandwidth-bound and integrates 343 million grid points per second on a Tesla K40t GPU, achieving a 3.6× speedup over a comparable hydrodynamics solver benchmarked on two Intel Xeon E5-2690v3 processors. Our alternative GPU implementation is latency-bound and achieves the rate of 168 million updates per second.
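To make the stencil idea concrete, here is a one-dimensional sketch of a sixth-order central finite difference for the first derivative, using the standard textbook weights. It is plain Python on the CPU rather than a GPU kernel, it assumes a periodic grid, and the name `ddx6` is hypothetical; none of this is taken from the paper's code.

```python
import math
from typing import List, Sequence

# Standard sixth-order central-difference weights for the first derivative,
# for grid offsets -3..3, to be divided by the grid spacing h.
COEFFS = (-1 / 60, 3 / 20, -3 / 4, 0.0, 3 / 4, -3 / 20, 1 / 60)


def ddx6(f: Sequence[float], h: float) -> List[float]:
    """Sixth-order first derivative of samples f on a periodic grid with spacing h."""
    n = len(f)
    return [
        sum(c * f[(i + k - 3) % n] for k, c in enumerate(COEFFS)) / h
        for i in range(n)
    ]


if __name__ == "__main__":
    n = 64
    h = 2 * math.pi / n
    samples = [math.sin(i * h) for i in range(n)]
    deriv = ddx6(samples, h)
    max_err = max(abs(d - math.cos(i * h)) for i, d in enumerate(deriv))
    print(f"max error against cos(x): {max_err:.2e}")
```

Each output point reads six neighbours along one axis. Three such axis-aligned stencils sharing one centre touch 3 × 6 + 1 = 19 points, which is consistent with the 19-point figure quoted above, although the abstract does not spell out the stencil layout.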
A new scalable modular data acquisition system for SPECT (PET)
NASA Astrophysics Data System (ADS)
Stenstrom, P.; Rillbert, A.; Bergquist, M.; Habte, F.; Bohm, C.; Larsson, S. A.
1998-06-01
Describes a modular decentralized data acquisition system that continuously samples shaped PMT pulses from a SPECT detector. The pulse waveform data are used by signal processors to accurately reconstruct amplitude and time for each scintillation event. Data acquisition for a PMT channel is triggered in two alternative ways, either when its own signal exceeds a selected digital threshold, or when it receives a trigger pulse from one of its neighboring PMTs. The triggered region is restricted to seven, thirteen or nineteen neighboring PMT channels. Each acquisition module supports three PMT channels and connects to all other modules and a reconstruction computer via Firewire to cover the 72 channels in the Stockholm University/Karolinska Hospital cylindrical SPECT camera.
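The two trigger conditions described above (own signal over threshold, or a trigger pulse from a neighbour) can be sketched as a small decision function. This is a minimal illustration under stated assumptions: only channels that exceed the threshold themselves send trigger pulses, and the neighbour map is supplied by the caller. The ring layout in the example and all names are hypothetical and do not reflect the real detector geometry.

```python
from typing import Dict, List, Set


def triggered_channels(amplitudes: Dict[int, int],
                       threshold: int,
                       neighbours: Dict[int, List[int]]) -> Set[int]:
    """Channels to acquire: those over threshold plus the neighbours they pulse."""
    self_triggered = {ch for ch, amp in amplitudes.items() if amp > threshold}
    # Assumption: a neighbour-triggered channel does not forward the pulse further.
    neighbour_triggered = {
        nb
        for ch in self_triggered
        for nb in neighbours.get(ch, [])
        if nb in amplitudes
    }
    return self_triggered | neighbour_triggered


if __name__ == "__main__":
    # Six channels arranged in a ring, each with two neighbours (illustrative only).
    ring = {ch: [(ch - 1) % 6, (ch + 1) % 6] for ch in range(6)}
    amplitudes = {0: 5, 1: 50, 2: 7, 3: 3, 4: 2, 5: 1}
    print(sorted(triggered_channels(amplitudes, threshold=20, neighbours=ring)))
```

Restricting the neighbour lists to a fixed size plays the role of the seven, thirteen or nineteen channel regions mentioned in the abstract.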
Milde, Moritz B.; Blum, Hermann; Dietmüller, Alexander; Sumislawska, Dora; Conradt, Jörg; Indiveri, Giacomo; Sandamirskaya, Yulia
2017-01-01
Neuromorphic hardware emulates dynamics of biological neural networks in electronic circuits offering an alternative to the von Neumann computing architecture that is low-power, inherently parallel, and event-driven. This hardware allows neural-network-based robotic controllers to be implemented in an energy-efficient way with low latency, but requires solving the problem of device variability, which is characteristic of analog electronic circuits. In this work, we interfaced a mixed-signal analog-digital neuromorphic processor ROLLS to a neuromorphic dynamic vision sensor (DVS) mounted on a robotic vehicle and developed an autonomous neuromorphic agent that is able to perform neurally inspired obstacle-avoidance and target acquisition. We developed a neural network architecture that can cope with device variability and verified its robustness in different environmental situations, e.g., moving obstacles, moving target, clutter, and poor light conditions. We demonstrate how this network, combined with the properties of the DVS, allows the robot to avoid obstacles using simple biologically-inspired dynamics. We also show how a Dynamic Neural Field for target acquisition can be implemented in spiking neuromorphic hardware. This work demonstrates an implementation of working obstacle avoidance and target acquisition using mixed signal analog/digital neuromorphic hardware. PMID:28747883
Park, Sunmin; Buck, Michael D.; Desai, Chandni; Zhang, Xin; Loginicheva, Ekaterina; Martinez, Jennifer; Freeman, Michael L.; Saitoh, Tatsuya; Akira, Shizuo; Guan, Jun-Lin; He, You-Wen; Blackman, Marcia A.; Handley, Scott A.; Levine, Beth; Green, Douglas R.; Reese, Tiffany A.; Artyomov, Maxim N.; Virgin, Herbert W.
2016-01-01
SUMMARY Host genes that regulate systemic inflammation upon chronic viral infection are incompletely understood. Murine γ-herpesvirus 68 (MHV68) infection is characterized by latency in macrophages, and reactivation is inhibited by Interferon-γ (IFN-γ). Using a Lysozyme-M-cre (LysMcre) expression system, we show that deletion of autophagy-related (Atg) genes Fip200, beclin 1, Atg14, Atg16L1, Atg7, Atg3, and Atg5, in the myeloid compartment, inhibited MHV68 reactivation in macrophages. Atg5-deficiency did not alter reactivation from B cells, and effects on reactivation from macrophages were not explained by alterations in productive viral replication or the establishment of latency. Rather, chronic MHV68 infection triggered increased systemic inflammation, increased T cell production of IFN-γ and an IFN-γ-induced transcriptional signature in macrophages from Atg gene-deficient mice. The Atg5-related reactivation defect was partially reversed by neutralization of IFN-γ. Thus Atg genes in myeloid cells dampen virus-induced systemic inflammation, creating an environment that fosters efficient MHV68 reactivation from latency. PMID:26764599
On the Floating Point Performance of the i860 Microprocessor
NASA Technical Reports Server (NTRS)
Lee, King; Kutler, Paul (Technical Monitor)
1997-01-01
The i860 microprocessor is a pipelined processor that can deliver two double precision floating point results every clock. It is being used in the Touchstone project to develop a teraflop computer by the year 2000. With such high computational capabilities it was expected that memory bandwidth would limit performance on many kernels. Measured performance of three kernels showed performance is less than what memory bandwidth limitations would predict. This paper develops a model that explains the discrepancy in terms of memory latencies and points to some problems involved in moving data from memory to the arithmetic pipelines.
Vision systems for manned and robotic ground vehicles
NASA Astrophysics Data System (ADS)
Sanders-Reed, John N.; Koon, Phillip L.
2010-04-01
A Distributed Aperture Vision System for ground vehicles is described. An overview of the hardware including sensor pod, processor, video compression, and displays is provided. This includes a discussion of the choice between an integrated sensor pod and individually mounted sensors, open architecture design, and latency issues as well as flat panel versus head mounted displays. This technology is applied to various ground vehicle scenarios, including closed-hatch operations (operator in the vehicle), remote operator tele-operation, and supervised autonomy for multi-vehicle unmanned convoys. In addition, remote vision for automatic perimeter surveillance using autonomous vehicles and automatic detection algorithms is demonstrated.
Design and scheduling for periodic concurrent error detection and recovery in processor arrays
NASA Technical Reports Server (NTRS)
Wang, Yi-Min; Chung, Pi-Yu; Fuchs, W. Kent
1992-01-01
Periodic application of time-redundant error checking provides the trade-off between error detection latency and performance degradation. The goal is to achieve high error coverage while satisfying performance requirements. We derive the optimal scheduling of checking patterns in order to uniformly distribute the available checking capability and maximize the error coverage. Synchronous buffering designs using data forwarding and dynamic reconfiguration are described. Efficient single-cycle diagnosis is implemented by error pattern analysis and direct-mapped recovery cache. A rollback recovery scheme using start-up control for local recovery is also presented.
Blood pressure control with selective vagal nerve stimulation and minimal side effects
NASA Astrophysics Data System (ADS)
Plachta, Dennis T. T.; Gierthmuehlen, Mortimer; Cota, Oscar; Espinosa, Nayeli; Boeser, Fabian; Herrera, Taliana C.; Stieglitz, Thomas; Zentner, Joseph
2014-06-01
Objective. Hypertension is the largest threat to patient health and a burden to health care systems. Despite various options, 30% of patients do not respond sufficiently to medical treatment. Mechanoreceptors in the aortic arch relay blood pressure (BP) levels through vagal nerve (VN) fibers to the brainstem and trigger the baroreflex, lowering the BP. Selective electrical stimulation of these nerve fibers reduced BP in rats. However, there is no technique described to localize and stimulate these fibers inside the VN without inadvertent stimulation of non-baroreceptive fibers causing side effects like bradycardia and bradypnea. Approach. We present a novel method for selective VN stimulation to reduce BP without the aforementioned side effects. Baroreceptor compound activity of rat VN (n = 5) was localized using a multichannel cuff electrode, true tripolar recording and a coherent averaging algorithm triggered by BP or electrocardiogram. Main results. Tripolar stimulation over electrodes near the barofibers reduced the BP without triggering significant bradycardia and bradypnea. The BP drop was adjusted to 60% of the initial value by varying the stimulation pulse width and duration, and lasted up to five times longer than the stimulation. Significance. The presented method is robust to impedance changes, independent of the electrode's relative position, does not compromise the nerve and can run on implantable, ultra-low power signal processors.
Construction and Evaluation of an Ultra Low Latency Frameless Renderer for VR.
Friston, Sebastian; Steed, Anthony; Tilbury, Simon; Gaydadjiev, Georgi
2016-04-01
Latency - the delay between a user's action and the response to this action - is known to be detrimental to virtual reality. Latency is typically considered to be a discrete value characterising a delay, constant in time and space - but this characterisation is incomplete. Latency changes across the display during scan-out, and how it does so is dependent on the rendering approach used. In this study, we present an ultra-low latency real-time ray-casting renderer for virtual reality, implemented on an FPGA. Our renderer has a latency of ~1 ms from 'tracker to pixel'. Its frameless nature means that the region of the display with the lowest latency immediately follows the scan-beam. This is in contrast to frame-based systems such as those using typical GPUs, for which the latency increases as scan-out proceeds. Using a series of high and low speed videos of our system in use, we confirm its latency of ~1 ms. We examine how the renderer performs when driving a traditional sequential scan-out display on a readily available HMD, the Oculus Rift DK2. We contrast this with an equivalent apparatus built using a GPU. Using captured human head motion and a set of image quality measures, we assess the ability of these systems to faithfully recreate the stimuli of an ideal virtual reality system - one with a zero latency tracker, renderer and display running at 1 kHz. Finally, we examine the results of these quality measures, and how each rendering approach is affected by velocity of movement and display persistence. We find that our system, with a lower average latency, can more faithfully draw what the ideal virtual reality system would. Further, we find that with low display persistence, the sensitivity to velocity of both systems is lowered, but that it is much lower for ours.
Guillain-Barré syndrome following chickenpox: a case series.
Tatarelli, P; Garnero, M; Del Bono, V; Camera, M; Schenone, A; Grandis, M; Benedetti, L; Viscoli, C
2016-01-01
Guillain-Barré syndrome (GBS) is an acute, immune-mediated polyradiculoneuropathy, usually triggered by an infectious episode, mostly of viral origin. Varicella zoster virus (VZV) is a rare cause of GBS, mainly in the case of latent infection reactivation. We report on three adult patients who developed GBS following chickenpox, after a short period of latency. They were promptly treated with intravenous immunoglobulin, and the first one with plasma exchange additionally. All the patients experienced almost complete clinical recovery. Our experience suggests that primary VZV infection constitutes a GBS triggering event.
Initial Performance Results on IBM POWER6
NASA Technical Reports Server (NTRS)
Saini, Subbash; Talcott, Dale; Jespersen, Dennis; Djomehri, Jahed; Jin, Haoqiang; Mehrotra, Piysuh
2008-01-01
The POWER5+ processor has a faster memory bus than that of the previous generation POWER5 processor (533 MHz vs. 400 MHz), but the measured per-core memory bandwidth of the latter is better than that of the former (5.7 GB/s vs. 4.3 GB/s). The reason for this is that in the POWER5+, the two cores on the chip share the L2 cache, L3 cache and memory bus. The memory controller is also on the chip and is shared by the two cores. This serializes the path to memory. For consistently good performance on a wide range of applications, the performance of the processor, the memory subsystem, and the interconnects (both latency and bandwidth) should be balanced. Recognizing this, IBM has designed the POWER6 processor so as to avoid the bottlenecks due to the L2 cache, memory controller and buffer chips of the POWER5+. Unlike the POWER5+, each core in the POWER6 has its own L2 cache (4 MB - double that of the POWER5+), memory controller and buffer chips. Each core in the POWER6 runs at 4.7 GHz instead of 1.9 GHz in POWER5+. In this paper, we evaluate the performance of a dual-core POWER6 based IBM p6-570 system, and we compare its performance with that of a dual-core POWER5+ based IBM p575+ system. In this evaluation, we have used the High-Performance Computing Challenge (HPCC) benchmarks, NAS Parallel Benchmarks (NPB), and four real-world applications--three from computational fluid dynamics and one from climate modeling.
Latency in Distributed Acquisition and Rendering for Telepresence Systems.
Ohl, Stephan; Willert, Malte; Staadt, Oliver
2015-12-01
Telepresence systems use 3D techniques to create a more natural human-centered communication over long distances. This work concentrates on the analysis of latency in telepresence systems where acquisition and rendering are distributed. Keeping latency low is important to immerse users in the virtual environment. To better understand latency problems and to identify the source of such latency, we focus on the decomposition of system latency into sub-latencies. We contribute a model of latency and show how it can be used to estimate latencies in a complex telepresence dataflow network. To compare the estimates with real latencies in our prototype, we modify two common latency measurement methods. This presented methodology enables the developer to optimize the design, find implementation issues and gain deeper knowledge about specific sources of latency.
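The decomposition of system latency into sub-latencies described above can be illustrated with a small sketch. It is only one plausible reading, not the authors' model: it assumes the dataflow network is a directed acyclic graph, that each stage contributes a fixed latency, and that the end-to-end latency is the longest summed path to the display. All stage names and millisecond values are hypothetical.

```python
from functools import lru_cache

def end_to_end_latency(stage_ms, edges, sink):
    """Longest path of summed per-stage latencies (ms) ending at `sink`.

    stage_ms: dict mapping stage name -> latency in ms (hypothetical values)
    edges:    list of (source, destination) pairs forming a DAG
    """
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(dst, []).append(src)

    @lru_cache(maxsize=None)
    def finish_time(node):
        # A stage can only finish after its slowest upstream branch has finished.
        upstream = max((finish_time(p) for p in predecessors.get(node, [])),
                       default=0.0)
        return upstream + stage_ms[node]

    return finish_time(sink)

# Hypothetical telepresence pipeline: two cameras feed one reconstruction stage.
stage_ms = {
    "capture_a": 33.0, "capture_b": 33.0, "reconstruct": 20.0,
    "network": 15.0, "render": 16.0, "display": 8.0,
}
edges = [
    ("capture_a", "reconstruct"), ("capture_b", "reconstruct"),
    ("reconstruct", "network"), ("network", "render"), ("render", "display"),
]
total_ms = end_to_end_latency(stage_ms, edges, "display")
print(f"estimated end-to-end latency: {total_ms} ms")
```

Comparing such an estimate with a measured end-to-end value is one way to spot a stage whose assumed sub-latency is wrong.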
Real-time phase correlation based integrated system for seizure detection
NASA Astrophysics Data System (ADS)
Romaine, James B.; Delgado-Restituto, Manuel; Leñero-Bardallo, Juan A.; Rodríguez-Vázquez, Ángel
2017-05-01
This paper reports a low area, low power, integer-based digital processor for the calculation of phase synchronization between two neural signals. The processor calculates the phase-frequency content of a signal by identifying the specific time periods associated with two consecutive minima. The simplicity of this phase-frequency content identifier allows for the digital processor to utilize only basic digital blocks, such as registers, counters, adders and subtractors, without incorporating any complex multiplication and/or division algorithms. In fact, the processor, fabricated in a 0.18μm CMOS process, only occupies an area of 0.0625μm2 and consumes 12.5nW from a 1.2V supply voltage when operated at 128kHz. These low-area, low-power features make the proposed processor a valuable computing element in closed-loop neural prostheses for the treatment of neural diseases, such as epilepsy, or for extracting functional connectivity maps between different recording sites in the brain.
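The idea of estimating periods from the spacing of consecutive minima can be sketched in a few lines. This is a minimal software illustration of that single step, using only comparisons and integer subtraction; it is not the fabricated circuit, and the function names and test waveform are hypothetical.

```python
def minima_indices(samples):
    """Sample indices where the signal stops falling (local minima)."""
    return [
        i for i in range(1, len(samples) - 1)
        if samples[i - 1] > samples[i] and samples[i] <= samples[i + 1]
    ]

def periods_between_minima(samples):
    """Number of samples between each pair of consecutive minima."""
    idx = minima_indices(samples)
    # Only integer subtraction is needed, mirroring a counter-based design.
    return [later - earlier for earlier, later in zip(idx, idx[1:])]

# Hypothetical integer triangle wave with a period of 8 samples.
signal = [4, 3, 2, 1, 0, 1, 2, 3] * 3
periods = periods_between_minima(signal)
print(periods)
```

For two signals, comparing the positions of their minima relative to these periods would give a phase offset; that step is left out here.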
Random access with adaptive packet aggregation in LTE/LTE-A.
Zhou, Kaijie; Nikaein, Navid
While random access presents a promising solution for efficient uplink channel access, the preamble collision rate can significantly increase when a massive number of devices simultaneously access the channel. To address this issue and improve the reliability of the random access, an adaptive packet aggregation method is proposed. With the proposed method, a device does not trigger a random access for every single packet. Instead, it starts a random access when the number of aggregated packets reaches a given threshold. This method reduces the packet collision rate at the expense of an extra latency, which is used to accumulate multiple packets into a single transmission unit. Therefore, the tradeoff between packet loss rate and channel access latency has to be carefully selected. We use a semi-Markov model to derive the packet loss rate and channel access latency as functions of packet aggregation number. Hence, the optimal amount of aggregated packets can be found, which keeps the loss rate below the desired value while minimizing the access latency. We also apply the idea of packet aggregation to power saving, where a device aggregates as many packets as possible until the latency constraint is reached. Simulations are carried out to evaluate our methods. We find that the packet loss rate and/or power consumption are significantly reduced with the proposed method.
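The latency cost of aggregation can be made concrete with back-of-the-envelope arithmetic. The sketch below is not the semi-Markov model of the abstract and ignores collisions entirely; it only assumes that packets arrive with a mean inter-arrival time of 1/rate and that a random access starts once `threshold` packets are queued. The function name and the example numbers are hypothetical.

```python
def aggregation_tradeoff(arrival_rate_hz, threshold):
    """Return (access attempts per second, wait of first packet, mean wait).

    Waits are the expected accumulation delays in seconds, before any
    channel-access delay is added.
    """
    if arrival_rate_hz <= 0 or threshold < 1:
        raise ValueError("rate must be positive and threshold at least 1")
    # One random access per `threshold` packets instead of one per packet.
    access_rate_hz = arrival_rate_hz / threshold
    # The first packet waits for threshold - 1 further arrivals.
    first_packet_wait_s = (threshold - 1) / arrival_rate_hz
    # Packet k of N waits for N - k arrivals; the average over k is (N - 1) / 2.
    mean_wait_s = (threshold - 1) / (2 * arrival_rate_hz)
    return access_rate_hz, first_packet_wait_s, mean_wait_s

# Hypothetical device: 10 packets per second, aggregating 5 packets per access.
access_rate, first_wait, mean_wait = aggregation_tradeoff(10.0, 5)
print(access_rate, first_wait, mean_wait)
```

Raising the threshold lowers the number of access attempts in proportion but raises the accumulation delay linearly, which is the tradeoff the abstract optimizes.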
NASA Astrophysics Data System (ADS)
Takeda, Sawako; Tashiro, Makoto S.; Ishisaki, Yoshitaka; Tsujimoto, Masahiro; Seta, Hiromi; Shimoda, Yuya; Yamaguchi, Sunao; Uehara, Sho; Terada, Yukikatsu; Fujimoto, Ryuichi; Mitsuda, Kazuhisa
2014-07-01
The soft X-ray spectrometer (SXS) aboard ASTRO-H is equipped with dedicated digital signal processing units called pulse shape processors (PSPs). The X-ray microcalorimeter system SXS has 36 sensor pixels, which are operated at 50 mK to measure heat input of X-ray photons and realize an energy resolution of 7 eV FWHM in the range 0.3-12.0 keV. Front-end signal processing electronics are used to filter and amplify the electrical pulse output from the sensor and for analog-to-digital conversion. The digitized pulses from the 36 pixels are multiplexed and are sent to the PSP over low-voltage differential signaling lines. Each of two identical PSP units consists of an FPGA board, which assists the hardware logic, and two CPU boards, which assist the onboard software. The FPGA board triggers at every pixel event and stores the triggering information as a pulse waveform in the installed memory. The CPU boards read the event data to evaluate pulse heights by an optimal filtering algorithm. The evaluated X-ray photon data (including the pixel ID, energy, and arrival time information) are transferred to the satellite data recorder along with event quality information. The PSP units have been developed and tested with the engineering model (EM) and the flight model. Utilizing the EM PSP, we successfully verified the entire hardware system and the basic software design of the PSPs, including their communication capability and signal processing performance. In this paper, we show the key metrics of the EM test, such as accuracy and synchronicity of sampling clocks, event grading capability, and resultant energy resolution.
Industrial WSN Based on IR-UWB and a Low-Latency MAC Protocol
NASA Astrophysics Data System (ADS)
Reinhold, Rafael; Underberg, Lisa; Wulf, Armin; Kays, Ruediger
2016-07-01
Wireless sensor networks for industrial communication require high reliability and low latency. As current wireless sensor networks do not entirely meet these requirements, novel system approaches need to be developed. Since ultra wideband communication systems seem to be a promising approach, this paper evaluates the performance of the IEEE 802.15.4 impulse-radio ultra-wideband physical layer and the IEEE 802.15.4 Low Latency Deterministic Network (LLDN) MAC for industrial applications. Novel approaches and system adaptions are proposed to meet the application requirements. In this regard, a synchronization approach based on circular average magnitude difference functions (CAMDF) and on a clean template (CT) is presented for the correlation receiver. An adapted MAC protocol titled aggregated low latency (ALL) MAC is proposed to significantly reduce the resulting latency. Based on the system proposals, a hardware prototype has been developed, which proves the feasibility of the system and visualizes the real-time performance of the MAC protocol.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, K.; Chen, H.; Wu, W.
In the upgrade of the ATLAS experiment, the front-end electronics components are subjected to a large radiation background. Meanwhile, high-speed optical links are required for data transmission between the on-detector and off-detector electronics. The GBT architecture and the Versatile Link (VL) project were designed by CERN to support bidirectional high-speed data transmission at a 4.8 Gbps line rate, which is called the GBT link. In the ATLAS upgrade, besides the link with the on-detector electronics, the GBT link is also used between different off-detector systems. The GBTX ASIC is designed for the on-detector front-end; correspondingly, for the off-detector electronics, the GBT architecture is implemented in Field Programmable Gate Arrays (FPGA). CERN launched the GBT-FPGA project to provide examples in different types of FPGA. In the ATLAS upgrade framework, the Front-End LInk eXchange (FELIX) system is used to interface the front-end electronics of several ATLAS subsystems. The GBT link is used between them to transfer the detector data and the timing, trigger, control and monitoring information. The trigger signal distributed in the down-link from FELIX to the front-end requires a fixed and low latency. In this paper, several optimizations of the GBT-FPGA IP core are introduced to achieve a lower fixed latency. For FELIX, a common firmware will be used to interface different front-ends with support for both GBT modes: the forward error correction mode and the wide mode. The modified GBT-FPGA core has the ability to switch between the GBT modes without FPGA reprogramming. Finally, the system clock distribution of the multi-channel FELIX firmware is also discussed in this paper.
Hirschauer, Thomas J; Buford, John A
2015-04-01
Neurons in the pontomedullary reticular formation (PMRF) give rise to the reticulospinal tract. The motor output of the PMRF was investigated using stimulus-triggered averaging of electromyography (EMG) and force recordings in two monkeys (M. fascicularis). EMG was recorded from 12 pairs of upper limb muscles, and forces were detected using two isometric force-sensitive handles. Of 150 stimulation sites, 105 (70.0%) produced significant force responses, and 139 (92.5%) produced significant EMG responses. Based on the average flexor EMG onset latency of 8.3 ms and average force onset latency of 15.9 ms poststimulation, an electromechanical delay of ∼7.6 ms was calculated. The magnitude of force responses (∼10 mN) was correlated with the average change in EMG activity (P < 0.001). A multivariate linear regression analysis was used to estimate the contribution of each muscle to force generation, with flexors and extensors exhibiting antagonistic effects. A predominant force output pattern of ipsilateral flexion and contralateral extension was observed in response to PMRF stimulation, with 65.3% of significant ipsilateral force responses directed medially and posteriorly (P < 0.001) and 78.6% of contralateral responses directed laterally and anteriorly (P < 0.001). This novel approach permits direct measurement of force outputs evoked by central nervous system microstimulation. Despite the small magnitude of poststimulus EMG effects, low-intensity single-pulse microstimulation of the PMRF evoked detectable forces. The forces, showing the combined effect of all muscle activity in the arms, are consistent with reciprocal pattern of force outputs from the PMRF detectable with stimulus-triggered averaging of EMG. Copyright © 2015 the American Physiological Society.
Reconstructing the calibrated strain signal in the Advanced LIGO detectors
NASA Astrophysics Data System (ADS)
Viets, A. D.; Wade, M.; Urban, A. L.; Kandhasamy, S.; Betzwieser, J.; Brown, Duncan A.; Burguet-Castell, J.; Cahillane, C.; Goetz, E.; Izumi, K.; Karki, S.; Kissel, J. S.; Mendell, G.; Savage, R. L.; Siemens, X.; Tuyenbayev, D.; Weinstein, A. J.
2018-05-01
Advanced LIGO’s raw detector output needs to be calibrated to compute dimensionless strain h(t). Calibrated strain data is produced in the time domain using both a low-latency, online procedure and a high-latency, offline procedure. The low-latency h(t) data stream is produced in two stages, the first of which is performed on the same computers that operate the detector’s feedback control system. This stage, referred to as the front-end calibration, uses infinite impulse response (IIR) filtering and performs all operations at a 16 384 Hz digital sampling rate. Due to several limitations, this procedure currently introduces certain systematic errors in the calibrated strain data, motivating the second stage of the low-latency procedure, known as the low-latency gstlal calibration pipeline. The gstlal calibration pipeline uses finite impulse response (FIR) filtering to apply corrections to the output of the front-end calibration. It applies time-dependent correction factors to the sensing and actuation components of the calibrated strain to reduce systematic errors. The gstlal calibration pipeline is also used in high latency to recalibrate the data, which is necessary due mainly to online dropouts in the calibrated data and identified improvements to the calibration models or filters.
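For readers unfamiliar with FIR filtering, the operation named above is a finite weighted sum of recent input samples. The sketch below shows that generic operation only; it is not the gstlal calibration pipeline, and the taps and input samples are hypothetical placeholders rather than calibration filters.

```python
def fir_filter(samples, taps):
    """Causal FIR filter: y[n] = sum over k of taps[k] * samples[n - k]."""
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += tap * samples[n - k]
        output.append(acc)
    return output

# Hypothetical two-tap averaging filter applied to a constant input.
filtered = fir_filter([1.0, 1.0, 1.0, 1.0], [0.5, 0.5])
print(filtered)
```

Because the output depends on a fixed, finite number of past samples, the delay such a filter adds is bounded by its length, which matters for a low-latency data stream.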
Evaluation of an Adaptive Automation Trigger Based on Task Performance, Priority, and Frequency
2013-06-01
with dual Intel ® Xeon ® CPU x5550 processors @ 2.67 GHz each, 12.0 GB RAM, and a 1.5 GB PCIe nVidia Quadro FX 4800 graphics card (Microsoft...Cole Publishing Company . Miller, C. A., & Parasuraman, R. (2007). Designing for flexible interaction between humans and automation: Delegation
NASA Astrophysics Data System (ADS)
Ying, Kai; Kowalski, John M.; Nogami, Toshizo; Yin, Zhanping; Sheng, Jia
2018-01-01
5G systems are supposed to support coexistence of multiple services such as ultra reliable low latency communications (URLLC) and enhanced mobile broadband (eMBB) communications. The target of eMBB communications is to meet the high-throughput requirement while URLLC are used for some high priority services. Due to the sporadic nature and low latency requirement, URLLC transmission may pre-empt the resource of eMBB transmission. Our work is to analyze the URLLC impact on eMBB transmission in mobile front-haul. Then, some solutions are proposed to guarantee the reliability/latency requirements for URLLC services and minimize the impact to eMBB services at the same time.
Precision Quantum Control and Error-Suppressing Quantum Firmware for Robust Quantum Computing
2014-09-24
Biercuk, Lorenza Viola. Long-time Low-latency Quantum Memory by Dynamical Decoupling, arXiv:1206.6087v1 (06 2012) L. Viola, G. A. Paz-Silva. A...International Patent Application (PCT/AU2013/000649) D. Hayes, K. Khodjasteh L. Viola, M.J. Biercuk, “Long-time low-latency quantum memory by dynamical...Khodjasteh L. Viola, M.J. Biercuk, University of Sydney A28 Physics Road Sydney NS 2006 Long-time low-latency quantum memory by dynamical decoupling
SpaceCubeX: A Framework for Evaluating Hybrid Multi-Core CPU FPGA DSP Architectures
NASA Technical Reports Server (NTRS)
Schmidt, Andrew G.; Weisz, Gabriel; French, Matthew; Flatley, Thomas; Villalpando, Carlos Y.
2017-01-01
The SpaceCubeX project is motivated by the need for high performance, modular, and scalable on-board processing to help scientists answer critical 21st century questions about global climate change, air quality, ocean health, and ecosystem dynamics, while adding new capabilities such as low-latency data products for extreme event warnings. These goals translate into on-board processing throughput requirements that are on the order of 100-1,000 times more than those of previous Earth Science missions for standard processing, compression, storage, and downlink operations. To study possible future architectures to achieve these performance requirements, the SpaceCubeX project provides an evolvable testbed and framework that enables a focused design space exploration of candidate hybrid CPU/FPGA/DSP processing architectures. The framework includes ArchGen, an architecture generator tool populated with candidate architecture components, performance models, and IP cores, that allows an end user to specify the type, number, and connectivity of a hybrid architecture. The framework requires minimal extensions to integrate new processors, such as the anticipated High Performance Spaceflight Computer (HPSC), reducing time to initiate benchmarking by months. To evaluate the framework, we leverage a wide suite of high performance embedded computing benchmarks and Earth science scenarios to ensure robust architecture characterization. We report on our project's Year 1 efforts and demonstrate the capabilities across four simulation testbed models: a baseline SpaceCube 2.0 system, a dual ARM A9 processor system, a hybrid quad ARM A53 and FPGA system, and a hybrid quad ARM A53 and DSP system.
Fault and Error Latency Under Real Workload: an Experimental Study. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Chillarege, Ram
1986-01-01
A practical methodology for the study of fault and error latency is demonstrated under a real workload. This is the first study that measures and quantifies the latency under real workload and fills a major gap in the current understanding of workload-failure relationships. The methodology is based on low level data gathered on a VAX 11/780 during the normal workload conditions of the installation. Fault occurrence is simulated on the data, and the error generation and discovery process is reconstructed to determine latency. The analysis proceeds to combine the low level activity data with high level machine performance data to yield a better understanding of the phenomena. A strong relationship exists between latency and workload and that relationship is quantified. The sampling and reconstruction techniques used are also validated. Error latency in the memory where the operating system resides was studied using data on the physical memory access. Fault latency in the paged section of memory was determined using data from physical memory scans. Error latency in the microcontrol store was studied using data on the microcode access and usage.
Flood replenishment: a new method of processor control.
Frank, E D; Gray, J E; Wilken, D A
1980-01-01
In mechanized radiographic film processors that process medium to low volumes of film, roll films, and those that process single-emulsion films from nuclear medicine scans, computed tomography, and ultrasound, it is difficult to maintain the developer solution at a stable processing level. We describe our experience using flood replenishment, which is a method in which developer replenisher containing starter solution is introduced in the processor at timed intervals, independent of the number of films being processed. By this process, a stable level of developer activity is maintained in a processor used to develop a medium to low volume of single-emulsion film.
FPGA-based trigger system for the LUX dark matter experiment
NASA Astrophysics Data System (ADS)
Akerib, D. S.; Araújo, H. M.; Bai, X.; Bailey, A. J.; Balajthy, J.; Beltrame, P.; Bernard, E. P.; Bernstein, A.; Biesiadzinski, T. P.; Boulton, E. M.; Bradley, A.; Bramante, R.; Cahn, S. B.; Carmona-Benitez, M. C.; Chan, C.; Chapman, J. J.; Chiller, A. A.; Chiller, C.; Currie, A.; Cutter, J. E.; Davison, T. J. R.; de Viveiros, L.; Dobi, A.; Dobson, J. E. Y.; Druszkiewicz, E.; Edwards, B. N.; Faham, C. H.; Fiorucci, S.; Gaitskell, R. J.; Gehman, V. M.; Ghag, C.; Gibson, K. R.; Gilchriese, M. G. D.; Hall, C. R.; Hanhardt, M.; Haselschwardt, S. J.; Hertel, S. A.; Hogan, D. P.; Horn, M.; Huang, D. Q.; Ignarra, C. M.; Ihm, M.; Jacobsen, R. G.; Ji, W.; Kazkaz, K.; Khaitan, D.; Knoche, R.; Larsen, N. A.; Lee, C.; Lenardo, B. G.; Lesko, K. T.; Lindote, A.; Lopes, M. I.; Malling, D. C.; Manalaysay, A. G.; Mannino, R. L.; Marzioni, M. F.; McKinsey, D. N.; Mei, D.-M.; Mock, J.; Moongweluwan, M.; Morad, J. A.; Murphy, A. St. J.; Nehrkorn, C.; Nelson, H. N.; Neves, F.; O`Sullivan, K.; Oliver-Mallory, K. C.; Ott, R. A.; Palladino, K. J.; Pangilinan, M.; Pease, E. K.; Phelps, P.; Reichhart, L.; Rhyne, C.; Shaw, S.; Shutt, T. A.; Silva, C.; Skulski, W.; Solovov, V. N.; Sorensen, P.; Stephenson, S.; Sumner, T. J.; Szydagis, M.; Taylor, D. J.; Taylor, W.; Tennyson, B. P.; Terman, P. A.; Tiedt, D. R.; To, W. H.; Tripathi, M.; Tvrznikova, L.; Uvarov, S.; Verbus, J. R.; Webb, R. C.; White, J. T.; Whitis, T. J.; Witherell, M. S.; Wolfs, F. L. H.; Yin, J.; Young, S. K.; Zhang, C.
2016-05-01
LUX is a two-phase (liquid/gas) xenon time projection chamber designed to detect nuclear recoils resulting from interactions with dark matter particles. Signals from the detector are processed with an FPGA-based digital trigger system that analyzes the incoming data in real time, with a latency of just a few microseconds. The system enables first-pass selection of events of interest based on their pulse shape characteristics and 3D localization of the interactions. It has been shown to be > 99 % efficient in triggering on S2 signals induced by only a few extracted liquid electrons. It has been operating continuously and reliably since its full underground deployment in early 2013. This document is an overview of the system's capabilities, its inner workings, and its performance.
Kim, Hyoji; Choi, Hoyun; Lee, Suk Kyeong
2016-02-01
Epstein-Barr virus (EBV) is a human gammaherpesvirus associated with a variety of tumor types. EBV can establish latency or undergo lytic replication in host cells. In general, EBV remains latent in tumors and expresses a limited repertoire of latent proteins to avoid host immune surveillance. When the lytic cycle is triggered by some as-yet-unknown form of stimulation, lytic gene expression and progeny virus production commence. Thus far, the exact mechanism of EBV latency maintenance and the in vivo triggering signal for lytic induction have yet to be elucidated. Previously, we have shown that the EBV microRNA miR-BART20-5p directly targets the immediate early genes BRLF1 and BZLF1 as well as Bcl-2-associated death promoter (BAD) in EBV-associated gastric carcinoma. In this study, we found that both mRNA and protein levels of BRLF1 and BZLF1 were suppressed in cells following BAD knockdown and increased after BAD overexpression. Progeny virus production was also downregulated by specific knockdown of BAD. Our results demonstrated that caspase-3-dependent apoptosis is a prerequisite for BAD-mediated EBV lytic cycle induction. Therefore, our data suggest that miR-BART20-5p plays an important role in latency maintenance and tumor persistence of EBV-associated gastric carcinoma by inhibiting BAD-mediated caspase-3-dependent apoptosis, which would trigger immediate early gene expression. EBV has an ability to remain latent in host cells, including EBV-associated tumor cells hiding from immune surveillance. However, the exact molecular mechanisms of EBV latency maintenance remain poorly understood. Here, we demonstrated that miR-BART20-5p inhibited the expression of EBV immediate early genes indirectly, by suppressing BAD-induced caspase-3-dependent apoptosis, in addition to directly, as we previously reported. Our study suggests that EBV-associated tumor cells might endure apoptotic stress to some extent and remain latent with the aid of miR-BART20-5p. 
Blocking the expression or function of BART20-5p may expedite EBV-associated tumor cell death via immune attack and apoptosis. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Effect of squatting velocity on hip muscle latency in women with patellofemoral pain syndrome.
Orozco-Chavez, Ignacio; Mendez-Rebolledo, Guillermo
2018-03-01
[Purpose] Neuromuscular activity has been evaluated in patellofemoral pain syndrome but movement velocity has not been considered. The aim was to determine differences in onset latency of hip and knee muscles between individuals with and without patellofemoral pain syndrome during a single leg squat, and whether any differences are dependent on movement velocity. [Subjects and Methods] Twenty-four females with patellofemoral pain syndrome and 24 healthy females participated. Onset latency of gluteus maximus, anterior and posterior gluteus medius, rectus femoris, vastus medialis, vastus lateralis and biceps femoris during a single leg squat at high and low velocity were evaluated. [Results] There was an interaction between velocity and diagnosis for posterior gluteus medius. Healthy subjects showed a later posterior gluteus medius onset latency at low velocity than high velocity; and also later than patellofemoral pain syndrome subjects at low velocity and high velocity. [Conclusion] Patellofemoral pain syndrome subjects presented an altered latency of posterior gluteus medius during a single leg squat and did not generate adaptations to velocity variation, while healthy subjects presented an earlier onset latency in response to velocity increase.
Low-latency situational awareness for UxV platforms
NASA Astrophysics Data System (ADS)
Berends, David C.
2012-06-01
Providing high quality, low latency video from unmanned vehicles through bandwidth-limited communications channels remains a formidable challenge for modern vision system designers. SRI has developed a number of enabling technologies to address this, including the use of SWaP-optimized Systems-on-a-Chip which provide Multispectral Fusion and Contrast Enhancement as well as H.264 video compression. Further, the use of salience-based image prefiltering prior to image compression greatly reduces output video bandwidth by selectively blurring non-important scene regions. Combined with our customization of the VLC open source video viewer for low latency video decoding, SRI developed a prototype high performance, high quality vision system for UxV application in support of very demanding system latency requirements and user CONOPS.
NASA Technical Reports Server (NTRS)
Thronson, Harley A.; Valinia, Azita; Bleacher, Jacob; Eigenbrode, Jennifer; Garvin, Jim; Petro, Noah
2014-01-01
We summarize a proposed experiment to use the International Space Station to formally examine the application and validation of low-latency telepresence for surface exploration from space as an alternative, precursor, or potentially as an adjunct to astronaut "boots on the ground." The approach is to develop and propose controlled experiments, which build upon previous field studies and which will assess the effects of different latencies (0 to 500 msec), task complexity, and alternate forms of feedback to the operator. These experiments serve as an example of a pathfinder for NASA's roadmap of missions to Mars with low-latency telerobotic exploration as a precursor to astronauts landing on the surface to conduct geological tasks.
The Schultz MIDI Benchmarking Toolbox for MIDI interfaces, percussion pads, and sound cards.
Schultz, Benjamin G
2018-04-17
The Musical Instrument Digital Interface (MIDI) was readily adopted for auditory sensorimotor synchronization experiments. These experiments typically use MIDI percussion pads to collect responses, a MIDI-USB converter (or MIDI-PCI interface) to record responses on a PC and manipulate feedback, and an external MIDI sound module to generate auditory feedback. Previous studies have suggested that auditory feedback latencies can be introduced by these devices. The Schultz MIDI Benchmarking Toolbox (SMIDIBT) is an open-source, Arduino-based package designed to measure the point-to-point latencies incurred by several devices used in the generation of response-triggered auditory feedback. Experiment 1 showed that MIDI messages are sent and received within 1 ms (on average) in the absence of any external MIDI device. Latencies decreased when the baud rate increased above the MIDI protocol default (31,250 bps). Experiment 2 benchmarked the latencies introduced by different MIDI-USB and MIDI-PCI interfaces. MIDI-PCI was superior to MIDI-USB, primarily because MIDI-USB is subject to USB polling. Experiment 3 tested three MIDI percussion pads. Both the audio and MIDI message latencies were significantly greater than 1 ms for all devices, and there were significant differences between percussion pads and instrument patches. Experiment 4 benchmarked four MIDI sound modules. Audio latencies were significantly greater than 1 ms, and there were significant differences between sound modules and instrument patches. These experiments suggest that millisecond accuracy might not be achievable with MIDI devices. The SMIDIBT can be used to benchmark a range of MIDI devices, thus allowing researchers to make informed decisions when choosing testing materials and to arrive at an acceptable latency at their discretion.
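As a rough check on the 1 ms figure above, the serial transmission time of a single MIDI message can be worked out from the baud rate alone. The sketch below assumes the usual MIDI serial framing of 10 bits on the wire per byte (one start bit, eight data bits, one stop bit) and a three-byte Note On message; the function name and the 115,200 bps comparison rate are illustrative choices, not taken from the toolbox.

```python
# Serial framing assumed here: 1 start bit + 8 data bits + 1 stop bit per byte.
BITS_PER_BYTE_ON_WIRE = 10
MIDI_DEFAULT_BAUD = 31_250  # bits per second, the protocol default quoted above


def midi_wire_time_ms(n_bytes: int, baud: int = MIDI_DEFAULT_BAUD) -> float:
    """Time in milliseconds needed to clock n_bytes out of a serial MIDI port."""
    return n_bytes * BITS_PER_BYTE_ON_WIRE / baud * 1000.0


if __name__ == "__main__":
    # A Note On message is three bytes: status, note number, velocity.
    print(f"{midi_wire_time_ms(3):.2f} ms at the default baud rate")
    # Hypothetical faster link, only to show the trend with baud rate.
    print(f"{midi_wire_time_ms(3, baud=115_200):.2f} ms at 115,200 bps")
```

Under these assumptions a three-byte message occupies the wire for just under 1 ms at the default rate, which fits the observation that latencies fall as the baud rate rises.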
Whole-body vibration induces distinct reflex patterns in human soleus muscle.
Karacan, Ilhan; Cidem, Muharrem; Cidem, Mehmet; Türker, Kemal S
2017-06-01
The neuronal mechanisms underlying whole body vibration (WBV)-induced muscular reflex (WBV-IMR) are not well understood. To define a possible pathway for WBV-IMR, this study investigated the effects of WBV amplitude on WBV-IMR latency by surface electromyography analysis of the soleus muscle in human adult volunteers. The tendon (T) reflex was also induced to evaluate the level of presynaptic Ia inhibition during WBV. WBV-IMR latency was shorter when induced by low- as compared to medium- or high-amplitude WBV (33.9±5.3 ms vs. 43.8±3.6 and 44.1±4.2 ms, respectively). There was no difference in latencies between T-reflex elicited before WBV (33.8±2.4 ms) and WBV-IMR induced by low-amplitude WBV. Presynaptic Ia inhibition was absent during low-amplitude WBV but was present during medium- and high-amplitude WBV. Consequently, WBV induces short- or long-latency reflexes depending on the vibration amplitude. During low-amplitude WBV, muscle spindle activation may induce the short- but not the long-latency WBV-IMR. Furthermore, unlike the higher amplitude WBV, low-amplitude WBV does not induce presynaptic inhibition at the Ia synaptic terminals. Copyright © 2017 Elsevier Ltd. All rights reserved.
Earthquake Early Warning: New Strategies for Seismic Hardware
NASA Astrophysics Data System (ADS)
Allardice, S.; Hill, P.
2017-12-01
Implementing Earthquake Early Warning System (EEWS) triggering algorithms into seismic networks has been a hot topic of discussion for some years now. With digitizer technology now available, such as the Güralp Minimus, with on average 40-60 ms delay time (latency) from earthquake origin to issuing an alert, the next step is to provide network operators with a simple interface for on-board parameter calculations from a seismic station. A voting mechanism is implemented on board which mitigates the risk of false positives being communicated. Each Minimus can be configured with a 'score' from various sources, i.e. Z channel on seismometer, N/S E/W channels on accelerometer and MEMS inside Minimus. If the score exceeds the set threshold then an alert is sent to the 'Master Minimus'. The Master Minimus within the network will also be configured as to when the alert should be issued, i.e. at least 3 stations must have triggered. Industry standard algorithms focus around the calculation of Peak Ground Acceleration (PGA), Peak Ground Velocity (PGV), Peak Ground Displacement (PGD) and C. Calculating these single station parameters on-board in order to stream only the results could help network operators with possible issues, such as restricted bandwidth. Developments on the Minimus allow these parameters to be calculated and distributed through Common Alert Protocol (CAP). CAP is the XML based data format used for exchanging and describing public warnings and emergencies. Whenever the trigger conditions are met the Minimus can send a signed UDP packet to the configured CAP receiver which can then send the alert via SMS, e-mail or CAP forwarding. Increasing network redundancy is also a consideration when developing these features, therefore the forwarding CAP message can be sent to multiple destinations. 
This allows for a hierarchical approach by which the single station (or network) parameters can be streamed to another Minimus, or data centre, or both, so that there is no single point of failure. Developments on the Güralp Minimus that calculate these on-board parameters and stream single-station parameters, together with its ultra-low latency, represent the next generation of EEWS and Güralp's contribution to the community.
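The two-level voting described in this abstract (a per-station score compared with a threshold, then a minimum count of triggered stations at the master) can be sketched as follows. This is only an illustration of the logic as worded in the abstract, not Güralp's firmware or configuration format: the class, the source names, the score values and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Station:
    """One station with hypothetical per-source scores and a vote threshold."""
    name: str
    source_scores: dict[str, float]
    threshold: float

    def votes(self, triggered_sources: set[str]) -> bool:
        """True if the summed score of the triggered sources exceeds the threshold."""
        score = sum(self.source_scores.get(s, 0.0) for s in triggered_sources)
        return score > self.threshold


def master_should_alert(station_votes: dict[str, bool], min_stations: int = 3) -> bool:
    """Master rule: issue the alert once at least min_stations have voted."""
    return sum(station_votes.values()) >= min_stations


if __name__ == "__main__":
    # Illustrative weights only; real values would be set by the network operator.
    scores = {"seis_Z": 2.0, "accel_N": 1.0, "accel_E": 1.0, "mems": 1.0}
    stations = [Station(f"ST{i}", scores, threshold=2.5) for i in range(4)]
    observed = {
        "ST0": {"seis_Z", "accel_N"},
        "ST1": {"seis_Z", "mems"},
        "ST2": {"mems"},
        "ST3": {"seis_Z", "accel_E", "mems"},
    }
    votes = {s.name: s.votes(observed[s.name]) for s in stations}
    print(votes, "alert:", master_should_alert(votes))
```

Requiring agreement at both levels is what suppresses false positives in this scheme: a single noisy sensor cannot push a station over its threshold, and a single noisy station cannot trigger the network.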
Commercial Off-The-Shelf (COTS) Graphics Processing Board (GPB) Radiation Test Evaluation Report
NASA Technical Reports Server (NTRS)
Salazar, George A.; Steele, Glen F.
2013-01-01
Large round trip communications latency for deep space missions will require more onboard computational capabilities to enable the space vehicle to undertake many tasks that have traditionally been ground-based, mission control responsibilities. As a result, visual display graphics will be required to provide simpler vehicle situational awareness through graphical representations, as well as provide capabilities never before done in a space mission, such as augmented reality for in-flight maintenance or Telepresence activities. These capabilities will require graphics processors and associated support electronic components for high computational graphics processing. In an effort to understand the performance of commercial graphics card electronics operating in the expected radiation environment, a preliminary test was performed on five commercial off-the-shelf (COTS) graphics cards. This paper discusses the preliminary evaluation test results of five COTS graphics processing cards tested to the International Space Station (ISS) low earth orbit radiation environment. Three of the five graphics cards were tested to a total dose of 6000 rads (Si). The test articles, test configuration, preliminary results, and recommendations are discussed.
Systems-on-chip approach for real-time simulation of wheel-rail contact laws
NASA Astrophysics Data System (ADS)
Mei, T. X.; Zhou, Y. J.
2013-04-01
This paper presents the development of a systems-on-chip approach to speed up the simulation of wheel-rail contact laws, which can be used to reduce the requirement for high-performance computers and enable simulation in real time for the use of hardware-in-loop for experimental studies of the latest vehicle dynamic and control technologies. The wheel-rail contact laws are implemented using a field programmable gate array (FPGA) device with a design that substantially outperforms modern general-purpose PC platforms or fixed architecture digital signal processor devices in terms of processing time, configuration flexibility and cost. In order to utilise the FPGA's parallel-processing capability, the operations in the contact laws algorithms are arranged in a parallel manner and multi-contact patches are tackled simultaneously in the design. The interface between the FPGA device and the host PC is achieved by using a high-throughput and low-latency Ethernet link. The development is based on FASTSIM algorithms, although the design can be adapted and expanded for even more computationally demanding tasks.
Efficient implementation of a 3-dimensional ADI method on the iPSC/860
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van der Wijngaart, R.F.
1993-12-31
A comparison is made between several domain decomposition strategies for the solution of three-dimensional partial differential equations on a MIMD distributed memory parallel computer. The grids used are structured, and the numerical algorithm is ADI. Important implementation issues regarding load balancing, storage requirements, network latency, and overlap of computations and communications are discussed. Results of the solution of the three-dimensional heat equation on the Intel iPSC/860 are presented for the three most viable methods. It is found that the Bruno-Cappello decomposition delivers optimal computational speed through an almost complete elimination of processor idle time, while providing good memory efficiency.
Memory Network For Distributed Data Processors
NASA Technical Reports Server (NTRS)
Bolen, David; Jensen, Dean; Millard, ED; Robinson, Dave; Scanlon, George
1992-01-01
Universal Memory Network (UMN) is modular, digital data-communication system enabling computers with differing bus architectures to share 32-bit-wide data between locations up to 3 km apart with less than one millisecond of latency. Makes it possible to design sophisticated real-time and near-real-time data-processing systems without data-transfer "bottlenecks". This enterprise network permits transmission of volume of data equivalent to an encyclopedia each second. Facilities benefiting from Universal Memory Network include telemetry stations, simulation facilities, power-plants, and large laboratories or any facility sharing very large volumes of data. Main hub of UMN is reflection center including smaller hubs called Shared Memory Interfaces.
Single CA3 pyramidal cells trigger sharp waves in vitro by exciting interneurones.
Bazelot, Michaël; Teleńczuk, Maria T; Miles, Richard
2016-05-15
The CA3 hippocampal region generates sharp waves (SPW), a population activity associated with neuronal representations. The synaptic mechanisms responsible for the generation of these events still require clarification. Using slices maintained in an interface chamber, we found that the firing of single CA3 pyramidal cells triggers SPW-like events at short latencies, similar to those for the induction of firing in interneurons. Multi-electrode records from the CA3 stratum pyramidale showed that pyramidal cells triggered events consisting of putative interneuron spikes followed by field IPSPs. SPW fields consisted of a repetition of these events at intervals of 4-8 ms. Although many properties of induced and spontaneous SPWs were similar, the triggered events tended to be initiated close to the stimulated cell. These data show that the initiation of SPWs in vitro is mediated via pyramidal cell synapses that excite interneurons. They do not indicate why interneuron firing is repeated during a SPW. Sharp waves (SPWs) are a hippocampal population activity that has been linked to neuronal representations. We show that SPWs in the CA3 region of rat hippocampal slices can be triggered by the firing of single pyramidal cells. Single action potentials in almost one-third of pyramidal cells initiated SPWs at latencies of 2-5 ms with probabilities of 0.07-0.76. Initiating pyramidal cells evoked field IPSPs (fIPSPs) at similar latencies when SPWs were not initiated. Similar spatial profiles for fIPSPs and middle components of SPWs suggested that SPW fields reflect repeated fIPSPs. Multiple extracellular records showed that the initiated SPWs tended to start near the stimulated pyramidal cell, whereas spontaneous SPWs could emerge at multiple sites. Single pyramidal cells could initiate two to six field IPSPs with distinct amplitude distributions, typically preceded by a short-duration extracellular action potential. 
Comparison of these initiated fields with spontaneously occurring inhibitory field motifs allowed us to identify firing in different interneurones during the spread of SPWs. Propagation away from an initiating pyramidal cell was typically associated with the recruitment of interneurones and field IPSPs that were not activated by the stimulated pyramidal cell. SPW fields initiated by single cells were less variable than spontaneous events, suggesting that more stereotyped neuronal ensembles were activated, although neither the spatial profiles of fields, nor the identities of interneurone firing were identical for initiated events. The effects of single pyramidal cell on network events are thus mediated by different sequences of interneurone firing. © 2016 The Authors. The Journal of Physiology © 2016 The Physiological Society.
Introduction on performance analysis and profiling methodologies for KVM on ARM virtualization
NASA Astrophysics Data System (ADS)
Motakis, Antonios; Spyridakis, Alexander; Raho, Daniel
2013-05-01
The introduction of hardware virtualization extensions on ARM Cortex-A15 processors has enabled the implementation of full virtualization solutions for this architecture, such as KVM on ARM. This trend motivates the need to quantify and understand the performance impact that arises from applying this technology. In this work we start looking into some interesting performance metrics on KVM for ARM processors, which can provide us with useful insight that may lead to potential improvements in the future. This includes measurements such as interrupt latency and guest exit cost, performed on ARM Versatile Express and Samsung Exynos 5250 hardware platforms. Furthermore, we discuss additional methodologies that can provide us with a deeper understanding in the future of the performance footprint of KVM. We identify some of the most interesting approaches in this field, and perform a tentative analysis on how these may be implemented in the KVM on ARM port. These take into consideration hardware and software based counters for profiling, and issues related to the limitations of the simulators which are often used, such as the ARM Fast Models platform.
ATCA digital controller hardware for vertical stabilization of plasmas in tokamaks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Batista, A. J. N.; Sousa, J.; Varandas, C. A. F.
2006-10-15
The efficient vertical stabilization (VS) of plasmas in tokamaks requires a fast reaction of the VS controller, for example, after detection of edge localized modes (ELM). For controlling the effects of very large ELMs a new digital control hardware, based on the Advanced Telecommunications Computing Architecture™ (ATCA), is being developed aiming to reduce the VS digital control loop cycle (down to an optimal value of 10 μs) and improve the algorithm performance. The system has 1 ATCA™ processor module and up to 12 ATCA™ control modules, each one with 32 analog input channels (12 bit resolution), 4 analog output channels (12 bit resolution), and 8 digital input/output channels. The Aurora™ and PCI Express™ communication protocols will be used for data transport, between modules, with expected latencies below 2 μs. Control algorithms are implemented on an ix86-based processor with 6 Gflops and on field programmable gate arrays with 80 GMACS, interconnected by serial gigabit links in a full mesh topology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bender, Michael A.; Berry, Jonathan W.; Hammond, Simon D.
A challenge in computer architecture is that processors often cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. With this motivation and the realization that traditional architectures (with all DRAM reachable only via bus) are insufficient to feed groups of modern processing units, vendors have introduced a variety of non-DDR 3D memory technologies (Hybrid Memory Cube (HMC), Wide I/O 2, High Bandwidth Memory (HBM)). These offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. We will call these solutions “near-memory,” and, if user-addressable, “scratchpad.” High-performance systems on the market now offer two levels of main memory: near-memory on package and traditional DRAM further away. In the near term we expect the latencies of near-memory and DRAM to be similar. Here, it is natural to think of near-memory as another module on the DRAM level of the memory hierarchy. Vendors are expected to offer modes in which the near memory is used as cache, but we believe that this will be inefficient.
Resquin, F; Ibañez, J; Gonzalez-Vargas, J; Brunetti, F; Dimbwadyo, I; Alves, S; Carrasco, L; Torres, L; Pons, Jose Luis
2016-08-01
Reaching and grasping are two of the most affected functions after stroke. Hybrid rehabilitation systems combining Functional Electrical Stimulation with Robotic devices have been proposed in the literature to improve rehabilitation outcomes. In this work, we present the combined use of a hybrid robotic system with an EEG-based Brain-Machine Interface to detect the user's movement intentions to trigger the assistance. The platform has been tested in a single session with a stroke patient. The results show how the patient could successfully interact with the BMI and command the assistance of the hybrid system with low latencies. Also, the Feedback Error Learning controller implemented in this system could adjust the required FES intensity to perform the task.
A modular, closed-loop platform for intracranial stimulation in people with neurological disorders.
Sarma, Anish A; Crocker, Britni; Cash, Sydney S; Truccolo, Wilson
2016-08-01
Neuromodulation systems based on electrical stimulation can be used to investigate, probe, and potentially treat a range of neurological disorders. The effects of ongoing neural state and dynamics on stimulation response, and of stimulation parameters on neural state, have broad implications for the development of closed-loop neuro-modulation approaches. We describe the development of a modular, low-latency platform for pre-clinical, closed-loop neuromodulation studies with human participants. We illustrate the uses of the platform in a stimulation case study with a person with epilepsy undergoing neuro-monitoring prior to resective surgery. We demonstrate the efficacy of the system by tracking interictal epileptiform discharges in the local field potential to trigger intracranial electrical stimulation, and show that the response to stimulation depends on the neural state.
Laterodorsal Nucleus of the Thalamus: A Processor of Somatosensory Inputs
BEZDUDNAYA, TATIANA; KELLER, ASAF
2009-01-01
The laterodorsal (LD) nucleus of the thalamus has been considered a “higher order” nucleus that provides inputs to limbic cortical areas. Although its functions are largely unknown, it is often considered to be involved in spatial learning and memory. Here we provide evidence that LD is part of a hitherto unknown pathway for processing somatosensory information. Juxtacellular and extracellular recordings from LD neurons reveal that they respond to vibrissa stimulation with short latency (median = 7 ms) and large magnitude responses (median = 1.2 spikes/stimulus). Most neurons (62%) had large receptive fields, responding to six and more individual vibrissae. Electrical stimulation of the trigeminal nucleus interpolaris (SpVi) evoked short latency responses (median = 3.8 ms) in vibrissa-responsive LD neurons. Labeling produced by anterograde and retrograde neuroanatomical tracers confirmed that LD neurons receive direct inputs from SpVi. Electrophysiological and neuroanatomical analyses revealed also that LD projects upon the cingulate and retrosplenial cortex, but has only sparse projections to the barrel cortex. These findings suggest that LD is part of a novel processing stream involved in spatial orientation and learning related to somatosensory cues. PMID:18273888
Variability in Hoffmann and tendon reflexes in healthy male subjects
NASA Technical Reports Server (NTRS)
Good, E.; Do, S.; Jaweed, M.
1992-01-01
There is a time dependent decrease in amplitude of H- and T-reflexes during Zero-G exposure and subsequently an increase in the amplitude of the H-reflex 2-4 hours after return to a 1-G environment. These alterations have been attributed to the adaptation of the human neurosensory system to gravity. The Hoffmann reflex (H-reflex) is an acknowledged method to determine the integrity of the monosynaptic reflex arc. However, deep tendon reflexes (DTRs or T-reflexes), elicited by striking the tendon, also utilize the entire reflex arc. The objective of this study was to compare the variability in latency and amplitude of the two reflexes in healthy subjects. Methods: Nine healthy male subjects, 27-43 years in age, 161-175 cm in height and 60-86 kg in weight, underwent weekly testing for four weeks with a Dan-Tec EMG counterpoint EMG system. Subjects were studied prone and surface EMG electrodes were placed on the right and left soleus muscles. The H-reflex was obtained by stimulating the tibial nerve in the popliteal fossa with a 0.2 msec square wave pulse delivered at 2 Hz until the maximum H-reflex was obtained. The T-reflex was invoked by tapping the Achilles tendon with a self triggering reflex hammer connected to the EMG system. The latencies and amplitudes for the H- and T-reflexes were measured. Results: These data indicate that the amplitudes of these reflexes varied considerably. However, latencies to invoked responses were consistent. The latency of the T-reflex was approximately 3-5 msec longer than the H-reflex. Conclusion: The T-reflex is easily obtained, requires less time, and is more comfortable to perform. Qualitative data can be obtained by deploying self triggering, force plated reflex hammers both in the 1-G and Zero-G environment.
NASA Astrophysics Data System (ADS)
Torii, Tetsuya; Sato, Aya; Iwahashi, Masakuni; Iramina, Keiji
2012-04-01
The present study analyzed the effects of repetitive transcranial magnetic stimulation (rTMS) on brain activity. P300 latency of event-related potential (ERP) was used to evaluate the effects of low-frequency and short-term rTMS by stimulating the supramarginal gyrus (SMG), which is considered to be the related area of P300 origin. In addition, the prolonged stimulation effects on P300 latency were analyzed after applying rTMS. A figure-eight coil was used to stimulate left-right SMG, and intensity of magnetic stimulation was 80% of motor threshold. A total of 100 magnetic pulses were applied for rTMS. The effects of stimulus frequency at 0.5 or 1 Hz were determined. Following rTMS, an odd-ball task was performed and P300 latency of ERP was measured. The odd-ball task was performed at 5, 10, and 15 min post-rTMS. ERP was measured prior to magnetic stimulation as a control. Electroencephalograph (EEG) was measured at Fz, Cz, and Pz that were indicated by the international 10-20 electrode system. Results demonstrated that 0.5 Hz and 1 Hz rTMS had different effects on P300 latency. With 1 Hz low-frequency magnetic stimulation to the left SMG, P300 latency decreased. Compared to the control, the latency time difference was approximately 15 ms at Cz. This decrease continued for approximately 10 min post-rTMS. In contrast, 0.5 Hz rTMS resulted in delayed P300 latency. Compared to the control, the latency time difference was approximately 20 ms at Fz, and this delayed effect continued for approximately 15 min post-rTMS. Results demonstrated that P300 latency varied according to rTMS frequency. Furthermore, the duration of the effect differed with the stimulus frequency of low-frequency rTMS.
Impact of wave propagation delay on latency in optical communication systems
NASA Astrophysics Data System (ADS)
Kawanishi, Tetsuya; Kanno, Atsushi; Yoshida, Yuki; Kitayama, Ken-ichi
2012-12-01
Latency is an important figure to describe performance of transmission systems for particular applications, such as data transfer for earthquake early warning, transaction for financial businesses, interactive services such as online games, etc. Latency consists of delay due to signal processing at nodes and transmitters, and of signal propagation delay due to propagation of electromagnetic waves. The lower limit of the latency in transmission systems using conventional single mode fibers (SMFs) depends on wave propagation speed in the SMFs which is slower than c. Photonic crystal fibers, holey fibers and large core fibers can have low effective refractive indices, and can transfer light faster than in SMFs. In free-space optical systems, signals propagate with the speed c, so that the latency could be smaller than in optical fibers. For example, LEO satellites would transmit data faster than optical submarine cables, when the transmission distance is longer than a few thousand kilometers. This paper will discuss combination of various transmission media to reduce negative impact of the latency, as well as applications of low-latency systems.
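The propagation-delay comparison in this abstract reduces to delay = path length × refractive index / c. The sketch below is a back-of-envelope model, not the authors' analysis: it assumes a group index of about 1.468 for standard single mode fiber, a 550 km satellite altitude, and a very crude LEO path made of one uplink, one downlink and an arc at orbital height (no routing, processing or slant-range effects). All function names are illustrative.

```python
C_KM_PER_S = 299_792.458   # speed of light in vacuum
EARTH_RADIUS_KM = 6371.0   # mean Earth radius


def fiber_delay_ms(distance_km: float, group_index: float = 1.468) -> float:
    """One-way propagation delay through fiber; light travels at c / group_index."""
    return distance_km * group_index / C_KM_PER_S * 1000.0


def leo_delay_ms(distance_km: float, altitude_km: float = 550.0) -> float:
    """One-way delay over a crude LEO path: straight up, along the orbit, straight down."""
    # The ground distance is stretched onto a circle of radius R + h.
    arc_km = distance_km * (EARTH_RADIUS_KM + altitude_km) / EARTH_RADIUS_KM
    return (2 * altitude_km + arc_km) / C_KM_PER_S * 1000.0


if __name__ == "__main__":
    for d in (1_000, 3_000, 10_000):
        print(f"{d:>6} km  fiber {fiber_delay_ms(d):6.2f} ms   LEO {leo_delay_ms(d):6.2f} ms")
```

With these assumed numbers the fixed cost of going up and down dominates on short routes, while the free-space path wins on long ones; the crossover falls in the range of a few thousand kilometers, in line with the abstract's statement.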
Latency causes and reduction in optical metro networks
NASA Astrophysics Data System (ADS)
Bobrovs, Vjaceslavs; Spolitis, Sandis; Ivanovs, Girts
2013-12-01
The dramatic growth of transmitted information in fiber optical networks is leading to a concern about the network latency for high-speed reliable services like financial transactions, telemedicine, virtual and augmented reality, surveillance, and other applications. To ensure effective latency engineering, the delay variability needs to be accurately monitored and measured so that it can be controlled. This paper briefly describes causes of latency in fiber optical metro networks. Several available latency reduction techniques and solutions are also discussed, namely concerning usage of different chromatic dispersion compensation methods, low-latency amplifiers, optical fibers as well as other network elements.
Real Time Global Tests of the ALICE High Level Trigger Data Transport Framework
NASA Astrophysics Data System (ADS)
Becker, B.; Chattopadhyay, S.; Cicalo, C.; Cleymans, J.; de Vaux, G.; Fearick, R. W.; Lindenstruth, V.; Richter, M.; Rohrich, D.; Staley, F.; Steinbeck, T. M.; Szostak, A.; Tilsner, H.; Weis, R.; Vilakazi, Z. Z.
2008-04-01
The High Level Trigger (HLT) system of the ALICE experiment is an online event filter and trigger system designed for input bandwidths of up to 25 GB/s at event rates of up to 1 kHz. The system is designed as a scalable PC cluster, implementing several hundred nodes. The transport of data in the system is handled by an object-oriented data flow framework operating on the basis of the publisher-subscriber principle, being designed fully pipelined with lowest processing overhead and communication latency in the cluster. In this paper, we report the latest measurements where this framework has been operated on five different sites over a global north-south link extending more than 10,000 km, processing a "real-time" data flow.
FPGA-based trigger system for the LUX dark matter experiment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Akerib, D. S.; Araújo, H. M.; Bai, X.
LUX is a two-phase (liquid/gas) xenon time projection chamber designed to detect nuclear recoils resulting from interactions with dark matter particles. Signals from the detector are processed with an FPGA-based digital trigger system that analyzes the incoming data in real-time, with a latency of just a few microseconds. The system enables first pass selection of events of interest based on their pulse shape characteristics and 3D localization of the interactions. It has been shown to be >99% efficient in triggering on S2 signals induced by only a few extracted liquid electrons. It has been operating continuously and reliably since its full underground deployment in early 2013. This document is an overview of the system's capabilities, its inner workings, and its performance.
FPGA-based trigger system for the LUX dark matter experiment
Akerib, D. S.; Araújo, H. M.; Bai, X.; ...
2016-02-17
LUX is a two-phase (liquid/gas) xenon time projection chamber designed to detect nuclear recoils resulting from interactions with dark matter particles. Signals from the detector are processed with an FPGA-based digital trigger system that analyzes the incoming data in real-time, with a latency of just a few microseconds. The system enables first pass selection of events of interest based on their pulse shape characteristics and 3D localization of the interactions. It has been shown to be > 99% efficient in triggering on S2 signals induced by only a few extracted liquid electrons. It has been operating continuously and reliably since its full underground deployment in early 2013. Finally, this document is an overview of the system's capabilities, its inner workings, and its performance.
A neural network z-vertex trigger for Belle II
NASA Astrophysics Data System (ADS)
Neuhaus, S.; Skambraks, S.; Abudinen, F.; Chen, Y.; Feindt, M.; Frühwirth, R.; Heck, M.; Kiesling, C.; Knoll, A.; Paul, S.; Schieck, J.
2015-05-01
We present the concept of a track trigger for the Belle II experiment, based on a neural network approach, that is able to reconstruct the z (longitudinal) position of the event vertex within the latency of the first level trigger. The trigger will thus be able to suppress a large fraction of the dominating background from events outside of the interaction region. The trigger uses the drift time information of the hits from the Central Drift Chamber (CDC) of Belle II within narrow cones in polar and azimuthal angle as well as in transverse momentum (sectors), and estimates the z-vertex without explicit track reconstruction. The preprocessing for the track trigger is based on the track information provided by the standard CDC trigger. It takes input from the 2D (r-φ) track finder, adds information from the stereo wires of the CDC, and finds the appropriate sectors in the CDC for each track in a given event. Within each sector, the z-vertex of the associated track is estimated by a specialized neural network, with a continuous output corresponding to the scaled z-vertex. The input values for the neural network are calculated from the wire hits of the CDC.
NASA Astrophysics Data System (ADS)
Edwards, A. W.; Blackler, K.; Gill, R. D.; van der Goot, E.; Holm, J.
1990-10-01
Based upon the experience gained with the present soft x-ray data acquisition system, new techniques are being developed which make extensive use of digital signal processors (DSPs). Digital filters make 13 further frequencies available in real time from the input sampling frequency of 200 kHz. In parallel, various algorithms running on further DSPs generate triggers in response to a range of events in the plasma. The sawtooth crash can be detected, for example, with a delay of only 50 μs from the onset of the collapse. The trigger processor interacts with the digital filter boards to ensure data of the appropriate frequency is recorded throughout a plasma discharge. An independent link is used to pass 780 and 24 Hz filtered data to a network of transputers. A full tomographic inversion and display of the 24 Hz data is carried out in real time using this 15 transputer array. The 780 Hz data are stored for immediate detailed playback following the pulse. Such a system could considerably improve the quality of present plasma diagnostic data which is, in general, sampled at one fixed frequency throughout a discharge. Further, it should provide valuable information towards designing diagnostic data acquisition systems for future long pulse operation machines when a high degree of real-time processing will be required, while retaining the ability to detect, record, and analyze events of interest within such long plasma discharges.
Sobol, Wlad T
2002-01-01
A simple kinetic model that describes the time evolution of the chemical concentration of an arbitrary compound within the tank of an automatic film processor is presented. It provides insights into the kinetics of chemistry concentration inside the processor's tank; the results facilitate the tasks of processor tuning and quality control (QC). The model has successfully been used in several troubleshooting sessions of low-volume mammography processors for which maintaining consistent QC tracking was difficult due to fluctuations of bromide levels in the developer tank.
Improving land vehicle situational awareness using a distributed aperture system
NASA Astrophysics Data System (ADS)
Fortin, Jean; Bias, Jason; Wells, Ashley; Riddle, Larry; van der Wal, Gooitzen; Piacentino, Mike; Mandelbaum, Robert
2005-05-01
U.S. Army Research, Development, and Engineering Command (RDECOM) Communications Electronics Research, Development and Engineering Center (CERDEC) Night Vision and Electronic Sensors Directorate (NVESD) has performed early work to develop a Distributed Aperture System (DAS). The DAS aims at improving the situational awareness of armored fighting vehicle crews under closed-hatch conditions. The concept is based on a plurality of sensors configured to create a day and night dome of surveillance coupled with head-up displays slaved to the operator's head to give a "glass turret" feel. State-of-the-art image processing is used to produce multiple seamless hemispherical views simultaneously available to the vehicle commander, crew members and dismounting infantry. On-the-move automatic cueing of multiple moving/pop-up low silhouette threats is also done with the possibility to save/revisit/share past events. As a first step in this development program, a contract was awarded to United Defense to further develop the Eagle Vision™ system. The second-generation prototype features two camera heads, each comprising four high-resolution (2048×1536) color sensors, and each covering a field of view of 270°h × 150°v. High-bandwidth digital links interface the camera heads with a field programmable gate array (FPGA) based custom processor developed by Sarnoff Corporation. The processor computes the hemispherical stitch and warp functions required for real-time, low latency, immersive viewing (360°h × 120°v, 30° down) and generates up to six simultaneous extended graphics array (XGA) video outputs for independent display either on a helmet-mounted display (with associated head tracking device) or a flat panel display (and joystick). The prototype is currently in its last stage of development and will be integrated on a vehicle for user evaluation and testing.
Near-term improvements include the replacement of the color camera heads with a pixel-level fused combination of uncooled long wave infrared (LWIR) and low light level intensified imagery. It is believed that the DAS will significantly increase situational awareness by providing the users with a day and night, wide area coverage, immersive visualization capability.
Jordan, Scott
2018-01-24
Scott Jordan on "Advances in high-throughput speed, low-latency communication for embedded instrumentation" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
Low Latency Messages on Distributed Memory Multiprocessors
Rosing, Matt; Saltz, Joel
1995-01-01
This article describes many of the issues in developing an efficient interface for communication on distributed memory machines. Although the hardware component of message latency is less than 1 μs on many distributed memory machines, the software latency associated with sending and receiving typed messages is on the order of 50 μs. The reason for this imbalance is that the software interface does not match the hardware. By changing the interface to match the hardware more closely, applications with fine grained communication can be put on these machines. This article describes several tests performed and many of the issues involved in supporting low latency messages on distributed memory machines.
A low power biomedical signal processor ASIC based on hardware software codesign.
Nie, Z D; Wang, L; Chen, W G; Zhang, T; Zhang, Y T
2009-01-01
A low power biomedical digital signal processor ASIC based on a hardware and software codesign methodology is presented in this paper. The codesign methodology was used to achieve higher system performance and design flexibility. The hardware implementation included a low power 32-bit RISC CPU (ARM7TDMI), a low power AHB-compatible bus, and a scalable digital co-processor that was optimized for low power Fast Fourier Transform (FFT) calculations. The co-processor could be scaled for 8-point, 16-point and 32-point FFTs, taking approximately 50, 100 and 150 clock cycles, respectively. The complete design was intensively simulated using the ARM DSM model and was emulated on the ARM Versatile platform before being committed to silicon. The multi-million-gate ASIC was fabricated using SMIC 0.18 μm mixed-signal CMOS 1P6M technology. The die area measures 5,000 μm x 2,350 μm. The power consumption was approximately 3.6 mW at a 1.8 V power supply and 1 MHz clock rate. The power consumption for FFT calculations was less than 1.5% compared with the conventional embedded software-based solution.
Farabet, Clément; Paz, Rafael; Pérez-Carrasco, Jose; Zamarreño-Ramos, Carlos; Linares-Barranco, Alejandro; LeCun, Yann; Culurciello, Eugenio; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabe
2012-01-01
Most scene segmentation and categorization architectures for the extraction of features in images and patches make exhaustive use of 2D convolution operations for template matching, template search, and denoising. Convolutional Neural Networks (ConvNets) are one example of such architectures that can implement general-purpose bio-inspired vision systems. In standard digital computers 2D convolutions are usually expensive in terms of resource consumption and impose severe limitations for efficient real-time applications. Nevertheless, neuro-cortex inspired solutions, like dedicated Frame-Based or Frame-Free Spiking ConvNet Convolution Processors, are advancing real-time visual processing. These two approaches share the neural inspiration, but each of them solves the problem in a different way. Frame-Based ConvNets process video information frame by frame in a very robust and fast way that requires using and sharing the available hardware resources (such as multipliers and adders). Hardware resources are fixed and time-multiplexed by fetching data in and out. Thus memory bandwidth and size are important for good performance. On the other hand, spike-based convolution processors are a frame-free alternative that is able to perform convolution of a spike-based source of visual information with very low latency, which makes them ideal for very high-speed applications. However, hardware resources need to be available all the time and cannot be time-multiplexed. Thus, hardware should be modular, reconfigurable, and expansible. Hardware implementations in both VLSI custom integrated circuits (digital and analog) and FPGA have already been used to demonstrate the performance of these systems. In this paper we present a comparison study of these two neuro-inspired solutions. A brief description of both systems is presented, along with a discussion of their differences, pros and cons. PMID:22518097
Extended Logic Intelligent Processing System for a Sensor Fusion Processor Hardware
NASA Technical Reports Server (NTRS)
Stoica, Adrian; Thomas, Tyson; Li, Wei-Te; Daud, Taher; Fabunmi, James
2000-01-01
The paper presents the hardware implementation and initial tests from a low-power, high-speed reconfigurable sensor fusion processor. The Extended Logic Intelligent Processing System (ELIPS) is described, which combines rule-based systems, fuzzy logic, and neural networks to achieve parallel fusion of sensor signals in compact low power VLSI. The ELIPS concept is being developed to demonstrate the interceptor functionality, which particularly underlines the high speed and low power requirements. The hardware programmability allows the processor to reconfigure into different machines, taking the most efficient hardware implementation during each phase of information processing. Processing speeds of microseconds have been demonstrated using our test hardware.
Selective attention to a facial feature with and without facial context: an ERP-study.
Wijers, A A; Van Besouw, N J P; Mulder, G
2002-04-01
The present experiment addressed the question whether selectively attending to a facial feature (mouth shape) would benefit from the presence of a correct facial context. Subjects attended selectively to one of two possible mouth shapes belonging to photographs of a face with a happy or sad expression, respectively. These mouths were presented randomly either in isolation, embedded in the original photos, or in an exchanged facial context. The ERP effect of attending mouth shape was a lateral posterior negativity, anterior positivity with an onset latency of 160-200 ms; this effect was completely unaffected by the type of facial context. When the mouth shape and the facial context conflicted, this resulted in a medial parieto-occipital positivity with an onset latency of 180 ms, independent of the relevance of the mouth shape. Finally, there was a late (onset at approx. 400 ms) expression (happy vs. sad) effect, which was strongly lateralized to the right posterior hemisphere and was most prominent for attended stimuli in the correct facial context. For the isolated mouth stimuli, a similarly distributed expression effect was observed at an earlier latency range (180-240 ms). These data suggest the existence of separate, independent and neuroanatomically segregated processors engaged in the selective processing of facial features and the detection of contextual congruence and emotional expression of face stimuli. The data do not support that early selective attention processes benefit from top-down constraints provided by the correct facial context.
An information-theoretic approach to the gravitational-wave burst detection problem
NASA Astrophysics Data System (ADS)
Katsavounidis, E.; Lynch, R.; Vitale, S.; Essick, R.; Robinet, F.
2016-03-01
The advanced era of gravitational-wave astronomy, with data collected in part by the LIGO gravitational-wave interferometers, has begun as of fall 2015. One potential type of detectable gravitational waves is short-duration gravitational-wave bursts, whose waveforms can be difficult to predict. We present the framework for a new detection algorithm, called oLIB, that can be used with relatively low latency to turn calibrated strain data into a detection significance statement. This pipeline consists of 1) a sine-Gaussian matched-filter trigger generator based on the Q-transform, known as Omicron, 2) incoherent down-selection of these triggers to the most signal-like set, and 3) a fully coherent analysis of this signal-like set using the Markov chain Monte Carlo (MCMC) Bayesian evidence calculator LALInferenceBurst (LIB). We optimally extract this information by using a likelihood-ratio test (LRT) to map these search statistics into a significance statement. Using representative archival LIGO data, we show that the algorithm can detect gravitational-wave burst events of realistic strength in realistic instrumental noise with good detection efficiencies across different burst waveform morphologies. With support from the National Science Foundation under Grant PHY-0757058.
Self-triggering readout system for the neutron lifetime experiment PENeLOPE
NASA Astrophysics Data System (ADS)
Gaisbauer, D.; Bai, Y.; Konorov, I.; Paul, S.; Steffen, D.
2016-02-01
PENeLOPE is a neutron lifetime measurement developed at the Technische Universität München and located at the Forschungs-Neutronenquelle Heinz Maier-Leibnitz (FRM II) aiming to achieve a precision of 0.1 seconds. The detector for PENeLOPE consists of about 1250 Avalanche Photodiodes (APDs) with a total active area of 1225 cm². The decay proton detector and electronics will be operated at a high electrostatic potential of -30 kV and a magnetic field of 0.6 T. This includes shaper, preamplifier, ADC and FPGA cards. In addition, the APDs will be cooled to 77 K. The 1250 APDs are divided into 14 groups of 96 channels, including spares. A 12-bit ADC digitizes the detector signals at 1 MSps. Firmware was developed for the detector, including a self-triggering readout with continuous pedestal calculation and configurable signal detection. Data transmission and configuration are done via the Switched Enabling Protocol (SEP). It is a time-division multiplexing low-layer protocol which provides deterministic latency for time-critical messages, IPBus, and JTAG interfaces. The network has an n:1 topology, reducing the number of optical links.
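The abstract above mentions a self-triggering readout with continuous pedestal calculation and configurable signal detection but gives no algorithmic detail. As an illustration only, the following Python sketch shows one common way such a scheme can work: a running pedestal (exponential moving average) that is frozen while a pulse is in progress, plus a threshold on the pedestal-subtracted sample. The function name `self_trigger`, the threshold of 20 ADC counts, and the smoothing factor of 1/64 are assumptions made for this sketch, not values from the PENeLOPE firmware.

```python
from typing import Iterable, List, Tuple


def self_trigger(
    samples: Iterable[int],
    threshold: float = 20.0,  # hypothetical: ADC counts above pedestal
    alpha: float = 1.0 / 64,  # hypothetical: pedestal smoothing factor
) -> List[Tuple[int, float]]:
    """Return (sample_index, amplitude_above_pedestal) at the start of each pulse."""
    hits: List[Tuple[int, float]] = []
    pedestal = None
    in_pulse = False
    for i, adc in enumerate(samples):
        if pedestal is None:
            pedestal = float(adc)  # seed the pedestal with the first sample
            continue
        amplitude = adc - pedestal
        if amplitude > threshold:
            if not in_pulse:
                # Rising edge: report the pulse once, at its first sample.
                hits.append((i, amplitude))
                in_pulse = True
        else:
            in_pulse = False
            # Update the pedestal only outside pulses so signals do not bias it.
            pedestal += alpha * (adc - pedestal)
    return hits


if __name__ == "__main__":
    # Flat baseline of 100 counts with one three-sample pulse.
    trace = [100] * 50 + [160, 150, 130] + [100] * 50
    print(self_trigger(trace))
```

Freezing the pedestal during a pulse is a design choice of this sketch: it keeps large signals from dragging the baseline estimate upward, at the cost of not tracking baseline shifts that happen while a pulse is in progress.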
Post-cardiac injury syndrome: an atypical case following percutaneous coronary intervention.
Paiardi, Silvia; Cannata, Francesco; Ciccarelli, Michele; Voza, Antonio
2017-12-01
Post-cardiac injury syndrome (PCIS) is a syndrome characterized by pericardial and/or pleural effusion, triggered by a cardiac injury, usually a myocardial infarction or cardiac surgery, rarely a minor cardiovascular percutaneous procedure. Nowadays, post-cardiac injury syndrome is regaining importance and interest as an emerging cause of pericarditis, especially in developed countries, due to a great and continuous increase in the number and complexity of percutaneous cardiologic procedures. The etiopathogenesis seems to be mediated by the immune system producing immune complexes, which deposit in the pericardium and pleura and trigger an inflammatory response. We present the atypical case of a 76-year-old man presenting with a hydro-pneumothorax, low-grade fever and elevated inflammation markers after two complex percutaneous coronary interventions, performed 30 and 75 days earlier. The clinical features of our case are consistent with the diagnostic criteria of PCIS: prior injury of the pericardium and/or myocardium, fever, leucocytosis, elevated inflammatory markers, remarkable steroid responsiveness and latency period. Only one element does not fit with this diagnosis and does not find any further explanation: the air accompanying the pleural effusion, resulting in a hydro-pneumothorax and requiring placement of a pleural drainage catheter.
Low-Latency Telerobotic Sample Return and Biomolecular Sequencing for Deep Space Gateway
NASA Astrophysics Data System (ADS)
Lupisella, M.; Bleacher, J.; Lewis, R.; Dworkin, J.; Wright, M.; Burton, A.; Rubins, K.; Wallace, S.; Stahl, S.; John, K.; Archer, D.; Niles, P.; Regberg, A.; Smith, D.; Race, M.; Chiu, C.; Russell, J.; Rampe, E.; Bywaters, K.
2018-02-01
Low-latency telerobotics, crew-assisted sample return, and biomolecular sequencing can be used to acquire and analyze lunar farside and/or Apollo landing site samples. Sequencing can also be used to monitor and study Deep Space Gateway environment and crew health.
Pierce, Paul E.
1986-01-01
A hardware processor is disclosed which in the described embodiment is a memory mapped multiplier processor that can operate in parallel with a 16 bit microcomputer. The multiplier processor decodes the address bus to receive specific instructions so that in one access it can write and automatically perform single or double precision multiplication involving a number written to it with or without addition or subtraction with a previously stored number. It can also, on a single read command automatically round and scale a previously stored number. The multiplier processor includes two concatenated 16 bit multiplier registers, two 16 bit concatenated 16 bit multipliers, and four 16 bit product registers connected to an internal 16 bit data bus. A high level address decoder determines when the multiplier processor is being addressed and first and second low level address decoders generate control signals. In addition, certain low order address lines are used to carry uncoded control signals. First and second control circuits coupled to the decoders generate further control signals and generate a plurality of clocking pulse trains in response to the decoded and address control signals.
Fast particles identification in programmable form at level-0 trigger by means of the 3D-Flow system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crosetto, Dario B.
1998-10-30
The 3D-Flow Processor system is a new, technology-independent concept in very fast, real-time system architectures. Based on either an FPGA or an ASIC implementation, it can address, in a fully programmable manner, applications where commercially available processors would fail because of throughput requirements. Possible applications include filtering algorithms (pattern recognition) applied to the input of multiple sensors, as well as moving any input validated by these filtering algorithms to a single output channel. Both operations can easily be implemented on a 3D-Flow system to achieve a real-time processing system with a very short lag time. This system can be built either with off-the-shelf FPGAs or, for higher data rates, with CMOS chips containing 4 to 16 processors each. The basic building block of the system, a 3D-Flow processor, has been successfully designed in VHDL code written in "Generic HDL" (mostly made of reusable blocks that are synthesizable in different technologies, or FPGAs), to produce a netlist for a four-processor ASIC featuring 0.35 micron CBA (Cell Based Array) technology at 3.3 Volts, 884 mW power dissipation at 60 MHz and 63.75 mm sq. die size. The same VHDL code has been targeted to three FPGA manufacturers (Altera EPF10K250A, ORCA-Lucent Technologies OR3T165 and Xilinx XCV1000). A complete set of software tools, the 3D-Flow System Manager, equally applicable to ASIC or FPGA implementations, has been produced to provide full system simulation, application development, real-time monitoring, and run-time fault recovery. Today's technology can accommodate 16 processors per chip in a medium size die, at a cost per processor of less than $5 based on the current silicon die/size technology cost.
2008-07-01
generation of process partitioning, a thread pipelining becomes possible. In this paper we briefly summarize the requirements and trends for FADEC based... FADEC environment, presenting a hypothetical realization of an example application. Finally we discuss the application of Time-Triggered...based control applications of the future. 15. SUBJECT TERMS Gas turbine, FADEC, Multi-core processing technology, disturbed based control
Adaptable radiation monitoring system and method
Archer, Daniel E [Livermore, CA; Beauchamp, Brock R [San Ramon, CA; Mauger, G Joseph [Livermore, CA; Nelson, Karl E [Livermore, CA; Mercer, Michael B [Manteca, CA; Pletcher, David C [Sacramento, CA; Riot, Vincent J [Berkeley, CA; Schek, James L [Tracy, CA; Knapp, David A [Livermore, CA
2006-06-20
A portable radioactive-material detection system capable of detecting radioactive sources moving at high speeds. The system has at least one radiation detector capable of detecting gamma-radiation and coupled to an MCA capable of collecting spectral data in very small time bins of less than about 150 msec. A computer processor is connected to the MCA for determining from the spectral data if a triggering event has occurred. Spectral data is stored on a data storage device, and a power source supplies power to the detection system. Various configurations of the detection system may be adaptably arranged for various radiation detection scenarios. In a preferred embodiment, the computer processor operates as a server which receives spectral data from other networked detection systems, and communicates the collected data to a central data reporting system.
Plural-wavelength flame detector that discriminates between direct and reflected radiation
NASA Technical Reports Server (NTRS)
Hall, Gregory H. (Inventor); Barnes, Heidi L. (Inventor); Medelius, Pedro J. (Inventor); Simpson, Howard J. (Inventor); Smith, Harvey S. (Inventor)
1997-01-01
A flame detector employs a plurality of wavelength selective radiation detectors and a digital signal processor programmed to analyze each of the detector signals, and determine whether radiation is received directly from a small flame source that warrants generation of an alarm. The processor's algorithm employs a normalized cross-correlation analysis of the detector signals to discriminate between radiation received directly from a flame and radiation received from a reflection of a flame to ensure that reflections will not trigger an alarm. In addition, the algorithm employs a Fast Fourier Transform (FFT) frequency spectrum analysis of one of the detector signals to discriminate between flames of different sizes. In a specific application, the detector incorporates two infrared (IR) detectors and one ultraviolet (UV) detector for discriminating between a directly sensed small hydrogen flame, and reflections from a large hydrogen flame. The signals generated by each of the detectors are sampled and digitized for analysis by the digital signal processor, preferably 250 times a second. A sliding time window of approximately 30 seconds of detector data is created using FIFO memories.
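The normalized cross-correlation named in the abstract above can be illustrated with a short Python sketch. This shows only the zero-lag statistic on two equal-length windows, with a FIFO sized from the figures quoted in the abstract (250 samples per second, roughly 30 seconds). How the patent maps the correlation value to a "direct" or "reflected" decision is not stated in the abstract, so no decision threshold is given here; the names `normalized_cross_correlation`, `SAMPLE_RATE_HZ`, `WINDOW_SECONDS`, and `ir_fifo` are assumptions of this sketch.

```python
import math
from collections import deque
from typing import Sequence

SAMPLE_RATE_HZ = 250   # from the abstract: sampled preferably 250 times a second
WINDOW_SECONDS = 30    # from the abstract: approximately 30 seconds of data
WINDOW_SAMPLES = SAMPLE_RATE_HZ * WINDOW_SECONDS

# A bounded deque behaves like a FIFO: appending to a full deque drops the oldest sample.
ir_fifo: deque = deque(maxlen=WINDOW_SAMPLES)


def normalized_cross_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Zero-lag normalized cross-correlation of two equal-length windows, in [-1, 1]."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("windows must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        return 0.0  # a constant window carries no variation to correlate
    return sum(x * y for x, y in zip(da, db)) / norm


if __name__ == "__main__":
    signal = [math.sin(0.1 * i) for i in range(WINDOW_SAMPLES)]
    scaled = [2.0 * x + 1.0 for x in signal]  # same shape, different gain and offset
    print(normalized_cross_correlation(signal, scaled))
```

Because the means are subtracted and the result is divided by both signal norms, the statistic is insensitive to detector gain and offset, which is what makes it suitable for comparing detectors operating at different wavelengths.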
Wensveen, Paul J; Huijser, Léonie A E; Hoek, Lean; Kastelein, Ronald A
2016-01-01
Loudness perception can be studied based on the assumption that sounds of equal loudness elicit equal reaction time (RT; or "response latency"). We measured the underwater RTs of a harbor porpoise to narrowband frequency-modulated sounds and constructed six equal-latency contours. The contours paralleled the audiogram at low sensation levels (high RTs). At high-sensation levels, contours flattened between 0.5 and 31.5 kHz but dropped substantially (RTs shortened) beyond those frequencies. This study suggests that equal-latency-based frequency weighting can emulate noise perception in porpoises for low and middle frequencies but that the RT-loudness correlation is relatively weak for very high frequencies.
Low Latency MAC Protocol in Wireless Sensor Networks Using Timing Offset
NASA Astrophysics Data System (ADS)
Choi, Seung Sik
This paper proposes a low latency MAC protocol that can be used in sensor networks. To extend the lifetime of sensor nodes, the conventional solution is to synchronize active/sleep periods of all sensor nodes. However, because the nodes are synchronized, a packet at an intermediate node must wait until the next node wakes up before it can be forwarded. This induces a large delay in sensor nodes. To solve this latency problem, a clustered sensor network which uses two types of sensor nodes and a layered architecture is considered. Cluster heads in each cluster are synchronized with different timing offsets to reduce the sleep delay. Using this concept, the latency problem can be solved and more efficient power usage can be obtained.
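The effect of staggered wake-up offsets on sleep delay can be seen with a small worked example. The Python sketch below uses a deliberately simplified model that is an assumption of this example, not the protocol in the paper: nodes sit on a linear chain, node k wakes once per cycle at phase k × offset, and each hop takes a fixed forwarding time. Times are integer milliseconds to avoid floating-point rounding; `end_to_end_delay_ms` and its parameters are hypothetical names.

```python
def end_to_end_delay_ms(hops: int, cycle_ms: int, offset_ms: int, tx_ms: int) -> int:
    """Delay for one packet to cross `hops` hops in a simplified duty-cycle model.

    Node k (k = 1..hops) wakes at times k * offset_ms + n * cycle_ms. The packet
    is ready at the source at time 0 and each hop takes tx_ms to forward.
    """
    t = 0
    for k in range(1, hops + 1):
        phase = (k * offset_ms) % cycle_ms
        # Time until node k's next wake-up at or after time t.
        wait = (phase - t) % cycle_ms
        t += wait + tx_ms
    return t


if __name__ == "__main__":
    # Five hops, 1 s duty cycle, 10 ms forwarding time per hop.
    print(end_to_end_delay_ms(5, 1000, 0, 10))   # all nodes wake at the same instant
    print(end_to_end_delay_ms(5, 1000, 10, 10))  # each node wakes 10 ms after the previous one
```

In this model the fully synchronized schedule costs almost one full cycle per hop after the first, while an offset equal to the forwarding time lets the packet ride a "wave" of wake-ups down the chain. Real deployments would also need to account for clock drift, contention, and traffic in the reverse direction.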
Real-time track-less Cherenkov ring fitting trigger system based on Graphics Processing Units
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Chiozzi, S.; Cretaro, P.; Cotta Ramusino, A.; Di Lorenzo, S.; Fantechi, R.; Fiorini, M.; Frezza, O.; Gianoli, A.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Piandani, R.; Piccini, M.; Pontisso, L.; Rossetti, D.; Simula, F.; Sozzi, M.; Vicini, P.
2017-12-01
The parallel computing power of commercial Graphics Processing Units (GPUs) is exploited to perform real-time ring fitting at the lowest trigger level using information coming from the Ring Imaging Cherenkov (RICH) detector of the NA62 experiment at CERN. For this purpose, direct GPU communication with a custom FPGA-based board has been used to reduce the data transmission latency. The GPU-based trigger system is currently integrated in the experimental setup of the RICH detector of the NA62 experiment, in order to reconstruct ring-shaped hit patterns. The ring-fitting algorithm running on the GPU is fed with raw RICH data only, with no information coming from other detectors, and is able to provide more complex trigger primitives than the simple photodetector hit multiplicity, resulting in a higher selection efficiency. The performance of the system for multi-ring Cherenkov online reconstruction obtained during the NA62 physics run is presented.
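The abstract does not say which ring-fitting algorithm runs on the GPU. To illustrate what fitting a ring to hit positions involves, the Python sketch below implements a generic algebraic least-squares circle fit (the closed-form fit on mean-centred coordinates), which needs no iteration and is therefore a natural candidate for fixed-latency use. It is a single-ring CPU example, not the NA62 implementation; `fit_ring` is a hypothetical name.

```python
import math
from typing import List, Tuple


def fit_ring(hits: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Algebraic least-squares circle fit; returns (center_x, center_y, radius)."""
    n = len(hits)
    if n < 3:
        raise ValueError("at least three hits are needed to fit a circle")
    mean_x = sum(x for x, _ in hits) / n
    mean_y = sum(y for _, y in hits) / n
    # Work in coordinates centred on the hit centroid.
    u = [x - mean_x for x, _ in hits]
    v = [y - mean_y for _, y in hits]
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suuu = sum(a * a * a for a in u)
    svvv = sum(b * b * b for b in v)
    suvv = sum(a * b * b for a, b in zip(u, v))
    svuu = sum(b * a * a for a, b in zip(u, v))
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("hits are collinear; no unique circle")
    rhs_u = 0.5 * (suuu + suvv)
    rhs_v = 0.5 * (svvv + svuu)
    # Solve the 2x2 normal equations for the centre by Cramer's rule.
    uc = (rhs_u * svv - rhs_v * suv) / det
    vc = (rhs_v * suu - rhs_u * suv) / det
    radius = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return mean_x + uc, mean_y + vc, radius


if __name__ == "__main__":
    # Twelve noiseless hits on an arc of a circle centred at (3, -2) with radius 5.
    arc = [(3 + 5 * math.cos(0.1 * k), -2 + 5 * math.sin(0.1 * k)) for k in range(12)]
    print(fit_ring(arc))
```

A multi-ring trigger would additionally need a pattern-recognition step to assign hits to ring candidates before a fit of this kind is applied to each candidate.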
Processors for wavelet analysis and synthesis: NIFS and TI-C80 MVP
NASA Astrophysics Data System (ADS)
Brooks, Geoffrey W.
1996-03-01
Two processors are considered for image quadrature mirror filtering (QMF). The neuromorphic infrared focal-plane sensor (NIFS) is an existing prototype analog processor offering high speed spatio-temporal Gaussian filtering, which could be used for the QMF low-pass function, and difference of Gaussian filtering, which could be used for the QMF high-pass function. Although not designed specifically for wavelet analysis, the biologically-inspired system accomplishes the most computationally intensive part of QMF processing. The Texas Instruments (TI) TMS320C80 Multimedia Video Processor (MVP) is a 32-bit RISC master processor with four advanced digital signal processors (DSPs) on a single chip. Algorithm partitioning, memory management and other issues are considered for optimal performance. This paper presents these considerations with simulated results leading to processor implementation of high-speed QMF analysis and synthesis.
Master/Programmable-Slave Computer
NASA Technical Reports Server (NTRS)
Smaistrla, David; Hall, William A.
1990-01-01
Unique modular computer features compactness, low power, mass storage of data, multiprocessing, and choice of various input/output modes. Master processor communicates with user via usual keyboard and video display terminal. Coordinates operations of as many as 24 slave processors, each dedicated to different experiment. Each slave circuit card includes slave microprocessor and assortment of input/output circuits for communication with external equipment, with master processor, and with other slave processors. Adaptable to industrial process control with selectable degrees of automatic control, automatic and/or manual monitoring, and manual intervention.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-27
... 7034 Regarding Low Latency Network Connections December 20, 2011. I. Introduction On October 31, 2011...- Location Services'' to establish a program for offering low latency network connections and to establish the initial fees for such connections. The Exchange also proposed administrative modifications to...
Flexible trigger menu implementation on the Global Trigger for the CMS Level-1 trigger upgrade
NASA Astrophysics Data System (ADS)
Matsushita, Takashi
2017-10-01
The CMS experiment at the Large Hadron Collider (LHC) has continued to explore physics at the high-energy frontier in 2016. The integrated luminosity delivered by the LHC in 2016 was 41 fb^-1 with a peak luminosity of 1.5 × 10^34 cm^-2 s^-1 and peak mean pile-up of about 50, all exceeding the initial estimations for 2016. The CMS experiment has upgraded its hardware-based Level-1 trigger system to maintain its performance for new physics searches and precision measurements at high luminosities. The Global Trigger is the final step of the CMS Level-1 trigger and implements a trigger menu, a set of selection requirements applied to the final list of objects from calorimeter and muon triggers, for reducing the 40 MHz collision rate to 100 kHz. The Global Trigger has been upgraded with state-of-the-art FPGA processors on Advanced Mezzanine Cards with optical links running at 10 GHz in a MicroTCA crate. The powerful processing resources of the upgraded system enable implementation of more algorithms at a time than previously possible, allowing CMS to be more flexible in how it handles the available trigger bandwidth. Algorithms for a trigger menu, including topological requirements on multi-objects, can be realised in the Global Trigger using the newly developed trigger menu specification grammar. Analysis-like trigger algorithms can be represented in an intuitive manner and the algorithms are translated to corresponding VHDL code blocks to build a firmware. The grammar can be extended in future as the needs arise. The experience of implementing trigger menus on the upgraded Global Trigger system will be presented.
Algorithms for parallel flow solvers on message passing architectures
NASA Technical Reports Server (NTRS)
Vanderwijngaart, Rob F.
1995-01-01
The purpose of this project has been to identify and test suitable technologies for implementation of fluid flow solvers -- possibly coupled with structures and heat equation solvers -- on MIMD parallel computers. In the course of this investigation much attention has been paid to efficient domain decomposition strategies for ADI-type algorithms. Multi-partitioning derives its efficiency from the assignment of several blocks of grid points to each processor in the parallel computer. A coarse-grain parallelism is obtained, and a near-perfect load balance results. In uni-partitioning every processor receives responsibility for exactly one block of grid points instead of several. This necessitates fine-grain pipelined program execution in order to obtain a reasonable load balance. Although fine-grain parallelism is less desirable on many systems, especially high-latency networks of workstations, uni-partition methods are still in wide use in production codes for flow problems. Consequently, it remains important to achieve good efficiency with this technique, even though it has essentially been superseded by multi-partitioning for parallel ADI-type algorithms. Another reason for the concentration on improving the performance of pipeline methods is their applicability in other types of flow solver kernels with stronger implied data dependence. Analytical expressions can be derived for the size of the dynamic load imbalance incurred in traditional pipelines. From these the optimal first-processor retardation, which leads to the shortest total completion time for the pipeline process, can be determined. Theoretical predictions of pipeline performance with and without optimization match experimental observations on the iPSC/860 very well. Analysis of pipeline performance also highlights the effect of careless grid partitioning in flow solvers that employ pipeline algorithms. If grid blocks at boundaries are not at least as large in the wall-normal direction as those immediately adjacent to them, then the first processor in the pipeline will receive a computational load that is less than that of subsequent processors, magnifying the pipeline slowdown effect. Extra compensation is needed for grid boundary effects, even if all grid blocks are equally sized.
Simulation analysis of a microcomputer-based, low-cost Omega navigation system
NASA Technical Reports Server (NTRS)
Lilley, R. W.; Salter, R. J., Jr.
1976-01-01
The current status of research on a proposed micro-computer-based, low-cost Omega Navigation System (ONS) is described. The design approach emphasizes minimum hardware, maximum software, and the use of a low-cost, commercially-available microcomputer. Currently under investigation is the implementation of a low-cost navigation processor and its interface with an omega sensor to complete the hardware-based ONS. Sensor processor functions are simulated to determine how many of the sensor processor functions can be handled by innovative software. An input data base of live Omega ground and flight test data was created. The Omega sensor and microcomputer interface modules used to collect the data are functionally described. Automatic synchronization to the Omega transmission pattern is described as an example of the algorithms developed using this data base.
Fenik, Victor B; Kubin, Leszek
2009-03-01
Carbachol, a cholinergic agonist, and GABA(A) receptor antagonists injected into the pontine dorsomedial reticular formation can trigger rapid eye movement (REM) sleep-like state. Data suggest that GABAergic and cholinergic effects interact to produce this effect but the sites where this occurs have not been delineated. In urethane-anesthetized rats, in which carbachol effectively elicits REM sleep-like episodes (REMSLE), we tested the ability of 10 nL microinjections of carbachol (10 mM) and bicuculline (0.5 or 2 mM) to elicit REMSLE at 47 sites located within the dorsal pontine reticular formation at the levels -8.00 to -10.80 from bregma (B) (Paxinos and Watson, The Rat Brain in Stereotaxic Coordinates, Academic Press, San Diego, 1997). At rostral levels, most carbachol and some bicuculline injections elicited REMSLE with latencies that gradually decreased from 242 to 12 s for carbachol and from 908 to 38 s for bicuculline for more caudal injection sites. As the latencies decreased, the durations of bicuculline-elicited REMSLE increased from 104 s to over 38 min, and the effect was dose dependent, whereas the duration of carbachol-elicited REMSLE changed little (104-354 s). Plots of REMSLE latency versus the antero-posterior coordinates revealed that both drugs were maximally effective near B-8.80. At levels caudal to B-8.80, carbachol was effective at few sites, whereas bicuculline elicited REMSLE to at least B-9.30 level. Thus, the bicuculline-sensitive sites extended further caudally than those for carbachol and antagonism of GABA(A) receptors both triggered REMSLE and controlled their duration, whereas carbachol effects on REMSLE duration were small or limited by its concurrent REMSLE-opposing actions.
FENIK, VICTOR B.; KUBIN, LESZEK
2017-01-01
Carbachol, a cholinergic agonist, and GABA(A) receptor antagonists injected into the pontine dorsomedial reticular formation can trigger rapid eye movement (REM) sleep-like state. Data suggest that GABAergic and cholinergic effects interact to produce this effect but the sites where this occurs have not been delineated. In urethane-anesthetized rats, in which carbachol effectively elicits REM sleep-like episodes (REMSLE), we tested the ability of 10 nL microinjections of carbachol (10 mM) and bicuculline (0.5 or 2 mM) to elicit REMSLE at 47 sites located within the dorsal pontine reticular formation at the levels −8.00 to −10.80 from bregma (B) (Paxinos and Watson, The Rat Brain in Stereotaxic Coordinates, Academic Press, San Diego, 1997). At rostral levels, most carbachol and some bicuculline injections elicited REMSLE with latencies that gradually decreased from 242 to 12 s for carbachol and from 908 to 38 s for bicuculline for more caudal injection sites. As the latencies decreased, the durations of bicuculline-elicited REMSLE increased from 104 s to over 38 min, and the effect was dose dependent, whereas the duration of carbachol-elicited REMSLE changed little (104–354 s). Plots of REMSLE latency versus the antero-posterior coordinates revealed that both drugs were maximally effective near B-8.80. At levels caudal to B-8.80, carbachol was effective at few sites, whereas bicuculline elicited REMSLE to at least B-9.30 level. Thus, the bicuculline-sensitive sites extended further caudally than those for carbachol and antagonism of GABA(A) receptors both triggered REMSLE and controlled their duration, whereas carbachol effects on REMSLE duration were small or limited by its concurrent REMSLE-opposing actions. PMID:19021854
Robust media processing on programmable power-constrained systems
NASA Astrophysics Data System (ADS)
McVeigh, Jeff
2005-03-01
To achieve consumer-level quality, media systems must process continuous streams of audio and video data while maintaining exacting tolerances on sampling rate, jitter, synchronization, and latency. While it is relatively straightforward to design fixed-function hardware implementations to satisfy worst-case conditions, there is a growing trend to utilize programmable multi-tasking solutions for media applications. The flexibility of these systems enables support for multiple current and future media formats, which can reduce design costs and time-to-market. This paper provides practical engineering solutions to achieve robust media processing on such systems, with specific attention given to power-constrained platforms. The techniques covered in this article utilize the fundamental concepts of algorithm and software optimization, software/hardware partitioning, stream buffering, hierarchical prioritization, and system resource and power management. A novel enhancement to dynamically adjust processor voltage and frequency based on buffer fullness to reduce system power consumption is examined in detail. The application of these techniques is provided in a case study of a portable video player implementation based on a general-purpose processor running a non real-time operating system that achieves robust playback of synchronized H.264 video and MP3 audio from local storage and streaming over 802.11.
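The idea of adjusting processor frequency from buffer fullness can be sketched as a simple threshold policy. Everything below is an assumption made for illustration: the abstract does not give the actual control law, and the function name `next_level`, the thresholds, and the notion of discrete performance levels are hypothetical. The sketch assumes the monitored buffer holds decoded output, so a nearly empty buffer means the decoder risks an underrun and should speed up.

```python
def next_level(level, fullness, n_levels, low=0.25, high=0.75):
    """Pick the next performance level (0 = slowest) from buffer fullness.

    fullness is the fraction of the decoded-output buffer in use (0.0-1.0).
    The low/high thresholds are illustrative values, not taken from the paper.
    """
    if not 0.0 <= fullness <= 1.0:
        raise ValueError("fullness must be in [0, 1]")
    if not 0 <= level < n_levels:
        raise ValueError("level out of range")
    if fullness < low:
        # Buffer draining: step up to avoid an underrun.
        return min(level + 1, n_levels - 1)
    if fullness > high:
        # Plenty of decoded data queued: step down to save power.
        return max(level - 1, 0)
    # Inside the dead band: hold the current level to avoid oscillation.
    return level
```

The dead band between the two thresholds is a design choice that keeps the policy from toggling levels on every small change in fullness; a real system would also have to account for the cost and latency of each voltage/frequency transition.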
Accelerating Climate and Weather Simulations through Hybrid Computing
NASA Technical Reports Server (NTRS)
Zhou, Shujia; Cruz, Carlos; Duffy, Daniel; Tucker, Robert; Purcell, Mark
2011-01-01
Unconventional multi- and many-core processors (e.g. IBM (R) Cell B.E.(TM) and NVIDIA (R) GPU) have emerged as effective accelerators in trial climate and weather simulations. Yet these climate and weather models typically run on parallel computers with conventional processors (e.g. Intel, AMD, and IBM) using Message Passing Interface. To address challenges involved in efficiently and easily connecting accelerators to parallel computers, we investigated using IBM's Dynamic Application Virtualization (TM) (IBM DAV) software in a prototype hybrid computing system with representative climate and weather model components. The hybrid system comprises two Intel blades and two IBM QS22 Cell B.E. blades, connected with both InfiniBand(R) (IB) and 1-Gigabit Ethernet. The system significantly accelerates a solar radiation model component by offloading compute-intensive calculations to the Cell blades. Systematic tests show that IBM DAV can seamlessly offload compute-intensive calculations from Intel blades to Cell B.E. blades in a scalable, load-balanced manner. However, noticeable communication overhead was observed, mainly due to IP over the IB protocol. Full utilization of IB Sockets Direct Protocol and the lower latency production version of IBM DAV will reduce this overhead.
Massively parallel processor networks with optical express channels
Deri, R.J.; Brooks, E.D. III; Haigh, R.E.; DeGroot, A.J.
1999-08-24
An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination. 3 figs.
Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation
Bender, Michael A.; Berry, Jonathan W.; Hammond, Simon D.; ...
2017-01-03
A challenge in computer architecture is that processors often cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. With this motivation and the realization that traditional architectures (with all DRAM reachable only via bus) are insufficient to feed groups of modern processing units, vendors have introduced a variety of non-DDR 3D memory technologies (Hybrid Memory Cube (HMC), Wide I/O 2, High Bandwidth Memory (HBM)). These offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. We will call these solutions “near-memory,” and if user-addressable, “scratchpad.” High-performance systems on the market now offer two levels of main memory: near-memory on package and traditional DRAM further away. In the near term we expect the latencies of near-memory and DRAM to be similar. Here, it is natural to think of near-memory as another module on the DRAM level of the memory hierarchy. Vendors are expected to offer modes in which the near memory is used as cache, but we believe that this will be inefficient.
Massively parallel processor networks with optical express channels
Deri, Robert J.; Brooks, III, Eugene D.; Haigh, Ronald E.; DeGroot, Anthony J.
1999-01-01
An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination.
A Performance Prediction Model for a Fault-Tolerant Computer During Recovery and Restoration
NASA Technical Reports Server (NTRS)
Obando, Rodrigo A.; Stoughton, John W.
1995-01-01
The modeling and design of a fault-tolerant multiprocessor system is addressed. Of interest is the behavior of the system during recovery and restoration after a fault has occurred. The multiprocessor systems are based on the Algorithm to Architecture Mapping Model (ATAMM) and the fault considered is the death of a processor. The developed model is useful in the determination of performance bounds of the system during recovery and restoration. The performance bounds include time to recover from the fault, time to restore the system, and determination of any permanent delay in the input to output latency after the system has regained steady state. Implementation of an ATAMM based computer was developed for a four-processor generic VHSIC spaceborne computer (GVSC) as the target system. A simulation of the GVSC was also written on the code used in the ATAMM Multicomputer Operating System (AMOS). The simulation is used to verify the new model for tracking the propagation of the delay through the system and predicting the behavior of the transient state of recovery and restoration. The model is shown to accurately predict the transient behavior of an ATAMM based multicomputer during recovery and restoration.
Baseband processor development for the Advanced Communications Satellite Program
NASA Technical Reports Server (NTRS)
Moat, D.; Sabourin, D.; Stilwell, J.; Mccallister, R.; Borota, M.
1982-01-01
An onboard-baseband-processor concept for a satellite-switched time-division-multiple-access (SS-TDMA) communication system was developed for NASA Lewis Research Center. The baseband processor routes and controls traffic on an individual message basis while providing significant advantages in improved link margins and system flexibility. Key technology developments required to prove the flight readiness of the baseband-processor design are being verified in a baseband-processor proof-of-concept model. These technology developments include serial MSK modems, a Clos-type baseband routing switch, a single-chip CMOS maximum-likelihood convolutional decoder, and custom LSI implementation of high-speed, low-power ECL building blocks.
Domeier, Timothy L; Maxwell, Joshua T; Blatter, Lothar A
2012-01-01
β-Adrenergic signalling induces positive inotropic effects on the heart that associate with pro-arrhythmic spontaneous Ca2+ waves. A threshold level of sarcoplasmic reticulum (SR) Ca2+ ([Ca2+]SR) is necessary to trigger Ca2+ waves, and whether the increased incidence of Ca2+ waves during β-adrenergic stimulation is due to an alteration in this threshold remains controversial. Using the low-affinity Ca2+ indicator fluo-5N entrapped within the SR of rabbit ventricular myocytes, we addressed this controversy by directly monitoring [Ca2+]SR and Ca2+ waves during β-adrenergic stimulation. Electrical pacing in elevated extracellular Ca2+ ([Ca2+]o = 7 mM) was used to increase [Ca2+]SR to the threshold where Ca2+ waves were consistently observed. The β-adrenergic agonist isoproterenol (ISO; 1 μM) increased [Ca2+]SR well above the control threshold and consistently triggered Ca2+ waves. However, when [Ca2+]SR was subsequently lowered in the presence of ISO (by lowering [Ca2+]o to 1 mM and partially inhibiting sarcoplasmic/endoplasmic reticulum calcium ATPase with cyclopiazonic acid or thapsigargin), Ca2+ waves ceased to occur at a [Ca2+]SR that was higher than the control threshold. Furthermore, for a set [Ca2+]SR level the refractoriness of wave occurrence (Ca2+ wave latency) was prolonged during β-adrenergic stimulation, and was highly dependent on the extent that [Ca2+]SR exceeded the wave threshold. These data show that acute β-adrenergic stimulation increases the [Ca2+]SR threshold for Ca2+ waves, and therefore the primary cause of Ca2+ waves is the robust increase in [Ca2+]SR above this higher threshold level. Elevation of the [Ca2+]SR wave threshold and prolongation of wave latency represent potentially protective mechanisms against pro-arrhythmogenic Ca2+ release during β-adrenergic stimulation. PMID:22988136
Laterodorsal nucleus of the thalamus: A processor of somatosensory inputs.
Bezdudnaya, Tatiana; Keller, Asaf
2008-04-20
The laterodorsal (LD) nucleus of the thalamus has been considered a "higher order" nucleus that provides inputs to limbic cortical areas. Although its functions are largely unknown, it is often considered to be involved in spatial learning and memory. Here we provide evidence that LD is part of a hitherto unknown pathway for processing somatosensory information. Juxtacellular and extracellular recordings from LD neurons reveal that they respond to vibrissa stimulation with short latency (median = 7 ms) and large magnitude responses (median = 1.2 spikes/stimulus). Most neurons (62%) had large receptive fields, responding to six and more individual vibrissae. Electrical stimulation of the trigeminal nucleus interpolaris (SpVi) evoked short latency responses (median = 3.8 ms) in vibrissa-responsive LD neurons. Labeling produced by anterograde and retrograde neuroanatomical tracers confirmed that LD neurons receive direct inputs from SpVi. Electrophysiological and neuroanatomical analyses revealed also that LD projects upon the cingulate and retrosplenial cortex, but has only sparse projections to the barrel cortex. These findings suggest that LD is part of a novel processing stream involved in spatial orientation and learning related to somatosensory cues. (c) 2008 Wiley-Liss, Inc.
HodDB: Design and Analysis of a Query Processor for Brick.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fierro, Gabriel; Culler, David
Brick is a recently proposed metadata schema and ontology for describing building components and the relationships between them. It represents buildings as directed labeled graphs using the RDF data model. Using the SPARQL query language, building-agnostic applications query a Brick graph to discover the set of resources and relationships they require to operate. Latency-sensitive applications, such as user interfaces, demand response and model predictive control, require fast queries, conventionally less than 100 ms. We benchmark a set of popular open-source and commercial SPARQL databases against three real Brick models using seven application queries and find that none of them meet this performance target. This lack of performance can be attributed to design decisions that optimize for queries over large graphs consisting of billions of triples, but give poor spatial locality and join performance on the small dense graphs typical of Brick. We present the design and evaluation of HodDB, an RDF/SPARQL database for Brick built over a node-based index structure. HodDB performs Brick queries 3-700x faster than leading SPARQL databases and consistently meets the 100 ms threshold, enabling the portability of important latency-sensitive building applications.
HyspIRI Low Latency Concept and Benchmarks
NASA Technical Reports Server (NTRS)
Mandl, Dan
2010-01-01
Topics include HyspIRI low latency data ops concept, HyspIRI data flow, ongoing efforts, experiment with Web Coverage Processing Service (WCPS) approach to injecting new algorithms into SensorWeb, low fidelity HyspIRI IPM testbed, compute cloud testbed, open cloud testbed environment, Global Lambda Integrated Facility (GLIF) and OCC collaboration with Starlight, delay tolerant network (DTN) protocol benchmarking, and EO-1 configuration for preliminary DTN prototype.
Auditory Middle Latency Response and Phonological Awareness in Students with Learning Disabilities
Romero, Ana Carla Leite; Funayama, Carolina Araújo Rodrigues; Capellini, Simone Aparecida; Frizzo, Ana Claudia Figueiredo
2015-01-01
Introduction Behavioral tests of auditory processing have been applied in schools and highlight the association between phonological awareness abilities and auditory processing, confirming that low performance on phonological awareness tests may be due to low performance on auditory processing tests. Objective To characterize the auditory middle latency response and the phonological awareness tests and to investigate correlations between responses in a group of children with learning disorders. Methods The study included 25 students with learning disabilities. Phonological awareness and auditory middle latency response were tested with electrodes placed on the left and right hemispheres. The correlation between the measurements was assessed using the Spearman rank correlation coefficient. Results There is some correlation between the tests, especially between the Pa component and syllabic awareness, where moderate negative correlation is observed. Conclusion In this study, when phonological awareness subtests were performed, specifically phonemic awareness, the students showed a low score for the age group, and in the objective examination, prolonged Pa latency in the contralateral pathway was observed. Negative weak to moderate correlation for Pa wave latency was observed, as was positive weak correlation for Na-Pa amplitude. PMID:26491479
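For readers unfamiliar with the statistic used in the Methods, the Spearman rank correlation is the Pearson correlation computed on the ranks of the two measurements. The sketch below is a plain-Python illustration with tie handling by average ranks; the function names and the sample numbers are made up for the example and are not data from the study.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(vx * vy)

# Made-up illustration: longer latency paired with lower awareness score.
latency_ms = [28.0, 30.5, 33.0, 36.2]
score = [10, 8, 5, 1]
rho = spearman(latency_ms, score)
```

A negative coefficient, as reported for Pa latency and syllabic awareness, indicates that higher values of one measure tend to accompany lower values of the other; the coefficient depends only on rank order, not on the measurement units.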
Li, Yan; Alam, Monzurul; Guo, Shanshan; Ting, K H; He, Jufang
2014-07-03
Lower motor neurons in the spinal cord lose supraspinal inputs after complete spinal cord injury, leading to a loss of volitional control below the injury site. Extensive locomotor training with spinal cord stimulation can restore locomotion function after spinal cord injury in humans and animals. However, this locomotion is non-voluntary, meaning that subjects cannot control stimulation via their natural "intent". A recent study demonstrated an advanced system that triggers a stimulator using forelimb stepping electromyographic patterns to restore quadrupedal walking in rats with spinal cord transection. However, this indirect source of "intent" may mean that other non-stepping forelimb activities may false-trigger the spinal stimulator and thus produce unwanted hindlimb movements. We hypothesized that there are distinguishable neural activities in the primary motor cortex during treadmill walking, even after low-thoracic spinal transection in adult guinea pigs. We developed an electronic spinal bridge, called "Motolink", which detects these neural patterns and triggers a "spinal" stimulator for hindlimb movement. This hardware can be head-mounted or carried in a backpack. Neural data were processed in real-time and transmitted to a computer for analysis by an embedded processor. Off-line neural spike analysis was conducted to calculate and preset the spike threshold for "Motolink" hardware. We identified correlated activities of primary motor cortex neurons during treadmill walking of guinea pigs with spinal cord transection. These neural activities were used to predict the kinematic states of the animals. The appropriate selection of spike threshold value enabled the "Motolink" system to detect the neural "intent" of walking, which triggered electrical stimulation of the spinal cord and induced stepping-like hindlimb movements. We present a direct cortical "intent"-driven electronic spinal bridge to restore hindlimb locomotion after complete spinal cord injury.
Implementing An Image Understanding System Architecture Using Pipe
NASA Astrophysics Data System (ADS)
Luck, Randall L.
1988-03-01
This paper will describe PIPE and how it can be used to implement an image understanding system. Image understanding is the process of developing a description of an image in order to make decisions about its contents. The tasks of image understanding are generally split into low level vision and high level vision. Low level vision is performed by PIPE, a high performance parallel processor with an architecture specifically designed for processing video images at up to 60 fields per second. High level vision is performed by one of several types of serial or parallel computers, depending on the application. An additional processor called ISMAP performs the conversion from iconic image space to symbolic feature space. ISMAP plugs into one of PIPE's slots and is memory mapped into the high level processor. Thus it forms the high speed link between the low and high level vision processors. The mechanisms for bottom-up, data driven processing and top-down, model driven processing are discussed.
Radiation Hardened Electronics for Extreme Environments
NASA Technical Reports Server (NTRS)
Keys, Andrew S.; Watson, Michael D.
2007-01-01
The Radiation Hardened Electronics for Space Environments (RHESE) project consists of a series of tasks designed to develop and mature a broad spectrum of radiation hardened and low temperature electronics technologies. Three approaches are being taken to address radiation hardening: improved material hardness, design techniques to improve radiation tolerance, and software methods to improve radiation tolerance. Within these approaches various technology products are being addressed including Field Programmable Gate Arrays (FPGA), Field Programmable Analog Arrays (FPAA), MEMS Serial Processors, Reconfigurable Processors, and Parallel Processors. In addition to radiation hardening, low temperature extremes are addressed with a focus on material and design approaches.
Design of a "Digital Atlas Vme Electronics" (DAVE) module
NASA Astrophysics Data System (ADS)
Goodrick, M.; Robinson, D.; Shaw, R.; Postranecky, M.; Warren, M.
2012-01-01
ATLAS-SCT has developed a new ATLAS trigger card, "Digital Atlas Vme Electronics" ("DAVE"). The unit is designed to provide a versatile array of interface and logic resources, including a large FPGA. It interfaces to both VME bus and USB hosts. DAVE aims to provide exact ATLAS CTP (ATLAS Central Trigger Processor) functionality, with random trigger, simple and complex deadtime, ECR (Event Counter Reset), BCR (Bunch Counter Reset) etc. being generated to give exactly the same conditions in standalone running as experienced in combined runs. DAVE provides additional hardware and a large amount of free firmware resource to allow users to add or change functionality. The combination of the large number of individually programmable inputs and outputs in various formats, with very large external RAM and other components all connected to the FPGA, also makes DAVE a powerful and versatile FPGA utility card.
Overview and Evaluation of Bluetooth Low Energy: An Emerging Low-Power Wireless Technology
Gomez, Carles; Oller, Joaquim; Paradells, Josep
2012-01-01
Bluetooth Low Energy (BLE) is an emerging low-power wireless technology developed for short-range control and monitoring applications that is expected to be incorporated into billions of devices in the next few years. This paper describes the main features of BLE, explores its potential applications, and investigates the impact of various critical parameters on its performance. BLE represents a trade-off between energy consumption, latency, piconet size, and throughput that mainly depends on parameters such as connInterval and connSlaveLatency. According to theoretical results, the lifetime of a BLE device powered by a coin cell battery ranges between 2.0 days and 14.1 years. The number of simultaneous slaves per master ranges between 2 and 5,917. The minimum latency for a master to obtain a sensor reading is 676 μs, although simulation results show that, under high bit error rate, average latency increases by up to three orders of magnitude. The paper provides experimental results that complement the theoretical and simulation findings, and indicates implementation constraints that may reduce BLE performance.
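The role of connInterval and connSlaveLatency in the energy/latency trade-off can be illustrated with a small calculation. The sketch assumes the usual BLE behaviour in which a slave may skip up to connSlaveLatency consecutive connection events, so the longest gap between events it attends is connInterval × (1 + connSlaveLatency). The function name and the example values are hypothetical, and the sketch does not model the paper's lifetime or throughput results.

```python
def max_slave_wake_interval_ms(conn_interval_ms, conn_slave_latency):
    """Longest gap (ms) between connection events the slave must attend.

    conn_interval_ms: connection interval in milliseconds.
    conn_slave_latency: number of consecutive events the slave may skip.
    """
    if conn_interval_ms <= 0:
        raise ValueError("connection interval must be positive")
    if conn_slave_latency < 0:
        raise ValueError("slave latency cannot be negative")
    return conn_interval_ms * (1 + conn_slave_latency)

# Example values chosen for illustration only.
always_listening = max_slave_wake_interval_ms(7.5, 0)
mostly_sleeping = max_slave_wake_interval_ms(100.0, 4)
```

A larger product lets the slave radio sleep longer, which favours battery lifetime, while the master may wait correspondingly longer before the slave responds; this is the trade-off the abstract attributes to these two parameters.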
O’NEILL, WILLIAM E.; BRIMIJOIN, W. OWEN
2014-01-01
Mustached bats emit echolocation and communication calls containing both constant frequency (CF) and frequency-modulated (FM) components. Previously we found that 86% of neurons in the ventral division of the external nucleus of the inferior colliculus (ICXv) were directionally selective for linear FM sweeps and that selectivity was dependent on sweep rate. The ICXv projects to the suprageniculate nucleus (Sg) of the medial geniculate body. In this study, we isolated 37 single units in the Sg and measured their responses to best excitatory frequency (BEF) tones and linear 12-kHz upward and downward FM sweeps centered on the BEF. Sweeps were presented at durations of 30, 12, and 4 ms, yielding modulation rates of 400, 1,000, and 3,000 kHz/s. Spike count versus level functions were obtained at each modulation rate and compared with BEF controls. Sg units responded well to both tones and FM sweeps. BEFs clustered at 58 kHz, corresponding to the dominant CF component of the sonar signal. Spike count functions for both tones and sweeps were predominantly non-monotonic. FM directional selectivity was significant in 53–78% of the units, depending on modulation rate and level. Units were classified as up-selective (52%), down-selective (24%), or bi-directional (non-selective, 16%); a few units (8%) showed preferences that were either rate- or level-dependent. Most units showed consistent directional preferences at all SPLs and modulation rates tested, but typically showed stronger selectivity at lower sweep rates. Directional preferences were attributable to suppression of activity by sweeps in the non-preferred direction (~80% of units) and/or facilitation by sweeps in the preferred direction (~20–30%). Latencies for BEF tones ranged from 4.9 to 25.7 ms. Latencies for FM sweeps typically varied linearly with sweep duration. Most FM latency-duration functions had slopes ranging from 0.4 to 0.6, suggesting that the responses were triggered by the BEF. 
Latencies for BEF tones and FM sweeps were significantly correlated in most Sg units, i.e., the response to FM was temporally related to the occurrence of the BEF in the FM sweep. FM latency declined relative to BEF latency as modulation rate increased, suggesting that at higher rates response is triggered by frequencies in the sweep preceding the BEF. We conclude that Sg and ICXv units have similar, though not identical, response properties. Sg units are predominantly upsweep selective and could respond to either or both the CF and FM components in biosonar signals in a number of echolocation scenarios, as well as to a variety of communication sounds. PMID:12091543
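The modulation rates quoted above follow directly from the 12-kHz sweep width and the three durations. The short check below reproduces that arithmetic. It also computes when a linear sweep centered on the BEF crosses the BEF, which is half the sweep duration and is consistent with latency-duration slopes near 0.5. The function names are illustrative.

```python
SWEEP_WIDTH_KHZ = 12.0
DURATIONS_MS = (30.0, 12.0, 4.0)


def modulation_rate_khz_per_s(width_khz: float, duration_ms: float) -> float:
    """Sweep rate in kHz/s for a linear sweep of the given width and duration."""
    return width_khz * 1000.0 / duration_ms


def bef_crossing_time_ms(duration_ms: float) -> float:
    """Time after sweep onset at which a linear sweep centered on the BEF
    passes through the BEF (the midpoint of the sweep)."""
    return duration_ms / 2.0


rates = [modulation_rate_khz_per_s(SWEEP_WIDTH_KHZ, d) for d in DURATIONS_MS]
crossings = [bef_crossing_time_ms(d) for d in DURATIONS_MS]
```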
2005 6th Annual Science and Engineering Technology Conference
2005-04-21
BioFAC VBAIDS Hybrid: PCR/Immuno Fast PCR Fast Immunoassay Mass Spec (Pyrolysis) SIBS UV -LIF IR Fluorochrome Charge Detect. BioCADS Trigger Advanced...Weights Beam forming Signal Processing mapped to GPU architecture Vector Processor STAP (STAP-BOY) GaN High Frequency Transistor (WBG-RF) UV Laser...Service anti- counterfeiting • Embedded security strips Technology Limitations and Barriers • Training and cost (training intensive) Land Borders North Land
SABRE: a bio-inspired fault-tolerant electronic architecture.
Bremner, P; Liu, Y; Samie, M; Dragffy, G; Pipe, A G; Tempesti, G; Timmis, J; Tyrrell, A M
2013-03-01
As electronic devices become increasingly complex, ensuring their reliable, fault-free operation is becoming correspondingly more challenging. It can be observed that, in spite of their complexity, biological systems are highly reliable and fault tolerant. Hence, we are motivated to take inspiration from biological systems in the design of electronic ones. In SABRE (self-healing cellular architectures for biologically inspired highly reliable electronic systems), we have designed a bio-inspired fault-tolerant hierarchical architecture for this purpose. As in biology, the foundation for the whole system is cellular in nature, with each cell able to detect faults in its operation and trigger intra-cellular or extra-cellular repair as required. At the next level in the hierarchy, arrays of cells are configured and controlled as function units in a transport triggered architecture (TTA), which is able to perform partial-dynamic reconfiguration to rectify problems that cannot be solved at the cellular level. Each TTA is, in turn, part of a larger multi-processor system which employs coarser grain reconfiguration to tolerate faults that cause a processor to fail. In this paper, we describe the details of operation of each layer of the SABRE hierarchy, and how these layers interact to provide a high systemic level of fault tolerance.
A macrochip interconnection network enabled by silicon nanophotonic devices.
Zheng, Xuezhe; Cunningham, John E; Koka, Pranay; Schwetman, Herb; Lexau, Jon; Ho, Ron; Shubin, Ivan; Krishnamoorthy, Ashok V; Yao, Jin; Mekis, Attila; Pinguet, Thierry
2010-03-01
We present an advanced wavelength-division multiplexing point-to-point network enabled by silicon nanophotonic devices. This network offers strictly non-blocking all-to-all connectivity while maximizing bisection bandwidth, making it ideal for multi-core and multi-processor interconnections. We introduce one of the key components, the nanophotonic grating coupler, and discuss, for the first time, how this device can be useful for practical implementations of the wavelength-division multiplexing network using optical proximity communications. Finite difference time-domain simulation of the nanophotonic grating coupler device indicates that it can be made compact (20 μm × 50 μm), low loss (3.8 dB), and broadband (100 nm). These couplers require subwavelength material modulation at the nanoscale to achieve the desired functionality. We show that optical proximity communication provides unmatched optical I/O bandwidth density to electrical chips, which enables the application of the wavelength-division multiplexing point-to-point network in a macrochip with unprecedented bandwidth density. The envisioned physical implementation is discussed. The benefits of such an interconnect network include a 5-6x improvement in latency when compared to a purely electronic implementation. Performance analysis shows that the wavelength-division multiplexing point-to-point network offers better overall performance than other optical network architectures.
Digital Plasma Control System for Alcator C-Mod
NASA Astrophysics Data System (ADS)
Ferrara, M.; Wolfe, S.; Stillerman, J.; Fredian, T.; Hutchinson, I.
2004-11-01
A digital plasma control system (DPCS) has been designed to replace the present C-Mod system, which is based on a hybrid analog-digital computer. The initial implementation of DPCS comprises two 64 channel, 16 bit, low-latency cPCI digitizers, each with 16 analog outputs, controlled by a rack-mounted single-processor Linux server, which also serves as the compute engine. A prototype system employing three older 32 channel digitizers was tested during the 2003-04 campaign. The hybrid's linear PID feedback system was emulated by IDL code executing a synchronous loop, using the same target waveforms and control parameters. Reliable real-time operation was accomplished under a standard Linux OS (RH9) by locking memory and disabling interrupts during the plasma pulse. The DPCS-computed outputs agreed to within a few percent with those produced by the hybrid system, except for discrepancies due to offsets and non-ideal behavior of the hybrid circuitry. The system operated reliably, with no sample loss, at more than twice the 10 kHz design specification, providing extra time for implementing more advanced control algorithms. The code is fault-tolerant and produces consistent output waveforms even with 10% sample loss.
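A synchronous PID loop of the kind described can be sketched as follows. This is a generic textbook discrete PID update, not the IDL code used on C-Mod. The class name, gains, and the per-cycle budget helper are hypothetical and serve only to show what has to fit inside one sample period.

```python
from dataclasses import dataclass


@dataclass
class DiscretePID:
    """Textbook discrete PID controller, updated once per sample."""
    kp: float
    ki: float
    kd: float
    dt: float               # sample period in seconds
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, target: float, measured: float) -> float:
        """Return the control output for one sample of the synchronous loop."""
        error = target - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def cycle_budget_us(loop_rate_hz: float) -> float:
    """Time available per loop iteration, in microseconds."""
    return 1e6 / loop_rate_hz
```

At the 10 kHz design rate each iteration has 100 μs for acquisition, computation, and output, so running at more than twice that rate leaves more than half of the design budget free.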
[NARCOLEPSY WITH CATAPLEXY: TYPE 1 NARCOLEPSY].
Dauvilliers, Yves; Lopez, Régis
2016-06-01
Narcolepsy with cataplexy, or narcolepsy type 1, is a rare, disabling sleep disorder, with a prevalence of 20 to 30 per 100,000. Its onset peaks in the second decade. The main features are excessive daytime sleepiness and cataplexy, or sudden loss of muscle tone triggered by emotional situations. Other less consistent symptoms include hypnagogic hallucinations, sleep paralysis, disturbed nighttime sleep, and weight gain. Narcolepsy with cataplexy remains a clinical diagnosis, but nighttime and daytime polysomnography (multiple sleep latency tests) are useful to document mean sleep latency below 8 min and at least two sleep-onset REM periods. HLA typing shows an association with HLA DQB1*0602 in more than 92% of cases but was not included in the new diagnostic criteria. In contrast, a low hypocretin-1/orexin-A level (values below 110 pg/mL) in the cerebrospinal fluid is highly specific for narcolepsy with cataplexy and was included in the recent diagnostic criteria for narcolepsy. The deficiency of the hypocretin system is well established in human narcoleptics, with a reduction of cerebrospinal fluid hypocretin levels related to an early loss of hypocretin neurons. The cause of human narcolepsy remains unknown; however, an autoimmune process is most probably acting on a strong genetic background, with environmental factors such as streptococcal infections and the H1N1 AS03-adjuvanted vaccine Pandemrix.
A light hydrocarbon fuel processor producing high-purity hydrogen
NASA Astrophysics Data System (ADS)
Löffler, Daniel G.; Taylor, Kyle; Mason, Dylan
This paper discusses the design process and presents performance data for a dual fuel (natural gas and LPG) fuel processor for PEM fuel cells delivering between 2 and 8 kW electric power in stationary applications. The fuel processor resulted from a series of design compromises made to address different design constraints. First, the product quality was selected; then, the unit operations needed to achieve that product quality were chosen from the pool of available technologies. Next, the specific equipment needed for each unit operation was selected. Finally, the unit operations were thermally integrated to achieve high thermal efficiency. Early in the design process, it was decided that the fuel processor would deliver high-purity hydrogen. Hydrogen can be separated from other gases by pressure-driven processes based on either selective adsorption or permeation. The pressure requirement made steam reforming (SR) the preferred reforming technology because it does not require compression of combustion air; therefore, steam reforming is more efficient in a high-pressure fuel processor than alternative technologies like autothermal reforming (ATR) or partial oxidation (POX), where the combustion occurs at the pressure of the process stream. A low-temperature pre-reformer reactor is needed upstream of a steam reformer to suppress coke formation; yet, low temperatures facilitate the formation of metal sulfides that deactivate the catalyst. For this reason, a desulfurization unit is needed upstream of the pre-reformer. Hydrogen separation was implemented using a palladium alloy membrane. Packed beds were chosen for the pre-reformer and reformer reactors primarily because of their low cost, relatively simple operation and low maintenance. Commercial, off-the-shelf balance of plant (BOP) components (pumps, valves, and heat exchangers) were used to integrate the unit operations. The fuel processor delivers up to 100 slm hydrogen >99.9% pure with <1 ppm CO, <3 ppm CO2.
The thermal efficiency is better than 67% operating at full load. This fuel processor has been integrated with a 5-kW fuel cell producing electricity and hot water.
In-beam experience with a highly granular DAQ and control network: TrbNet
NASA Astrophysics Data System (ADS)
Michel, J.; Korcyl, G.; Maier, L.; Traxler, M.
2013-02-01
Virtually all Data Acquisition Systems (DAQ) for nuclear and particle physics experiments use a large number of Field Programmable Gate Arrays (FPGAs) for data transport and more complex tasks such as pattern recognition and data reduction. All these FPGAs in a large system have to share a common state like a trigger number or an epoch counter to keep the system synchronized for a consistent event/epoch building. Additionally, the collected data has to be transported with high bandwidth, optionally via the ubiquitous Ethernet protocol. Furthermore, the FPGAs' internal states and configuration memories have to be accessed for control and monitoring purposes. Another requirement for a modern DAQ-network is the fault-tolerance for intermittent data errors in the form of automatic retransmission of faulty data. As FPGAs suffer from Single Event Effects when exposed to ionizing particles, the system has to deal with failing FPGAs. The TrbNet protocol was developed taking all these requirements into account. Three virtual channels are merged on one physical medium: The trigger/epoch information is transported with the highest priority. The data channel is second in the priority order, while the control channel is the last. Combined with a small frame size of 80 bit this guarantees a low latency data transport: A system with 100 front-ends can be built with a one-way latency of 2.2 μs. The TrbNet-protocol was implemented in each of the 550 FPGAs of the HADES upgrade project and has been successfully used during the Au+Au campaign in April 2012. With 2×10^6/s Au-ions and 3% interaction ratio the accepted trigger rate is 10 kHz while data is written to storage with 150 MBytes/s. Errors are reliably mitigated via the implemented retransmission of packets and auto-shut-down of individual links. TrbNet was also used for full monitoring of the FEE status. The network stack is written in VHDL and was successfully deployed on various Lattice and Xilinx devices. 
The TrbNet is also used in other experiments, like systems for detector and electronics development for PANDA and CBM at FAIR. As a platform for such set-ups, e.g. for high-channel time measurement with 15 ps resolution, a generic FPGA platform (TRB3) has been developed.
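The strict priority ordering of the three virtual channels can be illustrated with a small software model. This is only a sketch of fixed-priority arbitration over one shared medium. The real network stack is written in VHDL, and the class and method names here are hypothetical.

```python
from collections import deque

# Highest priority first, as described in the abstract.
PRIORITY_ORDER = ("trigger", "data", "control")


class VirtualChannelMux:
    """Merges three virtual channels onto one medium with fixed priorities."""

    def __init__(self):
        self.queues = {name: deque() for name in PRIORITY_ORDER}

    def enqueue(self, channel: str, frame: bytes) -> None:
        """Queue one frame on the named virtual channel."""
        self.queues[channel].append(frame)

    def next_frame(self):
        """Return (channel, frame) from the highest-priority non-empty queue,
        or None when every queue is empty."""
        for name in PRIORITY_ORDER:
            if self.queues[name]:
                return name, self.queues[name].popleft()
        return None
```

Because arbitration happens once per frame, a short frame bounds how long a newly arrived trigger frame can wait behind a lower-priority frame already on the link.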
Advanced Avionics and Processor Systems for a Flexible Space Exploration Architecture
NASA Technical Reports Server (NTRS)
Keys, Andrew S.; Adams, James H.; Smith, Leigh M.; Johnson, Michael A.; Cressler, John D.
2010-01-01
The Advanced Avionics and Processor Systems (AAPS) project, formerly known as the Radiation Hardened Electronics for Space Environments (RHESE) project, endeavors to develop advanced avionic and processor technologies anticipated to be used by NASA's currently evolving space exploration architectures. The AAPS project is a part of the Exploration Technology Development Program, which funds an entire suite of technologies that are aimed at enabling NASA's ability to explore beyond low earth orbit. NASA's Marshall Space Flight Center (MSFC) manages the AAPS project. AAPS uses a broad-scoped approach to developing avionic and processor systems. Investment areas include advanced electronic designs and technologies capable of providing environmental hardness, reconfigurable computing techniques, software tools for radiation effects assessment, and radiation environment modeling tools. Near-term emphasis within the multiple AAPS tasks focuses on developing prototype components using semiconductor processes and materials (such as Silicon-Germanium (SiGe)) to enhance a device's tolerance to radiation events and low temperature environments. As the SiGe technology will culminate in a delivered prototype this fiscal year, the project emphasis shifts its focus to developing low-power, high efficiency total processor hardening techniques. In addition to processor development, the project endeavors to demonstrate techniques applicable to reconfigurable computing and partially reconfigurable Field Programmable Gate Arrays (FPGAs). This capability enables avionic architectures the ability to develop FPGA-based, radiation tolerant processor boards that can serve in multiple physical locations throughout the spacecraft and perform multiple functions during the course of the mission. The individual tasks that comprise AAPS are diverse, yet united in the common endeavor to develop electronics capable of operating within the harsh environment of space. 
Specifically, the AAPS tasks for the Federal fiscal year of 2010 are: Silicon-Germanium (SiGe) Integrated Electronics for Extreme Environments, Modeling of Radiation Effects on Electronics, Radiation Hardened High Performance Processors (HPP), and Reconfigurable Computing.
Epigenetic dysregulation of epstein-barr virus latency and development of autoimmune disease.
Niller, Hans Helmut; Wolf, Hans; Ay, Eva; Minarovits, Janos
2011-01-01
Epstein-Barr virus (EBV) is a human herpesvirus that persists in the memory B-cells of the majority of the world population in a latent form. Primary EBV infection is asymptomatic or causes a self-limiting disease, infectious mononucleosis. Virus latency is associated with a wide variety of neoplasms whereof some occur in immune suppressed individuals. Virus production does not occur in strict latency. The expression of latent viral oncoproteins and nontranslated RNAs is under epigenetic control via DNA methylation and histone modifications that results either in a complete silencing of the EBV genome in memory B cells, or in a cell-type dependent usage of a couple of latency promoters in tumor cells, germinal center B cells and lymphoblastoid cells (LCL, transformed by EBV in vitro). Both latent and lytic EBV proteins elicit a strong immune response. In immune suppressed and infectious mononucleosis patients, an increased viral load can be detected in the blood. Enhanced lytic replication may result in new infection- and transformation-events and thus is a risk factor both for malignant transformation and the development of autoimmune diseases. An increased viral load or a changed presentation of a subset of lytic or latent EBV proteins that cross-react with cellular antigens may trigger pathogenic processes through molecular mimicry that result in multiple sclerosis (MS), systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA).
Maunsell, John H.R.
2012-01-01
Characterizing the functional connectivity between neurons is key for understanding brain function. We recorded spikes and local field potentials (LFP) from multi-electrode arrays implanted in monkey visual cortex to test the hypotheses that spikes generated outward traveling LFP waves and the strength of functional connectivity depended on stimulus contrast, as described recently. These hypotheses were proposed based on the observation that the latency of the peak negativity of the spike-triggered LFP average (STA) increased with distance between the spike and LFP electrodes, and the magnitude of the STA negativity and the distance over which it was observed decreased with increasing stimulus contrast. Detailed analysis of the shape of the STA, however, revealed contributions from two distinct sources – a transient negativity in the LFP locked to the spike (∼0 ms) that attenuated rapidly with distance, and a low frequency rhythm with peak negativity ∼25 ms after the spike that attenuated slowly with distance. The overall negative peak of the LFP, which combined both these components, shifted from ∼0 to ∼25 ms going from electrodes near the spike to electrodes far from the spike, giving an impression of a traveling wave, although the shift was fully explained by changing contributions from the two fixed components. The low frequency rhythm was attenuated during stimulus presentations, decreasing the overall magnitude of the STA. These results highlight the importance of accounting for the network activity while using STAs to determine functional connectivity. PMID:21880928
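For readers unfamiliar with the spike-triggered average (STA), the following minimal sketch shows the basic computation: LFP segments aligned to each spike are averaged sample by sample. It is a generic illustration in plain Python, not the authors' analysis pipeline. The function name, the window parameters, and the choice to skip spikes too close to the recording edges are assumptions.

```python
def spike_triggered_average(lfp, spike_indices, pre, post):
    """Average LFP segments spanning [spike - pre, spike + post] samples.

    Spikes whose window would run past either end of the recording are
    skipped (an assumption made here for simplicity).
    """
    window = pre + post + 1
    totals = [0.0] * window
    used = 0
    for s in spike_indices:
        start, stop = s - pre, s + post + 1
        if start < 0 or stop > len(lfp):
            continue
        for i, value in enumerate(lfp[start:stop]):
            totals[i] += value
        used += 1
    if used == 0:
        raise ValueError("no spike has a complete window")
    return [t / used for t in totals]
```

In this picture, a sharp spike-locked transient and a slower rhythm would both contribute to the same averaged trace, which is why the shape of the STA, and not only the latency of its peak, has to be examined.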
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-28
... proposes a pass-through reduction in the fees for connectivity to Toronto and Chicago venues as follows: (1... low latency telecommunication carriers. The Exchange is passing along the entire savings of the... passing on the reduction in low latency connectivity fees to the Toronto and Chicago venues to the members...
Mars Surface Operations via Low-Latency Telerobotics from Phobos
NASA Technical Reports Server (NTRS)
Wright, Michael; Lupisella, Mark
2016-01-01
To help assess the feasibility and timing of Low-Latency Telerobotics (LLT) operations on Mars via a Phobos telecommand base, operations concepts (ops cons) and timelines for several representative sequences for Mars surface operations have been developed. A summary of these LLT sequences and timelines will be presented, along with associated assumptions, operational considerations, and challenges.
Switch for serial or parallel communication networks
Crosette, D.B.
1994-07-19
A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination. 9 figs.
Switch for serial or parallel communication networks
Crosette, Dario B.
1994-01-01
A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination.
Klinke, R; Kral, A; Heid, S; Tillein, J; Hartmann, R
1999-09-10
In congenitally deaf cats, the central auditory system is deprived of acoustic input because of degeneration of the organ of Corti before the onset of hearing. Primary auditory afferents survive and can be stimulated electrically. By means of an intracochlear implant and an accompanying sound processor, congenitally deaf kittens were exposed to sounds and conditioned to respond to tones. After months of exposure to meaningful stimuli, the cortical activity in chronically implanted cats produced field potentials of higher amplitudes, expanded in area, developed long latency responses indicative of intracortical information processing, and showed more synaptic efficacy than in naïve, unstimulated deaf cats. The activity established by auditory experience resembles activity in hearing animals.
First experience of vectorizing electromagnetic physics models for detector simulation
NASA Astrophysics Data System (ADS)
Amadio, G.; Apostolakis, J.; Bandieramonte, M.; Bianchini, C.; Bitzes, G.; Brun, R.; Canal, P.; Carminati, F.; de Fine Licht, J.; Duhem, L.; Elvira, D.; Gheata, A.; Jun, S. Y.; Lima, G.; Novak, M.; Presbyterian, M.; Shadura, O.; Seghal, R.; Wenzel, S.
2015-12-01
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. The GeantV vector prototype for detector simulations has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth, parallelization needed to achieve optimal performance or memory access latency and speed. An additional challenge is to avoid the code duplication often inherent to supporting heterogeneous platforms. In this paper we present the first experience of vectorizing electromagnetic physics models developed for the GeantV project.
Low-Latency Lunar Surface Telerobotics from Earth-Moon Libration Points
NASA Technical Reports Server (NTRS)
Lester, Daniel; Thronson, Harley
2011-01-01
Concepts for a long-duration habitat at Earth-Moon L1 or L2 have been advanced for a number of purposes. We propose here that such a facility could also have an important role for low-latency telerobotic control of lunar surface equipment, both for lunar science and development. With distances of about 60,000 km from the lunar surface, such sites offer light-time limited two-way control latencies of order 400 ms, making telerobotic control for those sites close to real time as perceived by a human operator. We point out that even for transcontinental teleoperated surgical procedures, which require operational precision and highly dexterous manipulation, control latencies of this order are considered adequate. Terrestrial telerobots that are used routinely for mining and manufacturing also involve control latencies of order several hundred milliseconds. For this reason, an Earth-Moon L1 or L2 control node could build on the technology and experience base of commercially proven terrestrial ventures. A lunar libration-point telerobotic node could demonstrate exploration strategies that would eventually be used on Mars, and many other less hospitable destinations in the solar system. Libration-point telepresence for the Moon contrasts with lunar telerobotic control from the Earth, for which two-way control latencies are at least six times longer. For control latencies that long, telerobotic control efforts are of the "move-and-wait" variety, which is cognitively inferior to near real-time control.
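The quoted latencies can be checked with a short light-time calculation. The 60,000 km figure comes from the abstract. The mean Earth-Moon distance of 384,400 km is an assumed round value used here only for comparison, and processing or relay delays are ignored.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # exact by definition of the metre
LIBRATION_POINT_KM = 60_000.0       # distance quoted in the abstract
EARTH_MOON_KM = 384_400.0           # assumed mean Earth-Moon distance


def two_way_light_time_ms(distance_km: float) -> float:
    """Round-trip light travel time in milliseconds over the given distance."""
    return 2.0 * distance_km / SPEED_OF_LIGHT_KM_S * 1000.0


libration_ms = two_way_light_time_ms(LIBRATION_POINT_KM)
earth_ms = two_way_light_time_ms(EARTH_MOON_KM)
ratio = earth_ms / libration_ms
```

Under these assumptions the libration-point round trip is about 0.4 s and the Earth-based round trip is a little over 2.5 s, consistent with the "at least six times longer" statement.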
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gomez, Jonatan Piedra
2005-04-21
The new trigger processor, the Silicon Vertex Tracking (SVT), has dramatically improved the B physics capabilities of the upgraded CDF II Detector; for the first time in a hadron collider, the SVT has enabled the access to non-lepton-triggered B meson decays. Within the new available range of decay modes, the $B_s^0 \to D_s^- \pi^+$ signature is of paramount importance in the measurement of the $\Delta m_s$ mixing frequency. The analysis reported here is a step towards the measurement of this frequency; we had two goals: carrying out the absolute calibration of the opposite side flavor taggers, used in the $\Delta m_s$ measurement; and measuring the $B_d^0$ mixing frequency in a B → Dπ sample, establishing the feasibility of the mixing measurement in this sample whose decay-length is strongly biased by the selective SVT trigger. We analyze a total integrated luminosity of 355 pb$^{-1}$ collected with the CDF II Detector. By triggering on muons, using the conventional di-muon trigger; or displaced tracks, using the SVT trigger, we gather a sample rich in bottom and charm mesons.
Sponsel, William E.; Johnson, Susan L.; Trevino, Rick; Gonzalez, Alberto; Groth, Sylvia L.; Majcher, Carolyn; Fulton, Diane C.; Reilly, Matthew A.
2017-01-01
Purpose Both pattern electroretinography (PERG) and visual evoked potentials (VEP) can be performed using low- (15%; Lc) and high- (85%; Hc) contrast gratings that may preferentially stimulate the magno- and parvocellular pathways. We observed that among glaucomatous patients showing only one VEP latency deficit per eye, there appeared to be a very strong tendency for an Hc delay in one eye and an Lc delay in the other. Methods Diopsys NOVA-LX system was used to measure VEP Hc and Lc latency among a clinical glaucoma population to find all individuals with either a single Hc or Lc latency abnormality in each eye (group 1), or with greater than 0 and less than 4 Hc or Lc VEP latency abnormalities in the two eyes (group 2) to determine whether a significant inverse correlation existed for these values in either group. Hc and Lc PERG data were also evaluated to assess associated retinal ganglion cell responses. Results A strong inverse correlation (P = 0.0000003) was observed between the Hc and Lc VEP latency values among the 64 eyes in group 1. Group 2 provided a comparable result (n = 143; 286 eyes; P = 0.0005). PERG (n = 81; 162 eyes) also showed strong bilateral symmetry for magnitude values (P < 0.0001 for both Lc and Hc in groups 1 and 2). Conclusions Bilateral retention of both low-resolution/high-speed and high-resolution/low-speed function may persist with both eyes open despite symmetrically pathologic retinal ganglion cell PERG waveform asynchrony for Hc and Lc stimuli in the paired eyes. Translational Relevance Clinical electrophysiology strongly suggests binocular compensation for dynamic dysfunction operates under central nervous system (CNS) control in glaucoma. PMID:29134137
Mahjoub, Nada; Dhorne-Pollet, Sophie; Fuchs, Walter; Endale Ahanda, Marie-Laure; Lange, Elke; Klupp, Barbara; Arya, Anoop; Loveland, Jane E.; Lefevre, François; Mettenleiter, Thomas C.
2014-01-01
ABSTRACT The alphaherpesvirus pseudorabies virus (PrV) establishes latency primarily in neurons of trigeminal ganglia when only the transcription of the latency-associated transcript (LAT) locus is detected. Eleven microRNAs (miRNAs) cluster within the LAT, suggesting a role in establishment and/or maintenance of latency. We generated a mutant (M) PrV deleted of nine miRNA genes which displayed properties that were almost identical to those of the parental PrV wild type (WT) during propagation in vitro. Fifteen pigs were experimentally infected with either WT or M virus or were mock infected. Similar levels of virus excretion and host antibody response were observed in all infected animals. At 62 days postinfection, trigeminal ganglia were excised and profiled by deep sequencing and quantitative RT-PCR. Latency was established in all infected animals without evidence of viral reactivation, demonstrating that miRNAs are not essential for this process. Lower levels of the large latency transcript (LLT) were found in ganglia infected by M PrV than in those infected by WT PrV. All PrV miRNAs were expressed, with highest expression observed for prv-miR-LLT1, prv-miR-LLT2 (in WT ganglia), and prv-miR-LLT10 (in both WT and M ganglia). No evidence of differentially expressed porcine miRNAs was found. Fifty-four porcine genes were differentially expressed between WT, M, and control ganglia. Both viruses triggered a strong host immune response, but in M ganglia gene upregulation was prevalent. Pathway analyses indicated that several biofunctions, including those related to cell-mediated immune response and the migration of dendritic cells, were impaired in M ganglia. These findings are consistent with a function of the LAT locus in the modulation of host response for maintaining a latent state. IMPORTANCE This study provides a thorough reference on the establishment of latency by PrV in its natural host, the pig. 
Our results corroborate the evidence obtained from the study of several LAT mutants of other alphaherpesviruses encoding miRNAs from their LAT regions. Neither PrV miRNA expression nor high LLT expression levels are essential to achieve latency in trigeminal ganglia. Once latency is established by PrV, the only remarkable differences are found in the pattern of host response. This indicates that, as in herpes simplex virus, LAT functions as an immune evasion locus. PMID:25320324
Rapid prototyping and evaluation of programmable SIMD SDR processors in LISA
NASA Astrophysics Data System (ADS)
Chen, Ting; Liu, Hengzhu; Zhang, Botao; Liu, Dongpei
2013-03-01
As international wireless communication standards develop, the computational requirements for baseband signal processors increase. Time-to-market pressure makes it impossible to completely redesign processors for each evolving standard. Owing to their high flexibility and low power, software defined radio (SDR) digital signal processors have been proposed as a promising technology to replace traditional ASIC and FPGA approaches. In addition, computation-intensive functions process large amounts of data in parallel, which fosters the development of single instruction multiple data (SIMD) architectures on SDR platforms. A new way must therefore be found to prototype SDR processors efficiently. In this paper we present a bit- and cycle-accurate model of programmable SIMD SDR processors in the machine description language LISA. LISA is a language for instruction set architectures that enables rapid modeling at the architectural level. To evaluate the proposed processor, three common baseband functions (FFT, FIR digital filter, and matrix multiplication) have been mapped onto the SDR platform. Analytical results showed that the SDR processor achieved a maximum performance boost of 47.1% relative to the comparison processor.
ELIPS: Toward a Sensor Fusion Processor on a Chip
NASA Technical Reports Server (NTRS)
Daud, Taher; Stoica, Adrian; Tyson, Thomas; Li, Wei-te; Fabunmi, James
1998-01-01
The paper presents the concept and initial tests from the hardware implementation of a low-power, high-speed reconfigurable sensor fusion processor. The Extended Logic Intelligent Processing System (ELIPS) processor is developed to seamlessly combine rule-based systems, fuzzy logic, and neural networks to achieve parallel fusion of sensor data in compact low-power VLSI. The first demonstration of the ELIPS concept targets interceptor functionality; other applications, mainly in robotics and autonomous systems, are considered for the future. The main assumption behind ELIPS is that fuzzy, rule-based and neural forms of computation can serve as the main primitives of an "intelligent" processor. Thus, in the same way classic processors are designed to optimize the hardware implementation of a set of fundamental operations, ELIPS is developed as an efficient implementation of computational intelligence primitives, and relies on a set of fuzzy set, fuzzy inference and neural modules, built in programmable analog hardware. The hardware programmability allows the processor to reconfigure into different machines, taking the most efficient hardware implementation during each phase of information processing. Following software demonstrations on several sets of interceptor data, three important ELIPS building blocks (a fuzzy set preprocessor, a rule-based fuzzy system and a neural network) have been fabricated in analog VLSI hardware and demonstrated microsecond processing times.
Real-Time Data Processing in the muon system of the D0 detector.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Neeti Parashar et al.
2001-07-03
This paper presents a real-time application of 16-bit fixed-point Digital Signal Processors (DSPs) in the Muon System of the D0 detector located at the Fermilab Tevatron, presently the world's highest-energy hadron collider. As part of the Upgrade for a run beginning in the year 2000, the system is required to process data at an input event rate of 10 kHz without incurring significant deadtime in readout. The ADSP21csp01 processor has high I/O bandwidth, single cycle instruction execution and fast task switching support to provide efficient multisignal processing. The processor's internal memory consists of 4K words of Program Memory and 4K words of Data Memory. In addition there is an external memory of 32K words for general event buffering and 16K words of Dual port Memory for input data queuing. This DSP fulfills the requirement of the Muon subdetector systems for data readout. All error handling, buffering, formatting and transferring of the data to the various trigger levels of the data acquisition system is done in software. The algorithms developed for the system complete these tasks in about 20 μs per event.
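The timing figures quoted in this abstract can be related with a short back-of-the-envelope calculation. The sketch below assumes, as a simplification not stated in the abstract, that events arrive evenly spaced and that one DSP handles each event serially; the function name `dsp_occupancy` is illustrative only.

```python
def dsp_occupancy(event_rate_hz: float, processing_time_s: float) -> float:
    """Fraction of each event period that the DSP spends on one event."""
    period_s = 1.0 / event_rate_hz  # time between evenly spaced events
    return processing_time_s / period_s


rate_hz = 10e3   # input event rate quoted in the abstract (10 kHz)
proc_s = 20e-6   # approximate per-event processing time (about 20 microseconds)

print(f"event period: {1e6 / rate_hz:.0f} us")
print(f"occupancy:    {dsp_occupancy(rate_hz, proc_s):.0%}")
```

Under these assumptions the processing time fills roughly one fifth of the 100 μs event period, which is consistent with the stated goal of avoiding significant readout deadtime.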
Effect of poor control of film processors on mammographic image quality.
Kimme-Smith, C; Sun, H; Bassett, L W; Gold, R H
1992-11-01
With the increasingly stringent standards of image quality in mammography, film processor quality control is especially important. Current methods are not sufficient for ensuring good processing. The authors used a sensitometer and densitometer system to evaluate the performance of 22 processors at 16 mammographic facilities. Standard sensitometric values of two films were established, and processor performance was assessed for variations from these standards. Developer chemistry of each processor was analyzed and correlated with its sensitometric values. Ten processors were retested, and nine were found to be out of calibration. The developer components of hydroquinone, sulfites, bromide, and alkalinity varied the most, and low concentrations of hydroquinone were associated with lower average gradients at two facilities. Use of the sensitometer and densitometer system helps identify out-of-calibration processors, but further study is needed to correlate sensitometric values with developer component values. The authors believe that present quality control would be improved if sensitometric or other tests could be used to identify developer components that are out of calibration.
Food insecurity is associated with poor sleep outcomes among US adults.
Ding, Meng; Keiley, Margaret K; Garza, Kimberly B; Duffy, Patricia A; Zizza, Claire A
2015-03-01
Although food insecure (FI) adults are at risk of chronic conditions, little research attention is given to their health behaviors, such as sleep. We examined the associations between adult food security status and sleep duration, sleep latency, and sleep complaints reported to a health care professional. Our population-based sample included 5637 men and 5264 women (≥22 y) who participated in the NHANES 2005-2010. Food security status was assessed with USDA's 10-item adult Food Security Survey Module. Self-reported information about sleep duration, sleep latency, and sleep complaints to a health care professional were used as sleep outcomes. Multiple linear, stratified by sex, and logistic regression models were used to estimate the association between food security status and the 3 sleep outcomes. Very low food secure (FS) women reported significantly shorter sleep duration than fully FS women (difference: -30 ± 5.2 min; P < 0.01); however, no relation to sleep duration was observed among men. Among men, participants who were marginally FS (4 ± 1.1 min), low FS (4 ± 1.7 min), and very low FS (5 ± 1.8 min) reported significantly longer sleep latency than fully FS men (P < 0.05), but no association with sleep latency was observed among women. The divergent patterns in sleep duration and latency were likely because of our reference groups reporting undesirable sleep outcomes; fully FS men reported inadequate sleep and fully FS women reported long sleep latency. Among both men and women, marginally FS (OR: 1.64; 95% CI: 1.24, 2.16), low FS (OR: 1.63; 95% CI: 1.16, 2.30), and very low FS (OR: 1.99; 95% CI: 1.36, 2.92) participants were more likely to report sleep complaints than their fully FS counterparts (P < 0.05). Poor sleep quantity and quality may predispose FI adults to adverse health outcomes. © 2015 American Society for Nutrition.
Kolev, Ognyan I; Reschke, Millard F
2014-06-01
In an operational setting, acquisition of visual targets using both head and eye movements can be driven by a memorized sequence of commands (internal triggering, IT) or by commands issued through a secondary operator (external triggering, ET). The primary objective of our research was to examine differences in target acquisition using IT compared with ET. Using a forced time-optimal strategy, eight subjects were required to acquire targets with angular offsets of ±20°, 30° and 60° along the horizontal plane in both IT and ET conditions. The data showed that the eye/head latency difference in the IT condition is longer than that for ET, and the target acquisition time is also longer for IT commands. Consistent with this finding were similar results when examining the peak head velocity and peak head acceleration. Under the IT protocol, head amplitude is higher than when using ET. In conclusion, the study demonstrates that the pattern of performance of the target acquisition task is influenced by the way commands are triggered. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Brown, Molly E.; Carroll, Mark L.; Escobar, Vanessa M.
2014-01-01
Since the advent of NASA's Earth Observing System, knowledge of the practical benefits of Earth science data has grown considerably. The community using NASA Earth science observations in applications has grown significantly, with increasing sophistication to serve national interests. Data latency, or how quickly communities receive science observations after acquisition, can have a direct impact on the applications and usability of the information. This study was conducted to determine how users are incorporating NASA data into applications and operational processes to benefit society beyond scientific research, as well as to determine the need for data latency of less than 12 h. The results of the analysis clearly show the significant benefit to society of serving the needs of the agricultural, emergency response, environmental monitoring and weather communities who use rapidly delivered, accurate Earth science data. The study also showed the potential of expanding the communities who use low latency NASA science data products to provide new ways of transforming data into information. These benefits can be achieved with a clear and consistent NASA policy on product latency.
NASA Astrophysics Data System (ADS)
Pruhs, Kirk
A particularly important emergent technology is heterogeneous processors (or cores), which many computer architects believe will be the dominant architectural design in the future. The main advantage of a heterogeneous architecture, relative to an architecture of identical processors, is that it allows for the inclusion of processors whose design is specialized for particular types of jobs, and for jobs to be assigned to a processor best suited for that job. Most notably, it is envisioned that these heterogeneous architectures will consist of a small number of high-power high-performance processors for critical jobs, and a larger number of lower-power lower-performance processors for less critical jobs. Naturally, the lower-power processors would be more energy efficient in terms of the computation performed per unit of energy expended, and would generate less heat per unit of computation. For a given area and power budget, heterogeneous designs can give significantly better performance for standard workloads. Moreover, even processors that were designed to be homogeneous, are increasingly likely to be heterogeneous at run time: the dominant underlying cause is the increasing variability in the fabrication process as the feature size is scaled down (although run time faults will also play a role). Since manufacturing yields would be unacceptably low if every processor/core was required to be perfect, and since there would be significant performance loss from derating the entire chip to the functioning of the least functional processor (which is what would be required in order to attain processor homogeneity), some processor heterogeneity seems inevitable in chips with many processors/cores.
NASA Astrophysics Data System (ADS)
Hewawasam, Kuravi; Mendillo, Christopher B.; Howe, Glenn A.; Martel, Jason; Finn, Susanna C.; Cook, Timothy A.; Chakrabarti, Supriya
2017-09-01
The Planetary Imaging Concept Testbed Using a Recoverable Experiment - Coronagraph (PICTURE-C) mission will directly image debris disks and exozodiacal dust around nearby stars from a high-altitude balloon using a vector vortex coronagraph. The PICTURE-C low-order wavefront control (LOWC) system will be used to correct time-varying low-order aberrations due to pointing jitter, gravity sag, thermal deformation, and the gondola pendulum motion. We present the hardware and software implementation of the low-order Shack-Hartmann and reflective Lyot stop sensors. Development of the high-speed image acquisition and processing system is discussed with the emphasis on the reduction of hardware and computational latencies through the use of a real-time operating system and optimized data handling. By characterizing all of the LOWC latencies, we describe techniques to achieve a framerate of 200 Hz with a mean latency of ~378 μs.
Workman, Aspen; Eudy, James; Smith, Lynette; Frizzo da Silva, Leticia; Sinani, Devis; Bricker, Halie; Cook, Emily; Doster, Alan
2012-01-01
Bovine herpesvirus 1 (BHV-1), an Alphaherpesvirinae subfamily member, establishes latency in sensory neurons. Elevated corticosteroid levels, due to stress, reproducibly trigger reactivation from latency in the field. A single intravenous injection of the synthetic corticosteroid dexamethasone (DEX) to latently infected calves consistently induces reactivation from latency. Lytic cycle viral gene expression is detected in sensory neurons within 6 h after DEX treatment of latently infected calves. These observations suggested that DEX-stimulated expression of cellular genes leads to lytic cycle viral gene expression and productive infection. In this study, a commercially available assay, the Bovine Gene Chip, was used to compare cellular gene expression in the trigeminal ganglia (TG) of calves latently infected with BHV-1 versus DEX-treated animals. Relative to TG prepared from latently infected calves, 11 cellular genes were induced more than 10-fold 3 h after DEX treatment. Pentraxin three, a regulator of innate immunity and neurodegeneration, was stimulated 35- to 63-fold after 3 or 6 h of DEX treatment. Two transcription factors, promyelocytic leukemia zinc finger (PLZF) and Slug, were induced more than 15-fold 3 h after DEX treatment. PLZF or Slug stimulated productive infection 20- or 5-fold, respectively, and Slug stimulated the late glycoprotein C promoter more than 10-fold. Additional DEX-induced transcription factors also stimulated productive infection and certain viral promoters. These studies suggest that DEX-inducible cellular transcription factors and/or signaling pathways stimulate lytic cycle viral gene expression, which subsequently leads to successful reactivation from latency in a small subset of latently infected neurons. PMID:22190728
Development of anticipatory postural adjustments during locomotion in children.
Hirschfeld, H; Forssberg, H
1992-08-01
1. Anticipatory postural adjustments were studied in children (6-14 yr of age) walking on a treadmill while pulling a handle. Electromyographs (EMGs) and movements were recorded from the left arm and leg. 2. Postural activity in the leg muscles preceded voluntary arm muscle activity in all age groups, including the youngest children (6 yr of age). The latency to both leg and arm muscle activity, from a triggering audio signal, decreased with age. 3. In older children the latency to both voluntary and postural activity was influenced by the phase of the step cycle. The shortest latency to the first activated postural muscle occurred during single support phase in combination with a long latency to arm muscle activity. 4. In the youngest children, there was no phase-dependent modulation of the latency to the activation of the postural muscles. The voluntary activity was delayed during the beginning of the support phase resulting in a long delay between leg and arm muscle activity. 5. The postural muscle activation pattern was modified in a phase-dependent manner in all children. Lateral gastrocnemius (LG) and hamstring muscles (HAM) were activated during the early support phase, whereas tibialis anterior (TA) and quadriceps (Q) muscles were activated during the late support phase and during the swing phase. However, in the 6-yr-old children, LG was also activated in the swing phase. LG was activated before the HAM activity in the youngest children but after HAM in 14-yr-old children and adults. 6. The occurrence of LG activity in postural responses before heel strike suggests an immature (nonplantigrade) gating of postural activity.(ABSTRACT TRUNCATED AT 250 WORDS)
A Trade Study of Two Membrane-Aerated Biological Water Processors
NASA Technical Reports Server (NTRS)
Allada, Ram; Lange, Kevin; Vega, Leticia; Roberts, Michael S.; Jackson, Andrew; Anderson, Molly; Pickering, Karen
2011-01-01
Biologically based systems are under evaluation as primary water processors for next generation life support systems due to their low power requirements and their inherent regenerative nature. This paper will summarize the results of two recent studies involving membrane aerated biological water processors and present results of a trade study comparing the two systems with regards to waste stream composition, nutrient loading and system design. Results of optimal configurations will be presented.
Reagor, David; Vasquez-Dominguez, Jose
2006-12-12
A through-the-earth communication system that includes a digital signal input device; a transmitter operating at a predetermined frequency sufficiently low to effectively penetrate useful distances through the earth; a data compression circuit that is connected to an encoding processor; an amplifier that receives encoded output from the encoding processor for amplifying the output and transmitting the data to an antenna; and a receiver with an antenna, a band pass filter, a decoding processor, and a data decompressor.
Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Processors and GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cerati, Giuseppe; Elmer, Peter; Krutelyov, Slava
2017-01-01
For over a decade now, physical and energy constraints have limited clock speed improvements in commodity microprocessors. Instead, chipmakers have been pushed into producing lower-power, multi-core processors such as Graphical Processing Units (GPU), ARM CPUs, and Intel MICs. Broad-based efforts from manufacturers and developers have been devoted to making these processors user-friendly enough to perform general computations. However, extracting performance from a larger number of cores, as well as specialized vector or SIMD units, requires special care in algorithm design and code optimization. One of the most computationally challenging problems in high-energy particle experiments is finding and fitting the charged-particle tracks during event reconstruction. This is expected to become by far the dominant problem at the High-Luminosity Large Hadron Collider (HL-LHC), for example. Today the most common track finding methods are those based on the Kalman filter. Experience with Kalman techniques on real tracking detector systems has shown that they are robust and provide high physics performance. This is why they are currently in use at the LHC, both in the trigger and offline. Previously we reported on the significant parallel speedups that resulted from our investigations to adapt Kalman filters to track fitting and track building on Intel Xeon and Xeon Phi. Here, we discuss our progress toward the understanding of these processors and the new developments to port the Kalman filter to NVIDIA GPUs.
Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Processors and GPUs
NASA Astrophysics Data System (ADS)
Cerati, Giuseppe; Elmer, Peter; Krutelyov, Slava; Lantz, Steven; Lefebvre, Matthieu; Masciovecchio, Mario; McDermott, Kevin; Riley, Daniel; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi
2017-08-01
For over a decade now, physical and energy constraints have limited clock speed improvements in commodity microprocessors. Instead, chipmakers have been pushed into producing lower-power, multi-core processors such as Graphical Processing Units (GPU), ARM CPUs, and Intel MICs. Broad-based efforts from manufacturers and developers have been devoted to making these processors user-friendly enough to perform general computations. However, extracting performance from a larger number of cores, as well as specialized vector or SIMD units, requires special care in algorithm design and code optimization. One of the most computationally challenging problems in high-energy particle experiments is finding and fitting the charged-particle tracks during event reconstruction. This is expected to become by far the dominant problem at the High-Luminosity Large Hadron Collider (HL-LHC), for example. Today the most common track finding methods are those based on the Kalman filter. Experience with Kalman techniques on real tracking detector systems has shown that they are robust and provide high physics performance. This is why they are currently in use at the LHC, both in the trigger and offline. Previously we reported on the significant parallel speedups that resulted from our investigations to adapt Kalman filters to track fitting and track building on Intel Xeon and Xeon Phi. Here, we discuss our progress toward the understanding of these processors and the new developments to port the Kalman filter to NVIDIA GPUs.
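For readers unfamiliar with the Kalman filter mentioned in this abstract, the predict/update cycle can be illustrated with a scalar toy example. This is a minimal sketch only and is not the authors' track-fitting code: it assumes a one-dimensional state with identity dynamics, and the names `kalman_step`, `q` and `r` are chosen here for illustration. Real track fits use multi-dimensional states and covariance matrices.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : prior state estimate and its variance
    z    : new measurement
    q, r : process-noise and measurement-noise variances (assumed known)
    """
    # Predict: identity dynamics, so the estimate is carried forward
    # and its variance grows by the process noise.
    x_pred = x
    p_pred = p + q

    # Update: blend prediction and measurement using the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new


# Toy usage: filter a few noisy measurements of a constant quantity.
x, p = 0.0, 1.0
for z in [1.2, 0.9, 1.1, 1.0]:
    x, p = kalman_step(x, p, z, q=0.0, r=0.5)
print(x, p)
```

Each hit on a track is processed by one such step, which is why fitting many tracks maps naturally onto vector units and many-core hardware: the same small update is applied independently to a large number of track candidates.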
Strassman, Barbara K; O'Dell, Katie
2012-01-01
Using a nonexperimental design, the researchers explored the effect of captioning as part of the writing process of individuals who are d/Deaf and hard of hearing. Sixty-nine d/Deaf and hard of hearing middle school students composed responses to four writing-to-learn activities in a word processor. Two compositions were revised and published with software that displayed texts as captions to digital images; two compositions were revised with a word processor and published on paper. Analysis showed increases in content-area vocabulary, text length, and inclusion of main ideas and details for texts revised in the captioning software. Given the nonexperimental design, it is not possible to determine the extent to which the results could be attributed to captioned revisions. However, the findings do suggest that the images acted as procedural facilitators, triggering recall of vocabulary and details.
Microcalorimeters with Germanium Thermistors for High Resolution Soft and Hard X-ray Astronomy
NASA Technical Reports Server (NTRS)
Silver, E.
2003-01-01
This is a progress report for the first year of a three year Space Research and Technology (SR&T) grant to continue the advancement of neutron transmutation doped (NTD-based) microcalorimeters. We have re-prioritized certain aspects of the statement of work and chose to emphasize issues of array development in the first year rather than wait until year two. Consequently, some of the projects scheduled for the first year were delayed to the second year. Here we report on our progress to: a) Build and test a 1 x 4 element array and to investigate electrical and thermal cross-talk; b) Build a multiplexed 4 channel analog pulse processor; c) Build a digital pulse processor that can accommodate 4 channels with independent triggers; d) Develop a proportional thermal baseline restoration system compatible with the constant voltage mode of microcalorimeter operation.
Mehraei, Golbarg; Gallardo, Andreu Paredes; Shinn-Cunningham, Barbara G.; Dau, Torsten
2017-01-01
In rodent models, acoustic exposure too modest to elevate hearing thresholds can nonetheless cause auditory nerve fiber deafferentation, interfering with the coding of supra-threshold sound. Low-spontaneous rate nerve fibers, important for encoding acoustic information at supra-threshold levels and in noise, are more susceptible to degeneration than high-spontaneous rate fibers. The change in auditory brainstem response (ABR) wave-V latency with noise level has been shown to be associated with auditory nerve deafferentation. Here, we measured ABR in a forward masking paradigm and evaluated wave-V latency changes with increasing masker-to-probe intervals. In the same listeners, behavioral forward masking detection thresholds were measured. We hypothesized that 1) auditory nerve fiber deafferentation increases forward masking thresholds and increases wave-V latency and 2) a preferential loss of low-SR fibers results in a faster recovery of wave-V latency as the slow contribution of these fibers is reduced. Results showed that in young audiometrically normal listeners, a larger change in wave-V latency with increasing masker-to-probe interval was related to a greater effect of a preceding masker behaviorally. Further, the amount of wave-V latency change with masker-to-probe interval was positively correlated with the rate of change in forward masking detection thresholds. Although we cannot rule out central contributions, these findings are consistent with the hypothesis that auditory nerve fiber deafferentation occurs in humans and may predict how well individuals can hear in noisy environments. PMID:28159652
A Low-Power Wearable Stand-Alone Tongue Drive System for People With Severe Disabilities.
Jafari, Ali; Buswell, Nathanael; Ghovanloo, Maysam; Mohsenin, Tinoosh
2018-02-01
This paper presents a low-power stand-alone tongue drive system (sTDS) that could allow individuals with severe disabilities to control their environment, such as a computer, smartphone, or wheelchair, using voluntary tongue movements. A low-power local processor is proposed, which performs signal processing on the sTDS wearable headset to convert raw magnetic sensor signals to user-defined commands, rather than sending all raw data out to a PC or smartphone. The proposed sTDS significantly reduces the transmitter power consumption and subsequently increases the battery life. Assuming the sTDS user issues one command every 20 ms, the proposed local processor reduces the data volume that needs to be wirelessly transmitted by a factor of 64, from 9.6 to 0.15 kb/s. The proposed processor consists of three main blocks: a serial peripheral interface bus for receiving raw data from the magnetic sensors, an external magnetic interference attenuation block that removes the external magnetic field from the raw magnetic signal, and a machine learning classifier for command detection. A proof-of-concept prototype sTDS has been implemented with a low-power IGLOO-nano field programmable gate array (FPGA), Bluetooth Low Energy, a battery, and magnetic sensors on a headset, and tested. At a clock frequency of 20 MHz, the processor takes 6.6 μs and consumes 27 nJ to detect a command, with a detection accuracy of 96.9%. To further reduce power consumption, an application-specific integrated circuit processor for the sTDS is implemented at the post-layout level in 65-nm CMOS technology with a 1-V power supply; it consumes 0.43 mW, which is 10× lower than the FPGA power consumption, and occupies an area of only 0.016 mm².
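The data-rate figures in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that 1 kb/s means 1000 bits per second and that exactly one command is sent per 20 ms window; the function names are illustrative and do not come from the paper.

```python
RAW_RATE_BPS = 9600.0     # 9.6 kb/s raw sensor stream (from the abstract)
REDUCTION_FACTOR = 64     # reduction factor (from the abstract)
COMMAND_PERIOD_S = 0.020  # one command every 20 ms (from the abstract)


def reduced_rate_bps(raw_bps: float, factor: float) -> float:
    """Transmitted data rate after on-headset processing."""
    return raw_bps / factor


def bits_per_window(rate_bps: float, period_s: float) -> float:
    """Number of bits carried in one command period at a given rate."""
    return rate_bps * period_s


tx_bps = reduced_rate_bps(RAW_RATE_BPS, REDUCTION_FACTOR)
print(f"transmitted rate: {tx_bps / 1000:.2f} kb/s")
print(f"raw bits per window:  {bits_per_window(RAW_RATE_BPS, COMMAND_PERIOD_S):.0f}")
print(f"sent bits per window: {bits_per_window(tx_bps, COMMAND_PERIOD_S):.0f}")
```

Under these assumptions the reduced rate is 150 b/s, which matches the quoted 0.15 kb/s and corresponds to only a few bits per issued command instead of the full raw sensor window.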
Simulation Based Studies of Low Latency Teleoperations for NASA Exploration Missions
NASA Technical Reports Server (NTRS)
Gernhardt, Michael L.; Crues, Edwin Z.; Bielski, Paul; Dexter, Dan; Litaker, Harry L.; Chappell, Steven P.; Beaton, Kara H.; Bekdash, Omar S.
2017-01-01
Human exploration of Mars will involve both crewed and robotic systems. Many mission concepts involve the deployment and assembly of mission support assets prior to crew arrival on the surface. Some of these deployment and assembly activities will be performed autonomously while others will be performed using teleoperations. However, significant communications latencies between the Earth and Mars make teleoperations challenging. Alternatively, low latency teleoperations are possible from locations in Mars orbit like Mars' moons Phobos and Deimos. To explore these latency opportunities, NASA is conducting a series of studies to investigate the effects of latency on telerobotic deployment and assembly activities. These studies are being conducted in laboratory environments at NASA's Johnson Space Center (JSC), the Human Exploration Research Analog (HERA) at JSC and the NASA Extreme Environment Mission Operations (NEEMO) underwater habitat off the coast of Florida. The studies involve two human-in-the-loop interactive simulations developed by the NASA Exploration Systems Simulations (NExSyS) team at JSC. The first simulation investigates manipulation related activities while the second simulation investigates mobility related activities. The first simulation provides a simple real-time operator interface with displays and controls for a simulated 6 degree of freedom end effector. The initial version of the simulation uses a simple control mode to decouple the robotic kinematic constraints and a communications delay to model latency effects. This provides the basis for early testing with more detailed manipulation simulations planned for the future. Subjects are tested using five operating latencies that represent teleoperation conditions from local surface operations to orbital operations at Phobos, Deimos and ultimately high Martian orbit. Subject performance is measured and correlated with three distance-to-target zones of interest. 
Each zone represents a target distance ranging from beyond 10m in Zone 1, through 1 cm to contact in Zone 5 with a step size factor of 10. Collected data consists of both objective simulation data (time, distance, hand controller inputs, velocity) and subjective questionnaire data. The second simulation provides a simple real-time operator interface with displays and control of a simulated surface rover. The rover traverses a synthetic Mars-like terrain and must be maneuvered to avoid obstacles while progressing to its destination. Like the manipulator simulation, subjects are tested using five operating latencies that represent teleoperation conditions from local surface operations to orbital operations at Phobos, Deimos and ultimately high Martian orbit. The rover is also operated at three different traverse speeds to assess the correlation between latency and speed. Collected data consisted of both objective simulation data (time, distance, hand controller inputs, braking) and subjective questionnaire data. These studies are exploring relationships between task complexity, operating speeds, operator efficiencies, and communications latencies for low latency teleoperations in support of human planetary exploration. This paper presents early results from these studies along with the current observations and conclusions. These and planned future studies will help to inform NASA on the potential for low latency teleoperations to support human exploration of Mars and inform the design of robotic systems and exploration missions.
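The abstract states that the simulations use a communications delay to model latency effects. One common way to do this in a fixed-step simulation is a first-in, first-out delay line, sketched below. The class name `DelayLine`, the step-based delay, and the idle value are assumptions made for illustration; the NExSyS implementation is not described in the abstract.

```python
from collections import deque


class DelayLine:
    """FIFO that releases each command a fixed number of simulation steps later."""

    def __init__(self, delay_steps: int, idle: float = 0.0):
        # Pre-fill with idle commands so nothing arrives before the delay elapses.
        self._buf = deque([idle] * delay_steps)

    def step(self, command: float) -> float:
        """Push the operator's current command, pop the one that arrives now."""
        self._buf.append(command)
        return self._buf.popleft()


# Toy usage: a two-step delay between the hand controller and the robot.
link = DelayLine(delay_steps=2)
for cmd in [1.0, 2.0, 3.0, 4.0]:
    print(cmd, "->", link.step(cmd))
```

Running the same task with different `delay_steps` values is one way the five operating latencies described above could be represented, with a second delay line on the return path for the video or telemetry the operator sees.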
Event-Triggered Model Predictive Control for Embedded Artificial Pancreas Systems.
Chakrabarty, Ankush; Zavitsanou, Stamatina; Doyle, Francis J; Dassau, Eyal
2018-03-01
The development of artificial pancreas (AP) technology for deployment in low-energy, embedded devices is contingent upon selecting an efficient control algorithm for regulating glucose in people with type 1 diabetes mellitus. In this paper, we aim to lower the energy consumption of the AP by reducing controller updates, that is, the number of times the decision-making algorithm is invoked to compute an appropriate insulin dose. Physiological insights into glucose management are leveraged to design an event-triggered model predictive controller (MPC) that operates efficiently, without compromising patient safety. The proposed event-triggered MPC is deployed on a wearable platform. Its robustness to latent hypoglycemia, model mismatch, and meal misinformation is tested, with and without meal announcement, on the full version of the US-FDA accepted UVA/Padova metabolic simulator. The event-based controller remains on for 18 h of 41 h in closed loop with unannounced meals, while maintaining glucose in 70-180 mg/dL for 25 h, compared to 27 h for a standard MPC controller. With meal announcement, the time in 70-180 mg/dL is almost identical, with the controller operating a mere 25.88% of the time in comparison with a standard MPC. A novel control architecture for AP systems enables safe glycemic regulation with reduced processor computations. Our proposed framework integrates seamlessly with a wide variety of popular MPC variants reported in AP research, customizes the tradeoff between glycemic regulation and efficacy according to prior design specifications, and eliminates the need for judicious prior selection of controller sampling times.
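The idea of event-triggered control, in which the controller is invoked only when a trigger condition fires, can be shown with a toy loop. The sketch below is not the paper's controller: it substitutes a simple proportional law for the MPC optimization, uses a one-state integrator instead of a glucose model, and uses a hypothetical trigger (state drift beyond a threshold since the last update). All names and numbers are illustrative.

```python
def run_event_triggered(steps, threshold, gain=0.5, x0=10.0, target=0.0):
    """Simulate x[k+1] = x[k] + u[k], recomputing u only on trigger events.

    The control input is held between events. An event fires when the state
    has moved more than `threshold` away from its value at the last update.
    Returns the final state and the number of controller updates.
    """
    x = x0
    u = 0.0
    x_at_update = None
    updates = 0
    for _ in range(steps):
        if x_at_update is None or abs(x - x_at_update) > threshold:
            u = -gain * (x - target)  # stand-in for solving the MPC problem
            x_at_update = x
            updates += 1
        x = x + u                     # plant evolves with the held input
    return x, updates


x_evt, n_evt = run_event_triggered(steps=20, threshold=1.0)
x_std, n_std = run_event_triggered(steps=20, threshold=0.0)
print("event-triggered updates:", n_evt, "of 20")
print("always-on updates:      ", n_std, "of 20")
```

With a zero threshold the controller runs at every step, as a standard periodic controller would. A positive threshold trades some regulation precision for fewer controller invocations, which is the energy-saving mechanism the abstract describes.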
Technology Developments in Radiation-Hardened Electronics for Space Environments
NASA Technical Reports Server (NTRS)
Keys, Andrew S.; Howell, Joe T.
2008-01-01
The Radiation Hardened Electronics for Space Environments (RHESE) project consists of a series of tasks designed to develop and mature a broad spectrum of radiation hardened and low temperature electronics technologies. Three approaches are being taken to address radiation hardening: improved material hardness, design techniques to improve radiation tolerance, and software methods to improve radiation tolerance. Within these approaches various technology products are being addressed including Field Programmable Gate Arrays (FPGA), Field Programmable Analog Arrays (FPAA), MEMS, Serial Processors, Reconfigurable Processors, and Parallel Processors. In addition to radiation hardening, low temperature extremes are addressed with a focus on material and design approaches. System level applications for the RHESE technology products are discussed.
Electronic Sleep Stage Classifiers: A Survey and VLSI Design Methodology.
Kassiri, Hossein; Chemparathy, Aditi; Salam, M Tariqus; Boyce, Richard; Adamantidis, Antoine; Genov, Roman
2017-02-01
First, existing sleep stage classifier sensors and algorithms are reviewed and compared in terms of classification accuracy, level of automation, implementation complexity, invasiveness, and targeted application. Next, the implementation of a miniature microsystem for low-latency automatic sleep stage classification in rodents is presented. The classification algorithm uses one EMG (electromyogram) and two EEG (electroencephalogram) signals as inputs in order to detect REM (rapid eye movement) sleep, and is optimized for low complexity and low power consumption. It is implemented in an on-board low-power FPGA connected to a multi-channel neural recording IC, to achieve low-latency (order of 1 ms or less) classification. Off-line experimental results using pre-recorded signals from nine mice show REM detection sensitivity and specificity of 81.69% and 93.86%, respectively, with the maximum latency of 39 [Formula: see text]. The device is designed to be used in a non-disruptive closed-loop REM sleep suppression microsystem, for future studies of the effects of REM sleep deprivation on memory consolidation.
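For reference, the sensitivity and specificity figures quoted above follow the standard definitions, sketched below. The epoch counts in the example are invented for illustration and are not the study's data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true REM epochs that the classifier labels as REM."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-REM epochs that the classifier labels as non-REM."""
    return tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical confusion-matrix counts (not from the paper).
    tp, fn, tn, fp = 80, 20, 90, 10
    print(f"sensitivity = {sensitivity(tp, fn):.2%}")
    print(f"specificity = {specificity(tn, fp):.2%}")
```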
The computational structural mechanics testbed architecture. Volume 2: The interface
NASA Technical Reports Server (NTRS)
Felippa, Carlos A.
1988-01-01
This is the third set of five volumes which describe the software architecture for the Computational Structural Mechanics Testbed. Derived from NICE, an integrated software system developed at Lockheed Palo Alto Research Laboratory, the architecture is composed of the command language CLAMP, the command language interpreter CLIP, and the data manager GAL. Volumes 1, 2, and 3 (NASA CR's 178384, 178385, and 178386, respectively) describe CLAMP and CLIP and the CLIP-processor interface. Volumes 4 and 5 (NASA CR's 178387 and 178388, respectively) describe GAL and its low-level I/O. CLAMP, an acronym for Command Language for Applied Mechanics Processors, is designed to control the flow of execution of processors written for NICE. Volume 3 describes the CLIP-Processor interface and related topics. It is intended only for processor developers.
Atcherson, Samuel R; Damji, Zohra; Upson, Steve
2011-11-01
We explored the feasibility of a subtraction technique described by Friesen and Picton to remove the cochlear implant (CI) artifact for long-duration stimuli presented in the soundfield and via direct input, all through the participant's preferred MAP. Friesen and Picton previously explored this technique by recording cortical potentials in four CI users with 1000 pulse per second (pps) stimuli, bypassing the speech processor. Cortical auditory evoked potentials (N1-P2) to 1000 Hz tones were recorded from a post-lingually deafened adult with three different stimulus presentation setups: soundfield to processor T-mic (SF), soundfield to lapel mic (SF-LM), and direct input (DI). Stimuli were presented at 65 dB SPL(A). The SF setup required stabilizing the head to minimize changes in magnitude for the CI artifact. The SF-LM and DI setups did not require head stabilization, but were evaluated as alternatives to the SF setup. Clear N1-P2 responses were obtained with comparable waveform morphologies, amplitudes, and latencies despite some differences in the magnitude of the CI artifact for the different stimulus presentation setups. The results of this study demonstrate that the subtraction technique is feasible for recording N1-P2 responses in CI users, though further studies are needed for the three stimulation setups.
Adapting wave-front algorithms to efficiently utilize systems with deep communication hierarchies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerbyson, Darren J; Lang, Michael; Pakin, Scott
2009-01-01
Large-scale systems increasingly exhibit a differential between intra-chip and inter-chip communication performance. Processor-cores on the same socket are able to communicate at lower latencies, and with higher bandwidths, than cores on different sockets either within the same node or between nodes. A key challenge is to efficiently use this communication hierarchy and hence optimize performance. We consider here the class of applications that contain wave-front processing. In these applications data can only be processed after their upstream neighbors have been processed. Similar dependencies result between processors in which communication is required to pass boundary data downstream and whose cost is typically impacted by the slowest communication channel in use. In this work we develop a novel hierarchical wave-front approach that reduces the use of slower communications in the hierarchy but at the cost of additional computation and higher use of on-chip communications. This tradeoff is explored using a performance model, and an implementation on the Petascale Roadrunner system demonstrates a 27% performance improvement at full system-scale on a kernel application. The approach is generally applicable to large-scale multi-core and accelerated systems where a differential in system communication performance exists.
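The wave-front dependency described above can be sketched on a single 2D grid, assuming each cell depends on its upper and left neighbours. This illustrates only the basic dependency pattern, not the hierarchical scheme of the paper; the function names and the update rule are hypothetical.

```python
def wavefront_order(rows, cols):
    """Group the cells of a rows x cols grid into anti-diagonal fronts.

    Cells with the same i + j have no mutual dependencies, so each front
    could be processed in parallel once the previous front is complete.
    """
    fronts = []
    for d in range(rows + cols - 1):
        first_i = max(0, d - cols + 1)
        last_i = min(rows, d + 1)
        fronts.append([(i, d - i) for i in range(first_i, last_i)])
    return fronts


def sweep(grid):
    """Process a grid front by front; each cell uses its upstream neighbours."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for front in wavefront_order(rows, cols):
        for i, j in front:  # independent work within one front
            up = out[i - 1][j] if i > 0 else 0
            left = out[i][j - 1] if j > 0 else 0
            out[i][j] = grid[i][j] + max(up, left)
    return out


if __name__ == "__main__":
    print(wavefront_order(2, 3))
    print(sweep([[1, 2], [3, 4]]))
```

The number of fronts is rows + cols - 1, which is why the pipeline fill and drain phases, and the communication cost per front, matter at scale.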
Prasad, Peeyush; Wijnholds, Stefan J
2013-06-13
The Amsterdam-ASTRON Radio Transient Facility And Analysis Centre (AARTFAAC) project aims to implement an all-sky monitor (ASM), using the low-frequency array (LOFAR) telescope. It will enable real-time, 24 × 7 monitoring for low-frequency radio transients over most of the sky locally visible to the LOFAR at time scales ranging from seconds to several days, and rapid triggering of follow-up observations with the full LOFAR on detection of potential transient candidates. These requirements pose several implementation challenges: imaging of an all-sky field of view, low latencies of processing, continuous availability and autonomous operation of the ASM. The first of these has already resulted in the correlator for the ASM being the largest in the world in terms of the number of input data streams. We have carried out test observations using existing LOFAR infrastructure, in order to quantify and constrain crucial instrumental design criteria for the ASM. In this study, we present an overview of the AARTFAAC data-processing pipeline and illustrate some of the aforementioned challenges by showing all-sky images obtained from one of the test observations. These results provide quantitative estimates of the capabilities of the instrument.
Control of HIV infection by IFN-α: implications for latency and a cure.
Bourke, Nollaig M; Napoletano, Silvia; Bannan, Ciaran; Ahmed, Suaad; Bergin, Colm; McKnight, Áine; Stevenson, Nigel J
2018-03-01
Viral infections, including HIV, trigger the production of type I interferons (IFNs), which in turn, activate a signalling cascade that ultimately culminates with the expression of anti-viral proteins. Mounting evidence suggests that type I IFNs, in particular IFN-α, play a pivotal role in limiting acute HIV infection. Highly active anti-retroviral treatment reduces viral load and increases life expectancy in HIV positive patients; however, it fails to fully eliminate latent HIV reservoirs. To revisit HIV as a curable disease, this article reviews a body of literature that highlights type I IFNs as mediators in the control of HIV infection, with particular focus on the anti-HIV restriction factors induced and/or activated by IFN-α. In addition, we discuss the relevance of type I IFN treatment in the context of HIV latency reversal, novel therapeutic intervention strategies and the potential for full HIV clearance.
Narcolepsy with cataplexy in a child with Charcot-Marie-Tooth disease. Case Report.
Zheng, Feixia; Wang, Shuang
2016-09-01
We report an 8-year-old boy diagnosed with both CMT1 and narcolepsy, two conditions that have not previously been reported together in one person. The boy presented with a 3-month history of increasingly frequent sudden falls and excessive daytime sleepiness. CMT1 was diagnosed by electrophysiology and genetic testing. Narcolepsy was not diagnosed until the frequent falls were found to be caused by sudden, transient episodes of leg weakness triggered by emotion. The multiple sleep latency test showed multiple sleep-onset REM periods with reduced sleep latency. When CMT1 and narcolepsy coexist in an individual, the latter might be overlooked. Cataplexy caused by narcolepsy might be disregarded as the distal muscle weakness of CMT1. The daytime sleepiness might also be ignored. Therefore, we recommend that patients with sleep disorders be queried about the symptoms of narcolepsy.
Muscoso, E G; Costanzo, E; Daniele, O; Maugeri, D; Natale, E; Caravaglios, G
2006-11-01
Few studies exist on ERPs in patients with subcortical vascular cognitive impairment (SVCI). The latter is a fairly homogeneous subtype of vascular dementia whose cognitive profile differs from that of Alzheimer disease (AD). The present study aims to compare the ERP profiles of patients with SVCI and patients with AD. ERPs and psychometric tests were collected from 39 healthy elderly controls, 51 patients with SVCI and 43 patients with AD. Subjects mentally counted high-pitched target tones that were randomly intermixed with frequent low-pitched tones. We measured ERP latencies (N1, P2, N2 and P3) and interpeak latencies (N1-P3, N1-P2, N1-N2). Grand averaged potentials in SVCI showed a significant increase of P3 latency. AD patients showed a prolongation of N1, P2, N2 and P3 latencies. As far as interpeak latencies are concerned, SVCI patients showed a significant prolongation of N1-P3, while AD patients had a significant increase of the N1-N2 and N1-P3 intervals. When all patients were considered as a single group, correlation with neuropsychological test scores showed a significant negative relationship between P300 latency and, respectively, the Mini Mental Status Examination and forward auditory and visual span. In both groups, the sensitivity of ERP latency was low, whilst specificity values were quite high. Our findings suggest that these two dementing diseases have different electrophysiologic features that may be related to their specific underlying pathogenetic mechanisms; in particular, we hypothesise that, unlike in AD, P300 latency prolongation characterizes the early stage of SVCI. This ERP approach could therefore be helpful in detecting early alterations of attentional/working-memory functions in patients with subcortical ischaemic vascular disease.
High performance, low cost, self-contained, multipurpose PC based ground systems
NASA Technical Reports Server (NTRS)
Forman, Michael; Nickum, William; Troendly, Gregory
1993-01-01
The use of embedded processors greatly enhances the capabilities of personal computers when used for telemetry processing and command control center functions. Parallel architectures based on the use of transputers are shown to be very versatile and reusable, and the synergism between the PC and the embedded processor with transputers results in single unit, low cost workstations of 20 < MIPS ≤ 1000.
2017-02-15
Quantum information processors promise fast algorithms for problems inaccessible to classical computers. But since qubits are noisy and error-prone...information processors have been demonstrated experimentally using superconducting circuits [1–3], electrons in semiconductors [4–6], trapped atoms and...qubit quantum information processor has been realized [14], and single-qubit gates have demonstrated randomized benchmarking (RB) infidelities as low as 10
The ISS Water Processor Catalytic Reactor as a Post Processor for Advanced Water Reclamation Systems
NASA Technical Reports Server (NTRS)
Nalette, Tim; Snowdon, Doug; Pickering, Karen D.; Callahan, Michael
2007-01-01
Advanced water processors being developed for NASA's Exploration Initiative rely on phase change technologies and/or biological processes as the primary means of water reclamation. As a result of the phase change, volatile compounds will also be transported into the distillate product stream. The catalytic reactor assembly used in the International Space Station (ISS) water processor assembly, referred to as the Volatile Removal Assembly (VRA), has demonstrated high efficiency oxidation of many of these volatile contaminants, such as low molecular weight alcohols and acetic acid, and is considered a viable post treatment system for all advanced water processors. To support this investigation, two ersatz solutions were defined to be used for further evaluation of the VRA. The first solution was developed as part of an internal research and development project at Hamilton Sundstrand (HS) and is based primarily on ISS experience related to the development of the VRA. The second ersatz solution was defined by NASA in support of a study contract to Hamilton Sundstrand to evaluate the VRA as a potential post processor for the Cascade Distillation system being developed by Honeywell. This second ersatz solution contains several low molecular weight alcohols, organic acids, and several inorganic species. A range of residence times, oxygen concentrations and operating temperatures have been studied with both ersatz solutions to provide additional performance capability of the VRA catalyst.
WDM mid-board optics for chip-to-chip wavelength routing interconnects in the H2020 ICT-STREAMS
NASA Astrophysics Data System (ADS)
Kanellos, G. T.; Pleros, N.
2017-02-01
Multi-socket server boards have emerged to increase the processing power density on the board level and further flatten the data center networks beyond leaf-spine architectures. However, scaling the number of processors per board challenges current electronic technologies, as it requires high-bandwidth interconnects and high-throughput switches with an increased number of ports that are currently unavailable. On-board optical interconnects have shown the potential to efficiently satisfy the bandwidth needs, but their use has been limited to parallel links without performing any smart routing functionality. With CWDM optical interconnects already a commodity, cyclical wavelength routing, proposed for datacom rack-to-rack and board-to-board communication, now becomes a promising on-board routing platform. ICT-STREAMS is a European research project that aims to combine WDM parallel on-board transceivers with a cyclical AWGR, in order to create a new board-level, chip-to-chip interconnection paradigm that will turn WDM parallel transmission into a powerful wavelength routing platform capable of interconnecting multiple processors with unprecedented bandwidth and throughput capacity. Direct, any-to-any, on-board interconnection of multiple processors will significantly contribute to further flattening the data centers and facilitating east-west communication. In the present communication, we present the ICT-STREAMS on-board wavelength routing architecture for multiple chip-to-chip interconnections and evaluate the overall system performance in terms of throughput and latency for several schemes and traffic profiles. We also review recent advances in the key enabling technologies of the ICT-STREAMS platform, which span from Si in-plane lasers and polymer based electro-optical circuit boards to silicon photonics transceivers and photonic-crystal amplifiers.
NASA Astrophysics Data System (ADS)
Schultz, A.
2010-12-01
3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth as well as the latency required to send and receive network communication packets is a limiting factor in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading Graphics Processor Unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provides the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (DDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system. 
We describe our ongoing efforts to achieve massive parallelization on a novel hybrid GPU testbed machine currently configured with 12 Intel Westmere Xeon CPU cores (or 24 parallel computational threads) with 96 GB DDR3 system memory, 4 GPU subsystems which in aggregate contain 960 NVidia Tesla GPU cores with 16 GB dedicated DDR3 GPU memory, and a second interleaved bank of 4 GPU subsystems containing in aggregate 1792 NVidia Fermi GPU cores with 12 GB dedicated DDR5 GPU memory. We are applying domain decomposition methods to a modified version of Weiss' (2001) 3D frequency domain full physics EM finite difference code, an open source GPL licensed f90 code available for download from www.OpenEM.org. This will be the core of a new hybrid 3D inversion that parallelizes frequencies across CPUs and individual forward solutions across GPUs. We describe progress made in modifying the code to use direct solvers in GPU cores dedicated to each small subdomain, iteratively improving the solution by matching adjacent subdomain boundary solutions, rather than iterative Krylov space sparse solvers as currently applied to the whole domain.
Arba-Mosquera, Samuel; Aslanides, Ioannis M.
2012-01-01
Purpose: To analyze the effects of eye-tracker performance on pulse positioning errors during refractive surgery. Methods: A comprehensive model has been developed that directly considers eye movements (saccadic, vestibular, optokinetic, vergence, and miniature) as well as eye-tracker acquisition rate, eye-tracker latency time, scanner positioning time, laser firing rate, and laser trigger delay. Results: Eye-tracker acquisition rates below 100 Hz correspond to pulse positioning errors above 1.5 mm. Eye-tracker latency times of up to about 15 ms correspond to pulse positioning errors of up to 3.5 mm. Scanner positioning times of up to about 9 ms correspond to pulse positioning errors of up to 2 mm. Laser firing rates faster than eye-tracker acquisition rates basically duplicate pulse-positioning errors. Laser trigger delays of up to about 300 μs have minor to no impact on pulse-positioning errors. Conclusions: The proposed model can be used for comparison of laser systems used for ablation processes. Due to the pseudo-random nature of eye movements, positioning errors of single pulses are much larger than the decentrations observed in clinical settings. No single parameter alone minimizes the positioning error; it is the optimal combination of the several parameters that minimizes the error. The results of this analysis are important for understanding the limitations of correcting very irregular ablation patterns.
LHCb Kalman Filter cross architecture studies
NASA Astrophysics Data System (ADS)
Cámpora Pérez, Daniel Hugo
2017-10-01
The 2020 upgrade of the LHCb detector will vastly increase the rate of collisions the Online system needs to process in software, in order to filter events in real time. 30 million collisions per second will pass through a selection chain, where each step is executed conditional to its prior acceptance. The Kalman Filter is a fit applied to all reconstructed tracks which, due to its time characteristics and early execution in the selection chain, consumes 40% of the whole reconstruction time in the current trigger software. This makes the Kalman Filter a time-critical component as the LHCb trigger evolves into a full software trigger in the Upgrade. I present a new Kalman Filter algorithm for LHCb that can efficiently make use of any kind of SIMD processor, and its design is explained in depth. Performance benchmarks are compared between a variety of hardware architectures, including x86_64 and Power8, and the Intel Xeon Phi accelerator, and the suitability of said architectures to efficiently perform the LHCb Reconstruction process is determined.
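To make the Kalman Filter step concrete, here is a minimal scalar sketch that applies the same arithmetic to a batch of tracks in lockstep, which is the access pattern that suits SIMD hardware. It is not the LHCb algorithm: the real fit works on multi-dimensional track states, and the function names, the identity dynamics, and the noise values below are assumptions made for illustration.

```python
def kalman_step(x, p, z, r, q=0.0):
    """One predict-and-update step of a scalar Kalman filter.

    x: state estimate, p: its variance, z: new measurement,
    r: measurement variance, q: process noise added in the predict step.
    Identity dynamics are assumed, so the prediction leaves x unchanged.
    """
    p_pred = p + q                 # predict: uncertainty grows by q
    k = p_pred / (p_pred + r)      # Kalman gain
    x_new = x + k * (z - x)        # update the state with the residual
    p_new = (1.0 - k) * p_pred     # update the variance
    return x_new, p_new


def filter_batch(x0s, p0s, measurements, r):
    """Filter many tracks in lockstep.

    measurements[s][t] is the measurement for track t at step s. The inner
    loop performs identical operations on every track, so it maps naturally
    onto vector lanes in a compiled SIMD implementation.
    """
    xs, ps = list(x0s), list(p0s)
    for zs in measurements:
        for t in range(len(xs)):
            xs[t], ps[t] = kalman_step(xs[t], ps[t], zs[t], r)
    return xs, ps


if __name__ == "__main__":
    xs, ps = filter_batch([0.0, 0.0], [1.0, 1.0], [[2.0, 4.0], [2.0, 4.0]], r=1.0)
    print(xs, ps)
```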
NASA Astrophysics Data System (ADS)
Stark, Giordon; Atlas Collaboration
2015-04-01
The Global Feature Extraction (gFEX) module is a Level 1 jet trigger system planned for installation in ATLAS during the Phase 1 upgrade in 2018. The gFEX selects large-radius jets for capturing Lorentz-boosted objects by means of wide-area jet algorithms refined by subjet information. The architecture of the gFEX permits event-by-event local pile-up suppression for these jets using the same subtraction techniques developed for offline analyses. The gFEX architecture is also suitable for other global event algorithms such as missing transverse energy (MET), centrality for heavy ion collisions, and "jets without jets." The gFEX will use 4 processor FPGAs to perform calculations on the incoming data and a Hybrid APU-FPGA for slow control of the module. The gFEX is unique in both design and implementation; it substantially enhances the selectivity of the L1 trigger and increases sensitivity to key physics channels.
The DAQ needle in the big-data haystack
NASA Astrophysics Data System (ADS)
Meschi, E.
2015-12-01
In the last three decades, HEP experiments have faced the challenge of manipulating larger and larger masses of data from increasingly complex, heterogeneous detectors with millions and then tens of millions of electronic channels. LHC experiments abandoned the monolithic architectures of the nineties in favor of a distributed approach, leveraging the appearance of high speed switched networks developed for digital telecommunication and the internet, and the corresponding increase of memory bandwidth available in off-the-shelf consumer equipment. This led to a generation of experiments where custom electronics triggers, analysing coarser-granularity “fast” data, are confined to the first phase of selection, where predictable latency and real time processing for a modest initial rate reduction are “a necessary evil”. Ever more sophisticated algorithms are projected for use in HL-LHC upgrades, using tracker data in the low-level selection in high multiplicity environments, and requiring extremely complex data interconnects. These systems are quickly obsolete and inflexible but must nonetheless survive and be maintained across the extremely long life span of current detectors. New high-bandwidth bidirectional links could make high-speed low-power full readout at the crossing rate a possibility already in the next decade. At the same time, massively parallel and distributed analysis of unstructured data produced by loosely connected, “intelligent” sources has become ubiquitous in commercial applications, while the mass of persistent data produced by e.g. the LHC experiments has made multiple pass, systematic, end-to-end offline processing increasingly burdensome.
A possible evolution of DAQ and trigger architectures could lead to detectors with extremely deep asynchronous or even virtual pipelines, where data streams from the various detector channels are analysed and indexed in situ quasi-real-time using intelligent, pattern-driven data organization, and the final selection is operated as a distributed “search for interesting event parts”. A holistic approach is required to study the potential impact of these different developments on the design of detector readout, trigger and data acquisition systems in the next decades.
Baumgartner, F; Woess, C; Pedit, V; Tzankov, A; Labi, V; Villunger, A
2013-01-31
Proapoptotic Bcl-2 family members of the Bcl-2 homology (BH)3-only subgroup are critical for the establishment and maintenance of tissue homeostasis and can mediate apoptotic cell death in response to developmental cues or exogenously induced forms of cell stress. On the basis of the biochemical experiments as well as genetic studies in mice, the BH3-only proteins Bad and Bmf have been implicated in different proapoptotic events such as those triggered by glucose- or trophic factor-deprivation, glucocorticoids, or histone deacetylase inhibition, as well as suppression of B-cell lymphomagenesis upon aberrant expression of c-Myc. To address possible redundancies in cell death regulation and tumor suppression, we generated compound mutant mice lacking both genes. Our studies revealed lack of redundancy in most paradigms of lymphocyte apoptosis tested in tissue culture. Only spontaneous cell death of thymocytes kept in low glucose or that of pre-B cells deprived of cytokines was significantly delayed when both genes were lacking. Of note, despite these minor apoptosis defects we observed compromised lymphocyte homeostasis in vivo that affected mainly the B-cell lineage. Long-term follow-up revealed significantly reduced latency to spontaneous tumor formation in aged mice when both genes were lacking. Together our study suggests that Bad and Bmf co-regulate lymphocyte homeostasis and limit spontaneous transformation by mechanisms that may not exclusively be linked to the induction of lymphocyte apoptosis.
FPGA-based coprocessor for matrix algorithms implementation
NASA Astrophysics Data System (ADS)
Amira, Abbes; Bensaali, Faycal
2003-03-01
Matrix algorithms are important in many types of applications including image and signal processing. These areas require enormous computing power. A close examination of the algorithms used in these, and related, applications reveals that many of the fundamental actions involve matrix operations such as matrix multiplication, which has complexity O(N^3) on a sequential computer and O(N^3/p) on a parallel system with p processors. This paper presents an investigation into the design and implementation of different matrix algorithms such as matrix operations, matrix transforms and matrix decompositions using an FPGA based environment. Solutions for the problem of processing large matrices have been proposed. The proposed system architectures are scalable, modular and require less area and time complexity with reduced latency when compared with existing structures.
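The cubic operation count quoted above can be checked directly with a naive triple loop that counts multiplications, alongside a simple row split showing how the work divides among p workers. This is a sequential illustration only, not the FPGA design from the paper; the function names and the cyclic row assignment are hypothetical.

```python
def matmul_count(a, b):
    """Multiply two N x N matrices naively and count scalar multiplications."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(n):
            acc = 0
            for k in range(n):
                acc += a[i][k] * b[k][j]
                mults += 1
            c[i][j] = acc
    return c, mults  # mults equals N**3


def rows_for_worker(n, p, w):
    """Rows of the result computed by worker w under a cyclic split over p workers.

    Each row costs N**2 multiplications, so a worker holding about N/p rows
    performs roughly N**3 / p of them.
    """
    return list(range(w, n, p))


if __name__ == "__main__":
    c, mults = matmul_count([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(c, mults)
    print(rows_for_worker(4, 2, 1))
```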
Chang, Nai-Fu; Chiang, Cheng-Yi; Chen, Tung-Chien; Chen, Liang-Gee
2011-01-01
On-chip implementation of the Hilbert-Huang transform (HHT) has great impact on the analysis of non-linear and non-stationary biomedical signals on wearable or implantable sensors for real-time applications. Cubic spline interpolation (CSI) consumes the most computation in HHT and is the key component of the HHT processor. Traditionally, CSI in HHT is performed after a large window of signals has been collected, and the long latency violates the real-time requirement of the applications. In this work, we propose to keep processing the incoming signals on-line with small, overlapped data windows without sacrificing the interpolation accuracy. Data reuse between the data windows saves 58% of the multiplications and 73% of the divisions in CSI.
WindTalker: A P2P-Based Low-Latency Anonymous Communication Network
NASA Astrophysics Data System (ADS)
Zhang, Jia; Duan, Haixin; Liu, Wu; Wu, Jianping
Compared with traditional static anonymous communication networks, the P2P architecture can provide higher anonymity in communication. However, the P2P architecture also leads to more challenges, such as routing, stability, and trust. In this paper, we present WindTalker, a P2P-based low-latency anonymous communication network. It is a purely decentralized mix network and can provide low-latency services which help users hide their real identity in communication. In order to ensure stability and reliability, WindTalker introduces “seed nodes” to help a peer join the P2P network, and the peer nodes can use a gossip-based protocol to exchange active information. Moreover, WindTalker uses layered encryption to ensure the information of relayed messages cannot be leaked. In addition, malicious nodes in the network are the major threat to anonymity in P2P anonymous communication, so WindTalker introduces a trust mechanism which can help the P2P network exclude malicious nodes and optimize the strategy of peer discovery, tunnel construction, and relaying in anonymous communications. We deploy peer nodes of WindTalker in our campus network to test reliability, and we analyze anonymity in theory. The network measurement and simulation analysis show that WindTalker can provide low-latency and reliable anonymous communication services.
UGS video target detection and discrimination
NASA Astrophysics Data System (ADS)
Roberts, G. Marlon; Fitzgerald, James; McCormack, Michael; Steadman, Robert; Vitale, Joseph D.
2007-04-01
This project focuses on developing electro-optic algorithms which rank images by their likelihood of containing vehicles and people. These algorithms have been applied to images obtained from Textron's Terrain Commander 2 (TC2) Unattended Ground Sensor system. The TC2 is a multi-sensor surveillance system used in military applications. It combines infrared, acoustic, seismic, magnetic, and electro-optic sensors to detect nearby targets. When targets are detected by the seismic and acoustic sensors, the system is triggered and images are taken in the visible and infrared spectrum. The original Terrain Commander system occasionally captured and transmitted an excessive number of images, sometimes triggered by undesirable targets such as swaying trees. This wasted communications bandwidth, increased power consumption, and resulted in a large amount of end-user time being spent evaluating unimportant images. The algorithms discussed here help alleviate these problems. These algorithms are currently optimized for infra-red images, which give the best visibility in a wide range of environments, but could be adapted to visible imagery as well. It is important that the algorithms be robust, with minimal dependency on user input. They should be effective when tracking varying numbers of targets of different sizes and orientations, despite the low resolutions of the images used. Most importantly, the algorithms must be appropriate for implementation on a low-power processor in real time. This would enable us to maintain frame rates of 2 Hz for effective surveillance operations. Throughout our project we have implemented several algorithms, and used an appropriate methodology to quantitatively compare their performance. They are discussed in this paper.
NASA Astrophysics Data System (ADS)
Ammendola, R.; Barbanera, M.; Bizzarri, M.; Bonaiuto, V.; Ceccucci, A.; Checcucci, B.; De Simone, N.; Fantechi, R.; Federici, L.; Fucci, A.; Lupi, M.; Paoluzzi, G.; Papi, A.; Piccini, M.; Ryjov, V.; Salamon, A.; Salina, G.; Sargeni, F.; Venditti, S.
2017-03-01
The NA62 experiment at the CERN SPS has started its data-taking. Its aim is to measure the branching ratio of the ultra-rare decay K+ → π+νν̅. In this context, rejecting the background is crucial. One of the main backgrounds to the measurement is the K+ → π+π0 decay. In the 1-8.5 mrad decay region this background is rejected by the calorimetric trigger processor (Cal-L0). In this work we present the performance of a soft-core based parallel architecture built on FPGAs for the energy peak reconstruction, as an alternative to an implementation written entirely in VHDL.
Abbaszadeh-Amirdehi, Maryam; Ansari, Noureddin Nakhostin; Naghdi, Soofia; Olyaei, Gholamreza; Nourbakhsh, Mohammad Reza
2017-01-01
Dry needling (DN) is widely used in the treatment of myofascial trigger points (MTrPs). The purpose of this pretest-posttest clinical trial was to investigate the neurophysiological and clinical effects of DN in patients with MTrPs. A sample of 20 patients (3 men, 17 women; mean age 31.7 ± 10.8) with upper trapezius MTrPs received one session of deep DN. The outcomes of neuromuscular junction response (NMJR), sympathetic skin response (SSR), pain intensity (PI) and pressure pain threshold (PPT) were measured at baseline and immediately after DN. There were significant improvements in SSR latency and amplitude, pain, and PPT after DN. The NMJR decreased and returned to normal after DN. A single session of DN to the active upper trapezius MTrP was effective in improving pain, PPT, NMJR, and SSR in patients with myofascial trigger points. Further studies are needed. Copyright © 2016 Elsevier Ltd. All rights reserved.
78 FR 41116 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-09
... Agreement State regulations. All generators, collectors, and processors of low-level waste intended for... which facilitates tracking the identity of the waste generator. That tracking becomes more complicated... waste shipped from a waste processor may contain waste from several different generators. The...
Hybridization of biomedical circuitry
NASA Technical Reports Server (NTRS)
Rinard, G. A.
1978-01-01
The design and fabrication of low power hybrid circuits to perform vital signs monitoring are reported. The circuits consist of: (1) clock; (2) ECG amplifier and cardiotachometer signal conditioner; (3) impedance pneumograph and respiration rate processor; (4) heart/breath rate processor; (5) temperature monitor; and (6) LCD display.
Software design and implementation of ship heave motion monitoring system based on MBD method
NASA Astrophysics Data System (ADS)
Yu, Yan; Li, Yuhan; Zhang, Chunwei; Kang, Won-Hee; Ou, Jinping
2015-03-01
Marine transportation plays a significant role in the modern transport sector owing to its low cost and large capacity, and great importance is attached to it all over the world. Product development in related areas has become a hot spot. DSP signal processors feature small volume, low cost, high precision, and fast processing speed, and have been widely used in all kinds of monitoring systems. However, the traditional DSP code development process is time-consuming, inefficient, costly, and difficult. MathWorks proposed Model-Based Design (MBD) to overcome these defects: target board modules from the Simulink library are used to compile and generate the corresponding code for the target processor, and the DSP integrated development environment CCS is then called automatically for algorithm validation on the target processor. This paper uses MBD to design the algorithm for the ship heave motion monitoring system. The successful run on the processor demonstrates the effectiveness of the MBD approach.
Liu, Zhi; Sun, Yongzhu; Chang, Haifeng; Cui, Pengcheng
2014-01-01
Objective: This study was designed to establish a low dose salicylate-induced tinnitus rat model and to investigate whether the central or peripheral auditory system is involved in tinnitus. Methods: Lick suppression ratio (R), lick count and lick latency of conditioned rats in salicylate group (120 mg/kg, intraperitoneally) and saline group were first compared. Bilateral auditory nerves were ablated in unconditioned rats and lick count and lick latency were compared before and after ablation. The ablation was then performed in conditioned rats and lick count and lick latency were compared between salicylate group and saline group and between ablated and unablated salicylate groups. Results: Both the R value and the lick count in salicylate group were significantly higher than those in saline group and lick latency in salicylate group was significantly shorter than that in saline group. No significant changes were observed in lick count and lick latency before and after ablation. After ablation, lick count and lick latency in salicylate group were significantly higher and shorter respectively than those in saline group, but they were significantly lower and longer respectively than those in unablated salicylate group. Conclusion: A low dose of salicylate (120 mg/kg) can induce tinnitus in rats and both central and peripheral auditory systems participate in the generation of salicylate-induced tinnitus. PMID:25269067
NASA Astrophysics Data System (ADS)
Moralis-Pegios, M.; Terzenidis, N.; Mourgias-Alexandris, G.; Vyrsokinos, K.; Pleros, N.
2018-02-01
Disaggregated Data Centers (DCs) have emerged as a powerful architectural framework towards increasing resource utilization and system power efficiency, requiring, however, a networking infrastructure that can ensure low-latency and high-bandwidth connectivity between a high number of interconnected nodes. This reality has been the driving force towards high-port count and low-latency optical switching platforms, with recent efforts concluding that the use of distributed control architectures as offered by Broadcast-and-Select (BS) layouts can lead to sub-μsec latencies. However, almost all high-port count optical switch designs proposed so far rely either on electronic buffering and associated SerDes circuitry for resolving contention or on buffer-less designs with packet drop and re-transmit procedures, unavoidably increasing latency or limiting throughput. In this article, we demonstrate a 256x256 optical switch architecture for disaggregated DCs that employs small-size optical delay line buffering in a distributed control scheme, exploiting FPGA-based header processing over a hybrid BS/Wavelength routing topology that is implemented by a 16x16 BS design and a 16x16 AWGR. Simulation-based performance analysis reveals that even the use of a 2-packet optical buffer can yield <620nsec latency with >85% throughput for up to 100% loads. The switch has been experimentally validated with 10Gb/s optical data packets using 1:16 optical splitting and a SOA-MZI wavelength converter (WC) along with fiber delay lines for the 2-packet buffer implementation at every BS outgoing port, followed by an additional SOA-MZI tunable WC and the 16x16 AWGR. Error-free performance in all different switch input/output combinations has been obtained with a power penalty of <2.5dB.
Implicit, nonswitching, vector-oriented algorithm for steady transonic flow
NASA Technical Reports Server (NTRS)
Lottati, I.
1983-01-01
A rapid computation of a sequence of transonic flow solutions has to be performed in many areas of aerodynamic technology. The use of low-cost vector array processors makes such calculations economically feasible. However, to fully utilize the new hardware, the algorithms developed must take advantage of the special characteristics of the vector array processor. The present investigation aims to develop an efficient algorithm for solving transonic flow problems governed by mixed partial differential equations on an array processor.
High-performance ultra-low power VLSI analog processor for data compression
NASA Technical Reports Server (NTRS)
Tawel, Raoul (Inventor)
1996-01-01
An apparatus for data compression employing a parallel analog processor. The apparatus includes an array of processor cells with N columns and M rows wherein the processor cells have an input device, memory device, and processor device. The input device is used for inputting a series of input vectors. Each input vector is simultaneously input into each column of the array of processor cells in a pre-determined sequential order. An input vector is made up of M components, ones of which are input into ones of M processor cells making up a column of the array. The memory device is used for providing ones of M components of a codebook vector to ones of the processor cells making up a column of the array. A different codebook vector is provided to each of the N columns of the array. The processor device is used for simultaneously comparing the components of each input vector to corresponding components of each codebook vector, and for outputting a signal representative of the closeness between the compared vector components. A combination device is used to combine the signal output from each processor cell in each column of the array and to output a combined signal. A closeness determination device is then used for determining which codebook vector is closest to an input vector from the combined signals, and for outputting a codebook vector index indicating which of the N codebook vectors was the closest to each input vector input into the array.
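The codebook search that this abstract describes in analog hardware can be sketched in software as follows. This is only an illustration under stated assumptions: the abstract does not say which closeness measure the processor cells use, so squared Euclidean distance is assumed here, and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Combine the per-component closeness values of two M-component vectors.

    Assumption: closeness is measured as squared difference per component.
    """
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codebook_index(vector, codebook):
    """Return the index of the codebook vector closest to the input vector.

    Each codebook entry plays the role of one of the N columns; the min()
    stands in for the closeness determination device.
    """
    distances = [squared_distance(vector, entry) for entry in codebook]
    return min(range(len(codebook)), key=distances.__getitem__)


def compress(vectors, codebook):
    """Replace each input vector by the index of its closest codebook vector."""
    return [nearest_codebook_index(v, codebook) for v in vectors]
```

The compressed output is a list of codebook indices, one per input vector, which corresponds to the codebook vector index the apparatus outputs.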
Low Power Multi-Hop Networking Analysis in Intelligent Environments.
Etxaniz, Josu; Aranguren, Gerardo
2017-05-19
Intelligent systems are driven by the latest technological advances in many different areas such as sensing, embedded systems, wireless communications or context recognition. This paper focuses on some of those areas. Concretely, the paper deals with wireless communications issues in embedded systems. More precisely, the paper combines the multi-hop networking with Bluetooth technology and a quality of service (QoS) metric, the latency. Bluetooth is a radio license-free worldwide communication standard that makes low power multi-hop wireless networking available. It establishes piconets (point-to-point and point-to-multipoint links) and scatternets (multi-hop networks). As a result, many Bluetooth nodes can be interconnected to set up ambient intelligent networks. Then, this paper presents the results of the investigation on multi-hop latency with park and sniff Bluetooth low power modes conducted over the hardware test bench previously implemented. In addition, the empirical models to estimate the latency of multi-hop communications over Bluetooth Asynchronous Connectionless Links (ACL) in park and sniff mode are given. The designers of devices and networks for intelligent systems will benefit from the estimation of the latency in Bluetooth multi-hop communications that the models provide.
12-bit 32 channel 500 MS/s low-latency ADC for particle accelerators real-time control
NASA Astrophysics Data System (ADS)
Karnitski, Anton; Baranauskas, Dalius; Zelenin, Denis; Baranauskas, Gytis; Zhankevich, Alexander; Gill, Chris
2017-09-01
Particle beam control systems require real-time low latency digital feedback with high linearity and dynamic range. Densely packed electronic systems employ high performance multichannel digitizers causing excessive heat dissipation. Therefore, low power dissipation is another critical requirement for these digitizers. The described 12-bit 500 MS/s ADC employs a sub-ranging architecture based on a merged sample & hold circuit, a residue C-DAC and a shared 6-bit flash core ADC. The core ADC provides sequential coarse and fine digitization with a latency of two clock cycles. The ADC is implemented in a 28 nm CMOS process and consumes 4 mW of power per channel from a 0.9 V supply (interfacing and peripheral circuits are excluded). Reduced power consumption and small on-chip area permit the implementation of 32 ADC channels on a 10.7 mm² chip. The ADC includes a JESD204B standard compliant output data interface operated at a rate of 7.5 Gbps per channel. To minimize the latency related to the data interface, a special feature that permits bypassing the JESD204B interface is built in. DoE Phase I Award Number: DE-SC0017213.
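The sequential coarse and fine digitization described in this abstract can be sketched with an idealized behavioral model. The model below is an assumption for illustration, not the authors' circuit: it treats the shared 6-bit flash as ideal, subtracts the coarse estimate exactly (the role of the residue C-DAC), applies an ideal gain of 64 to the residue, and ignores noise, redundancy, and error correction. The names `flash6` and `subranging_adc` are hypothetical.

```python
def flash6(fraction):
    """Ideal 6-bit flash: quantize a value in [0, 1) to a code in 0..63."""
    code = int(fraction * 64)
    return max(0, min(63, code))  # clamp out-of-range inputs


def subranging_adc(vin, vref=1.0):
    """Idealized 12-bit sub-ranging conversion using one 6-bit flash twice."""
    x = vin / vref
    coarse = flash6(x)            # first pass: 6 most significant bits
    residue = x - coarse / 64     # what the coarse code did not capture
    fine = flash6(residue * 64)   # second pass: 6 least significant bits
    return (coarse << 6) | fine
```

With these ideal assumptions the two 6-bit passes combine into a 12-bit code, so an input of 1234/4096 of full scale yields the code 1234.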
NASA Astrophysics Data System (ADS)
Li, C.; Huang, X.; Cao, P.; Wang, J.; An, Q.
2018-03-01
RPC Super module (SM) detector assemblies are used for charged hadron identification in the Time-of-Flight (TOF) spectrometer at the Compressed Baryonic Matter (CBM) experiment. Each SM contains several multi-gap Resistive Plate Chambers (MRPCs) and provides up to 320 electronic channels in total for high-precision time measurements. Time resolution of the Time-to-Digital Converter (TDC) is required to be better than 20 ps. During mass production, the quality of each SM needs to be evaluated. In order to meet the requirements, the system clock signal as well as the trigger signal should be distributed precisely and synchronously to all electronics modules within the evaluation readout system. In this paper, a hierarchical clock and trigger distribution method is proposed for the quality evaluation of CBM-TOF SM detectors. In a first stage, the master clock and trigger module (CTM) allocated in a 6U PXI chassis distributes the clock and trigger signals to the slave CTM in the same chassis. In a second stage, the slave CTM transmits the clock and trigger signals to the TDC readout module (TRM) through one optical link. In a third stage, the TRM distributes the clock and trigger signals synchronously to 10 individual TDC boards. Laboratory test results show that the clock jitter at the third stage is less than 4 ps (RMS) and the trigger transmission latency from the master CTM to the TDC is about 272 ns with 11 ps (RMS) jitter. The overall performance complies well with the required specifications.
NASA Astrophysics Data System (ADS)
Smith, W.; Weisz, E.; McNabb, J. M. C.
2017-12-01
A technique is described which enables the combination of high vertical resolution (1 to 2-km) JPSS hyper-spectral soundings (i.e., from AIRS, CrIS, and IASI) with high horizontal (2-km) and temporal (15-min) resolution GOES multi-spectral imagery (i.e., provided by ABI) to produce low latency sounding products with the highest possible spatial and temporal resolution afforded by the instruments.
Mapping of MPEG-4 decoding on a flexible architecture platform
NASA Astrophysics Data System (ADS)
van der Tol, Erik B.; Jaspers, Egbert G.
2001-12-01
In the field of consumer electronics, the advent of new features such as Internet, games, video conferencing, and mobile communication has triggered the convergence of television and computer technologies. This requires a generic media-processing platform that enables simultaneous execution of very diverse tasks such as high-throughput stream-oriented data processing and highly data-dependent irregular processing with complex control flows. As a representative application, this paper presents the mapping of a Main Visual profile MPEG-4 for High-Definition (HD) video onto a flexible architecture platform. A stepwise approach is taken, going from the decoder application toward an implementation proposal. First, the application is decomposed into separate tasks with self-contained functionality, clear interfaces, and distinct characteristics. Next, a hardware-software partitioning is derived by analyzing the characteristics of each task such as the amount of inherent parallelism, the throughput requirements, the complexity of control processing, and the reuse potential over different applications and different systems. Finally, a feasible implementation is proposed that includes amongst others a very-long-instruction-word (VLIW) media processor, one or more RISC processors, and some dedicated processors. The mapping study of the MPEG-4 decoder proves the flexibility and extensibility of the media-processing platform. This platform enables an effective HW/SW co-design yielding a high performance density.
Advanced electronics for the CTF MEG system.
McCubbin, J; Vrba, J; Spear, P; McKenzie, D; Willis, R; Loewen, R; Robinson, S E; Fife, A A
2004-11-30
Development of the CTF MEG system has been advanced with the introduction of a computer processing cluster between the data acquisition electronics and the host computer. The advent of fast processors, memory, and network interfaces has made this innovation feasible for large data streams at high sampling rates. We have implemented tasks including anti-alias filter, sample rate decimation, higher gradient balancing, crosstalk correction, and optional filters with a cluster consisting of 4 dual Intel Xeon processors operating on up to 275 channel MEG systems at 12 kHz sample rate. The architecture is expandable with additional processors to implement advanced processing tasks which may include e.g., continuous head localization/motion correction, optional display filters, coherence calculations, or real time synthetic channels (via beamformer). We also describe an electronics configuration upgrade to provide operator console access to the peripheral interface features such as analog signal and trigger I/O. This allows remote location of the acoustically noisy electronics cabinet and fitting of the cabinet with doors for improved EMI shielding. Finally, we present the latest performance results available for the CTF 275 channel MEG system including an unshielded SEF (median nerve electrical stimulation) measurement enhanced by application of an adaptive beamformer technique (SAM) which allows recognition of the nominal 20-ms response in the unaveraged signal.
Compact propane fuel processor for auxiliary power unit application
NASA Astrophysics Data System (ADS)
Dokupil, M.; Spitta, C.; Mathiak, J.; Beckhaus, P.; Heinzel, A.
With a focus on mobile applications, a fuel cell auxiliary power unit (APU) using liquefied petroleum gas (LPG) is currently being developed at the Centre for Fuel Cell Technology (Zentrum für BrennstoffzellenTechnik, ZBT gGmbH). The system consists of an integrated compact and lightweight fuel processor and a low temperature PEM fuel cell for an electric power output of 300 W. This article presents the current status of development of the fuel processor, which is designed for a nominal hydrogen output of 1 kWth,H2 within a load range from 50 to 120%. A modular setup was chosen, defining a reformer/burner module and a CO-purification module. Based on the performance specifications, thermodynamic simulations, benchmarking and selection of catalysts, the modules have been developed and characterised simultaneously and then assembled into the complete fuel processor. Automated operation results in a cold startup time of about 25 min for nominal load and carbon monoxide output concentrations below 50 ppm for steady state and dynamic operation. Fast transient response of the fuel processor at load changes, with low fluctuations of the reformate gas composition, has also been achieved. Besides the development of the main reactors, the transfer of the fuel processor to an autonomous system is of major concern. Hence, concepts for packaging have been developed, resulting in a volume of 7 l and a weight of 3 kg. Further, a selection of peripheral components has been tested and evaluated with regard to the substitution of the laboratory equipment.
Curreli, Francesca; Friedman-Kien, Alvin E.; Flore, Ornella
2005-01-01
Kaposi sarcoma–associated herpesvirus (KSHV) is linked with all clinical forms of Kaposi sarcoma and several lymphoproliferative disorders. Like other herpesviruses, KSHV becomes latent in the infected cells, expressing only a few genes that are essential for the establishment and maintenance of its latency and for the survival of the infected cells. Inhibiting the expression of these latent genes should lead to eradication of herpesvirus infection. All currently available drugs are ineffective against latent infection. Here we show, for the first time to our knowledge, that latent infection with KSHV in B lymphocytes can be terminated by glycyrrhizic acid (GA), a triterpenoid compound earlier shown to inhibit the lytic replication of other herpesviruses. We demonstrate that GA disrupts latent KSHV infection by downregulating the expression of latency-associated nuclear antigen (LANA) and upregulating the expression of viral cyclin and selectively induces cell death of KSHV-infected cells. We show that reduced levels of LANA lead to p53 reactivation, an increase in ROS, and mitochondrial dysfunction, which result in G1 cell cycle arrest, DNA fragmentation, and oxidative stress–mediated apoptosis. Latent genes are involved in KSHV-induced oncogenesis, and strategies to interfere with their expression might prove useful for eradicating latent KSHV infection and have future therapeutic implications. PMID:15765147
Aistrup, Gary L; Arora, Rishi; Grubb, Søren; Yoo, Shin; Toren, Benjamin; Kumar, Manvinder; Kunamalla, Aaron; Marszalec, William; Motiwala, Tej; Tai, Shannon; Yamakawa, Sean; Yerrabolu, Satya; Alvarado, Francisco J; Valdivia, Hector H; Cordeiro, Jonathan M; Shiferaw, Yohannes; Wasserstrom, John Andrew
2017-11-01
Abnormal intracellular Ca2+ cycling contributes to triggered activity and arrhythmias in the heart. We investigated the properties and underlying mechanisms for systolic triggered Ca2+ waves in left atria from normal and failing dog hearts. Intracellular Ca2+ cycling was studied using confocal microscopy during rapid pacing of atrial myocytes (36 °C) isolated from normal and failing canine hearts (ventricular tachypacing model). In normal atrial myocytes (NAMs), Ca2+ waves developed during rapid pacing at rates ≥ 3.3 Hz and immediately disappeared upon cessation of pacing despite high sarcoplasmic reticulum (SR) load. In heart failure atrial myocytes (HFAMs), triggered Ca2+ waves (TCWs) developed at a higher incidence at slower rates. Because of their timing, TCW development relies upon action potential (AP)-evoked Ca2+ entry. The distribution of Ca2+ wave latencies indicated two populations of waves, with early events representing TCWs and late events representing conventional spontaneous Ca2+ waves. Latency analysis also demonstrated that TCWs arise after junctional Ca2+ release has occurred and spread to non-junctional (cell core) SR. TCWs also occurred in intact dog atrium and in myocytes from humans and pigs. β-adrenergic stimulation increased Ca2+ release and abolished TCWs in NAMs but was ineffective in HFAMs making this a potentially effective adaptive mechanism in normals but potentially arrhythmogenic in HF. Block of Ca-calmodulin kinase II also abolished TCWs, suggesting a role in TCW formation. Pharmacological manoeuvres that increased Ca2+ release suppressed TCWs as did interventions that decreased Ca2+ release but these also severely reduced excitation-contraction coupling. TCWs develop during the atrial AP and thus could affect AP duration, producing repolarization gradients and creating a substrate for reentry, particularly in HF where they develop at slower rates and a higher incidence. 
TCWs may represent a mechanism for the initiation of atrial fibrillation particularly in HF. Published on behalf of the European Society of Cardiology. All rights reserved. © The Author 2017.
Development of COTS ADC SEE Test System for the ATLAS LArCalorimeter Upgrade
Hu, Xue-Ye; Chen, Hu-Cheng; Chen, Kai; ...
2014-12-01
Radiation-tolerant, high speed, high density and low power commercial off-the-shelf (COTS) analog-to-digital converters (ADCs) are planned to be used in the upgrade to the Liquid Argon (LAr) calorimeter front end (FE) trigger readout electronics. Total ionization dose (TID) and single event effect (SEE) are two important radiation effects which need to be characterized on COTS ADCs. In our initial TID test, the Texas Instruments (TI) ADS5272 was identified as the top performer after screening a total of 17 COTS ADCs from different manufacturers with dynamic range and sampling rate meeting the requirements of the FE electronics. Another interesting feature of the ADS5272 is its latency of 6.5 clock cycles, which is the shortest among the 17 candidates. Based on the TID performance, we have designed a SEE evaluation system for the ADS5272, which allows us to further assess its radiation tolerance. In this paper, we present a detailed design of the ADS5272 SEE evaluation system and show the effectiveness of this system in evaluating ADS5272 SEE characteristics in multiple irradiation tests. According to the TID and SEE test results, the ADS5272 was chosen to be implemented in the full-size LAr Trigger Digitizer Board (LTDB) demonstrator, which will be installed on the ATLAS calorimeter during the 2014 Long Shutdown 1 (LS1).
Acousto-optic time- and space-integrating spotlight-mode SAR processor
NASA Astrophysics Data System (ADS)
Haney, Michael W.; Levy, James J.; Michael, Robert R., Jr.
1993-09-01
The technical approach and recent experimental results for the acousto-optic time- and space- integrating real-time SAR image formation processor program are reported. The concept overcomes the size and power consumption limitations of electronic approaches by using compact, rugged, and low-power analog optical signal processing techniques for the most computationally taxing portions of the SAR imaging problem. Flexibility and performance are maintained by the use of digital electronics for the critical low-complexity filter generation and output image processing functions. The results include a demonstration of the processor's ability to perform high-resolution spotlight-mode SAR imaging by simultaneously compensating for range migration and range/azimuth coupling in the analog optical domain, thereby avoiding a highly power-consuming digital interpolation or reformatting operation usually required in all-electronic approaches.
Xu, Ren; Jiang, Ning; Mrachacz-Kersting, Natalie; Dremstrup, Kim; Farina, Dario
2016-01-01
Brain-computer interfacing (BCI) has recently been applied as a rehabilitation approach for patients with motor disorders, such as stroke. In these closed-loop applications, a brain switch detects the motor intention from brain signals, e.g., scalp EEG, and triggers a neuroprosthetic device, either to deliver sensory feedback or to mimic real movements, thus re-establishing the compromised sensory-motor control loop and promoting neural plasticity. In this context, single trial detection of motor intention with short latency is a prerequisite. The performance of the event detection from EEG recordings is mainly determined by three factors: the type of motor imagery (e.g., repetitive, ballistic), the frequency band (or signal modality) used for discrimination (e.g., alpha, beta, gamma, and MRCP, i.e., movement-related cortical potential), and the processing technique (e.g., time-series analysis, sub-band power estimation). In this study, we investigated single trial EEG traces during movement imagination on healthy individuals, and provided a comprehensive analysis of the performance of a short-latency brain switch when varying these three factors. The morphological investigation showed a cross-subject consistency of a prolonged negative phase in MRCP, and a delayed beta rebound in sensory-motor rhythms during repetitive tasks. The detection performance had the greatest accuracy when using ballistic MRCP with time-series analysis. In this case, the true positive rate (TPR) was ~70% for a detection latency of ~200 ms. The results presented here are of practical relevance for designing BCI systems for motor function rehabilitation. PMID:26834551
Low-Latency Teleoperations for Human Exploration and Evolvable Mars Campaign
NASA Technical Reports Server (NTRS)
Lupisella, Mark; Wright, Michael; Arney, Dale; Gershman, Bob; Stillwagen, Fred; Bobskill, Marianne; Johnson, James; Shyface, Hilary; Larman, Kevin; Lewis, Ruthan;
2015-01-01
NASA has been analyzing a number of mission concepts and activities that involve low-latency telerobotic (LLT) operations. One mission concept that will be covered in this presentation is Crew-Assisted Sample Return, which involves the crew (1) acquiring samples that have already been delivered to space, and/or (2) acquiring samples via LLT from orbit to a planetary surface and then launching the samples to space to be captured and then returned to Earth with the crew. Both versions have key roles for low-latency teleoperations. More broadly, the NASA Evolvable Mars Campaign is exploring a number of other activities that involve LLT, such as: (a) human asteroid missions, (b) Phobos/Deimos missions, (c) Mars human landing site reconnaissance and site preparation, and (d) Mars sample handling and analysis. Many of these activities could be conducted from Mars orbit and also with the crew on the Mars surface remotely operating assets elsewhere on the surface, e.g. for exploring Mars special regions and/or teleoperating a sample analysis laboratory, both of which may help address planetary protection concerns. The operational and technology implications of low-latency teleoperations will be explored, including discussion of relevant items in the NASA Technology Roadmap and also how previously deployed robotic assets from any source could subsequently be used by astronauts via LLT.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karamooz, Saeed; Breeding, John Eric; Justice, T Alan
As MicroTCA expands into applications beyond the telecommunications industry from which it originated, it faces new challenges in the area of inter-blade communications. The ability to achieve deterministic, low-latency communications between blades is critical to realizing a scalable architecture. In the past, legacy bus architectures accomplished inter-blade communications using dedicated parallel buses across the backplane. Because of limited fabric resources on its backplane, MicroTCA uses the carrier hub (MCH) for this purpose. Unfortunately, MCH products from commercial vendors are limited to standard bus protocols such as PCI Express, Serial RapidIO and 10/40GbE. While these protocols have exceptional throughput capability, they are neither deterministic nor necessarily low-latency. To overcome this limitation, an MCH has been developed based on the Xilinx Virtex-7 690T FPGA. This MCH provides the system architect/developer complete flexibility in both the interface protocol and routing of information between blades. In this paper, we present the application of this configurable MCH concept to the Machine Protection System under development for the Spallation Neutron Source's proton accelerator. Specifically, we demonstrate the use of the configurable MCH as a 12x4-lane crossbar switch using the Aurora protocol to achieve a deterministic, low-latency data link. In this configuration, the crossbar has an aggregate bandwidth of 48 GB/s.
Development of flame resistant treatment for nomex fibrous structures
NASA Technical Reports Server (NTRS)
Toy, M. S.
1978-01-01
Technology which renders aramid fibrous structures flame resistant through chemical modification was developed. The project scaled up flame resistant treatment from laboratory fabric swatches of a few inches to efficiently producing ten yards of commercial width (41 inches) aromatic polyamide. The radiation intensity problem of the processor was resolved. Further improvement of the processor cooling system was recommended for two reasons: (1) To advance current technology of flame proofing Nomex fabric to higher oxygen enriched atmospheres; and (2) To adapt the processor for direct applicability to low cost commercial fabrics.
A miniature on-chip multi-functional ECG signal processor with 30 µW ultra-low power consumption.
Liu, Xin; Zheng, Yuan Jin; Phyu, Myint Wai; Zhao, Bin; Je, Minkyu; Yuan, Xiao Jun
2010-01-01
In this paper, a miniature low-power electrocardiogram (ECG) signal processing application-specific integrated circuit (ASIC) chip is proposed. This chip provides multiple critical functions for ECG analysis using a systematic wavelet transform algorithm and a novel SRAM-based ASIC architecture, while achieving low cost and high performance. Using 0.18 µm CMOS technology and a 1 V power supply, this ASIC chip consumes only 29 µW and occupies an area of 3 mm². This on-chip ECG processor is highly suitable for reliable real-time cardiac status monitoring applications.
Latency of TCP applications over the ATM-WAN using the GFR service category
NASA Astrophysics Data System (ADS)
Chen, Kuo-Hsien; Siliquini, John F.; Budrikis, Zigmantas
1998-10-01
The GFR service category has been proposed for data services in ATM networks. Since users are ultimately interested in data services that provide high efficiency and low latency, it is important to study the latency performance for data traffic of the GFR service category in an ATM network. Today much of the data traffic utilizes the TCP/IP protocol suite, and in this paper we study through simulation the latency of TCP applications running over a wide-area ATM network utilizing the GFR service category using a realistic TCP traffic model. From this study, we find that during congestion periods the reserved bandwidth in GFR can improve the latency performance for TCP applications. However, due to TCP 'Slow Start' data segment generation dynamics, we show that a large proportion of TCP segments are discarded under network congestion even when the reserved bandwidth is equal to the average generated rate of user data. Therefore, a user experiences worse than expected latency performance when the network is congested. In this study we also examine the effects of segment size on the latency performance of TCP applications using the GFR service category.
Terzenidis, Nikos; Moralis-Pegios, Miltiadis; Mourgias-Alexandris, George; Vyrsokinos, Konstantinos; Pleros, Nikos
2018-04-02
Departing from traditional server-centric data center architectures towards disaggregated systems that can offer increased resource utilization at reduced cost and energy envelopes, the use of high-port switching with highly stringent latency and bandwidth requirements becomes a necessity. We present an optical switch architecture exploiting a hybrid broadcast-and-select/wavelength routing scheme with small-scale optical feedforward buffering. The architecture is experimentally demonstrated at 10 Gb/s, reporting error-free performance with a power penalty of <2.5 dB. Moreover, network simulations for a 256-node system revealed low latency values of only 605 ns at throughput values reaching 80% when employing 2-packet-size optical buffers, while multi-rack network performance was also investigated.
The artificial retina for track reconstruction at the LHC crossing rate
NASA Astrophysics Data System (ADS)
Abba, A.; Bedeschi, F.; Citterio, M.; Caponio, F.; Cusimano, A.; Geraci, A.; Marino, P.; Morello, M. J.; Neri, N.; Punzi, G.; Piucci, A.; Ristori, L.; Spinella, F.; Stracka, S.; Tonelli, D.
2016-04-01
We present the results of an R&D study for a specialized processor capable of precisely reconstructing events with hundreds of charged-particle tracks in pixel and silicon strip detectors at 40 MHz, thus suitable for processing LHC events at the full crossing frequency. For this purpose we design and test a massively parallel pattern-recognition algorithm, inspired by the current understanding of the mechanisms adopted by the primary visual cortex of mammals in the early stages of visual-information processing. The detailed geometry and charged-particle activity of a large tracking detector are simulated and used to assess the performance of the artificial retina algorithm. We find that high-quality tracking in large detectors is possible with sub-microsecond latencies when the algorithm is implemented in modern, high-speed, high-bandwidth FPGA devices.
The P-Mesh: A Commodity-based Scalable Network Architecture for Clusters
NASA Technical Reports Server (NTRS)
Nitzberg, Bill; Kuszmaul, Chris; Stockdale, Ian; Becker, Jeff; Jiang, John; Wong, Parkson; Tweten, David (Technical Monitor)
1998-01-01
We designed a new network architecture, the P-Mesh, which combines the scalability and fault resilience of a torus with the performance of a switch. We compare the scalability, performance, and cost of the hub, switch, torus, tree, and P-Mesh architectures. The latter three are capable of scaling to thousands of nodes; however, the torus has severe performance limitations with that many processors. The tree and P-Mesh have similar latency, bandwidth, and bisection bandwidth, but the P-Mesh outperforms the switch architecture (a lower bound for tree performance) on 16-node NAS Parallel Benchmark tests by up to 23%, and costs 40% less. Further, the P-Mesh has better fault resilience characteristics. The P-Mesh architecture trades increased management overhead for lower cost, and is a good bridging technology while the price of tree uplinks remains high.
An Analysis of Performance Enhancement Techniques for Overset Grid Applications
NASA Technical Reports Server (NTRS)
Djomehri, J. J.; Biswas, R.; Potsdam, M.; Strawn, R. C.; Biegel, Bryan (Technical Monitor)
2002-01-01
The overset grid methodology has significantly reduced time-to-solution of high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement techniques on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machine. Specifically, the roles of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.
Generalized hypercube structures and hyperswitch communication network
NASA Technical Reports Server (NTRS)
Young, Steven D.
1992-01-01
This paper discusses an ongoing study that uses a recent development in communication control technology to implement hybrid hypercube structures. These architectures are similar to binary hypercubes, but they also provide added connectivity between the processors. This added connectivity increases communication reliability while decreasing the latency of interprocessor message passing. Because these factors directly determine the speed that can be obtained by multiprocessor systems, these architectures are attractive for applications such as remote exploration and experimentation, where high performance and ultrareliability are required. This paper describes and enumerates these architectures and discusses how they can be implemented with a modified version of the hyperswitch communication network (HCN). The HCN is analyzed because it has three attractive features that enable these architectures to be effective: speed, fault tolerance, and the ability to pass multiple messages simultaneously through the same hyperswitch controller.
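To make the baseline concrete, the following is a minimal sketch of addressing in a plain binary hypercube, the structure the hybrid architectures above extend. It is not the hyperswitch network itself; the function names are hypothetical and the added links of the hybrid structures are not modeled.

```python
def hypercube_neighbors(node, dim):
    """Return the nodes directly linked to `node` in a dim-dimensional binary hypercube.

    Two nodes are linked when their binary addresses differ in exactly one bit.
    """
    return [node ^ (1 << bit) for bit in range(dim)]


def hypercube_hops(src, dst):
    """Minimum hop count between two nodes: the Hamming distance of their addresses."""
    return bin(src ^ dst).count("1")


if __name__ == "__main__":
    print(hypercube_neighbors(0b000, 3))
    print(hypercube_hops(0b000, 0b111))
```

Any extra link between two nodes whose addresses differ in more than one bit shortens some of these paths, which is one way added connectivity can reduce message-passing latency.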
NASA Astrophysics Data System (ADS)
Yang, Mei; Jiao, Fengjun; Li, Shulian; Li, Hengqiang; Chen, Guangwen
2015-08-01
A self-sustained, complete and miniaturized methanol fuel processor has been developed based on modular integration and microreactor technology. The fuel processor comprises one methanol oxidative reformer, one methanol combustor and one two-stage CO preferential oxidation unit. A microchannel heat exchanger is employed to recover heat from the hot stream, miniaturize system size and thus achieve high energy utilization efficiency. With optimized thermal management and proper control of operating parameters, the fuel processor can start up in 10 min at room temperature without external heating. A self-sustained state is achieved with an H2 production rate of 0.99 Nm³ h⁻¹ and an extremely low CO content below 25 ppm. This amount of H2 is sufficient to supply a 1 kWe proton exchange membrane fuel cell. The corresponding thermal efficiency of the whole processor is higher than 86%. The size and weight of the assembled reactors integrated with microchannel heat exchangers are 1.4 L and 5.3 kg, respectively, demonstrating a very compact construction of the fuel processor.
Primary display latency criteria based on flying qualities and performance data
NASA Technical Reports Server (NTRS)
Funk, John D., Jr.; Beck, Corin P.; Johns, John B.
1993-01-01
With pilots' increasing use of visual cue augmentation, much of it requiring extensive pre-processing, there is a need to establish criteria for new avionics/display design. The timeliness and synchronization of the augmented cues are vital to ensure the performance quality required for precision mission task elements (MTEs) where augmented cues are the primary source of information to the pilot. Processing delays incurred while transforming sensor-supplied flight information into visual cues are unavoidable. Relationships between maximum control system delays and associated flying qualities levels are documented in MIL-F-83300 and MIL-F-8785. While cues representing aircraft status may be just as vital to the pilot as prompt control response for operations in instrument meteorological conditions, there are presently no specification requirements on avionics system latency. To produce data relating avionics system latency to degradations in flying qualities, the Navy conducted two simulation investigations. During the investigations, flying qualities and performance data were recorded as simulated avionics system latency was varied. Correlated results of the investigations indicate that there is a detrimental impact of latency on flying qualities. Analysis of these results and consideration of key factors influencing their application indicate that: (1) Task performance degrades and pilot workload increases as latency is increased. Inconsistency in task performance increases as latency increases. (2) Latency reduces the probability of achieving Level 1 handling qualities with avionics system latency as low as 70 ms. (3) The data suggest that the achievement of desired performance will be ensured only at display latency values below 120 ms. (4) These data also suggest that avoidance of inadequate performance will be ensured only at display latency values below 150 ms.
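The two display-latency thresholds reported above (120 ms and 150 ms) can be read as a simple three-way classification. The sketch below is only an illustration of that reading; the function name and the category labels are assumptions, not terms from the report.

```python
DESIRED_LIMIT_MS = 120.0   # below this, desired performance is reported as ensured
ADEQUATE_LIMIT_MS = 150.0  # below this, inadequate performance is reported as avoided


def classify_display_latency(latency_ms):
    """Map a display latency in milliseconds onto the two thresholds quoted above.

    The labels are hypothetical shorthand for the abstract's findings (3) and (4).
    """
    if latency_ms < 0:
        raise ValueError("latency must be non-negative")
    if latency_ms < DESIRED_LIMIT_MS:
        return "desired performance ensured"
    if latency_ms < ADEQUATE_LIMIT_MS:
        return "inadequate performance avoided"
    return "inadequate performance possible"


if __name__ == "__main__":
    for value in (70, 130, 160):
        print(value, "ms:", classify_display_latency(value))
```

Note that finding (2) is a separate statement: Level 1 handling qualities already become less likely at 70 ms, so the first category does not imply Level 1.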
NASA Astrophysics Data System (ADS)
Badoni, D.; Bizzarri, M.; Bonaiuto, V.; Checcucci, B.; De Simone, N.; Federici, L.; Fucci, A.; Paoluzzi, G.; Papi, A.; Piccini, M.; Salamon, A.; Salina, G.; Santovetti, E.; Sargeni, F.; Venditti, S.
2014-01-01
The goal of the NA62 experiment at the CERN SPS is the measurement of the branching ratio of the very rare kaon decay K⁺→π⁺νν̄ with a 10% accuracy by collecting 100 events in two years of data taking. An efficient photon veto system is needed to reject the K⁺→π⁺π⁰ background, and a liquid krypton electromagnetic calorimeter will be used for this purpose in the 1-10 mrad angular region. The L0 trigger system for the calorimeter consists of a peak reconstruction algorithm implemented on FPGA by using a mixed parallel architecture based on soft-core Altera NIOS II embedded processors together with custom VHDL modules. This solution allows an efficient and flexible reconstruction of the energy-deposition peak. The complete system will comprise 36 TEL62 boards, 108 mezzanine cards and 215 high-performance FPGAs. We describe the design, current status and the results of the first performance tests.
Psychophysics of a Nociceptive Test in the Mouse: Ambient Temperature as a Key Factor for Variation
Pincedé, Ivanne; Pollin, Bernard; Meert, Theo; Plaghki, Léon; Le Bars, Daniel
2012-01-01
Background: The mouse is increasingly used in biomedical research, notably in behavioral neurosciences for the development of tests or models of pain. Our goal was to provide the scientific community with an outstanding tool that allows the determination of psychophysical descriptors of a nociceptive reaction, which are inaccessible with conventional methods: namely the true threshold, true latency, conduction velocity of the peripheral fibers that trigger the response and latency of the central decision-making process. Methodology/Principal Findings: Basically, the procedures involved heating of the tail with a CO2 laser, recording of tail temperature with an infrared camera and stopping the heating when the animal reacted. The method is based mainly on the measurement of three observable variables, namely the initial temperature, the heating rate and the temperature reached at the actual moment of the reaction following random variations in noxious radiant heat. The initial temperature of the tail, which itself depends on the ambient temperature, very markedly influenced the behavioral threshold, the behavioral latency and the conduction velocity of the peripheral fibers but not the latency of the central decision-making. Conclusions/Significance: We have validated a psychophysical approach to nociceptive reactions for the mouse, which has already been described for rats and humans. It enables the determination of four variables, which contribute to the overall latency of the response. The usefulness of such an approach was demonstrated by providing new fundamental findings regarding the influence of ambient temperature on nociceptive processes. We conclude by challenging the validity of using as “pain index” the reaction time of a behavioral response to an increasing heat stimulus and emphasize the need for a very careful control of the ambient temperature, as a prevailing environmental source of variation, during any behavioral testing of mice. PMID:22629325
An 81.6 μW FastICA processor for epileptic seizure detection.
Yang, Chia-Hsiang; Shih, Yi-Hsin; Chiueh, Herming
2015-02-01
To improve the performance of epileptic seizure detection, independent component analysis (ICA) is applied to multi-channel signals to separate artifacts and signals of interest. FastICA is an efficient algorithm to compute ICA. To reduce the energy dissipation, eigenvalue decomposition (EVD) is utilized in the preprocessing stage to reduce the convergence time of iterative calculation of ICA components. EVD is computed efficiently through an array structure of processing elements running in parallel. An area-efficient EVD architecture is realized by leveraging the approximate Jacobi algorithm, leading to a 77.2% area reduction. By choosing a proper memory element and a reduced wordlength, the power and area of storage memory are reduced by 95.6% and 51.7%, respectively. The chip area is minimized through fixed-point implementation and architectural transformations. Given a latency constraint of 0.1 s, an 86.5% area reduction is achieved compared to the direct-mapped architecture. Fabricated in 90 nm CMOS, the core area of the chip is 0.40 mm². The FastICA processor, part of an integrated epileptic control SoC, dissipates 81.6 μW at 0.32 V. The computation delay of a frame of 256 samples for 8 channels is 84.2 ms. Compared to prior work, 0.5% power dissipation, 26.7% silicon area, and 3.4× computation speedup are achieved. The performance of the chip was verified with a human dataset.
Improved UT1 Predictions through Low-Latency VLBI Observations
2010-03-14
J Geod (2010) 84:399–402 DOI 10.1007/s00190-010-0372-8 SHORT NOTE Improved UT1 predictions through low-latency VLBI observations Brian Luzum · Axel...polar motion and nutation on UT1 determinations from VLBI Intensive obser- vations. J Geod 82(12):863. doi:10.1007/s00190-008-0212-2 Ray JR, Carter WE...Behrend D (2007) The International VLBI Service for Geodesy and Astrometry (IVS): current capabilities and future prospects. J Geod 81(6–8):479. doi
Susceptibility of linear and nonlinear otoacoustic emission components to low-dose styrene exposure.
Tognola, G; Chiaramello, E; Sisto, R; Moleti, A
2015-03-01
To investigate potential susceptibility of active cochlear mechanisms to low-level styrene exposure by comparing TEOAEs in workers and controls. Two advanced analysis techniques were applied to detect sub-clinical changes in linear and nonlinear cochlear mechanisms of OAE generation: the wavelet transform to decompose TEOAEs into time-frequency components and extract signal-to-noise ratio and latency of each component, and the bispectrum to detect and extract nonlinear TEOAE contributions as quadratic frequency couplings (QFCs). Two cohorts of workers were examined: subjects exposed exclusively to styrene (N = 9), and subjects exposed to styrene and noise (N = 6). The control group was perfectly matched by age and sex to the exposed group. Exposed subjects showed significantly lowered SNR in TEOAE components at mid-to-high frequencies (above 1.6 kHz) and a shift of QFC distribution towards lower frequencies than controls. No systematic differences were observed in latency. Low-level styrene exposure may have induced a modification of cochlear functionality as concerns linear and nonlinear OAE generation mechanisms. The lack of change in latency seems to suggest that the OAE components, where generation region and latency are tightly coupled, may not have been affected by styrene and noise exposure levels considered here.
Reducing the PAPR in FBMC-OQAM systems with low-latency trellis-based SLM technique
NASA Astrophysics Data System (ADS)
Bulusu, S. S. Krishna Chaitanya; Shaiek, Hmaied; Roviras, Daniel
2016-12-01
Filter-bank multi-carrier (FBMC) modulations, and more specifically FBMC-offset quadrature amplitude modulation (OQAM), are seen as an interesting alternative to orthogonal frequency division multiplexing (OFDM) for the 5th generation radio access technology. In this paper, we investigate the problem of peak-to-average power ratio (PAPR) reduction for FBMC-OQAM signals. Recently, it has been shown that FBMC-OQAM with a trellis-based selected mapping (TSLM) scheme is not only superior to any scheme based on a symbol-by-symbol approach but also outperforms OFDM with the classical SLM scheme. This paper is an extension of that work, in which we analyze the TSLM in terms of computational complexity, required hardware memory, and latency. We propose an improvement to the TSLM that requires much less hardware memory than the originally proposed TSLM and also has low latency. Additionally, the impact of the time duration of partial PAPR on the performance of TSLM is studied, and its lower bound is identified by proposing a suitable time duration. A thorough and fair comparison of performance is also made with an existing trellis-based scheme proposed in the literature. The simulation results show that the proposed low-latency TSLM yields better PAPR reduction performance with relatively low hardware memory requirements.
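For readers unfamiliar with the baseline, the sketch below illustrates PAPR and the classical symbol-by-symbol SLM idea on a single OFDM-like block. It is not the trellis-based TSLM of the paper and does not model FBMC-OQAM filtering; the candidate count, the four-phase alphabet and all function names are assumptions chosen for illustration.

```python
import cmath
import math
import random


def idft(symbols):
    """Naive inverse DFT: frequency-domain symbols to time-domain samples."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * math.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]


def papr_db(signal):
    """Peak-to-average power ratio of a complex signal, in dB."""
    powers = [abs(x) ** 2 for x in signal]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def classical_slm(symbols, num_candidates=8, seed=0):
    """Try several phase-rotated copies of one block and keep the lowest-PAPR one.

    Candidate 0 is the unrotated block, so the result is never worse than the input.
    Returns (best PAPR in dB, index of the chosen candidate, time-domain samples).
    """
    rng = random.Random(seed)
    best = None
    for index in range(num_candidates):
        if index == 0:
            phases = [1] * len(symbols)
        else:
            phases = [rng.choice([1, -1, 1j, -1j]) for _ in symbols]
        samples = idft([s * p for s, p in zip(symbols, phases)])
        value = papr_db(samples)
        if best is None or value < best[0]:
            best = (value, index, samples)
    return best


if __name__ == "__main__":
    block = [1] * 16  # worst case: all subcarriers in phase
    print("PAPR before SLM: %.2f dB" % papr_db(idft(block)))
    print("PAPR after SLM:  %.2f dB" % classical_slm(block)[0])
```

In a real system the index of the chosen candidate must be conveyed to the receiver as side information so that the phase rotation can be undone.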
Influence of fatigue and velocity on the latency and recruitment order of scapular muscles.
Mendez-Rebolledo, Guillermo; Gatica-Rojas, Valeska; Guzman-Muñoz, Eduardo; Martinez-Valdes, Eduardo; Guzman-Venegas, Rodrigo; Berral de la Rosa, Francisco Jose
2018-07-01
To determine the influence of velocity and fatigue on scapular muscle activation latency and recruitment order during a voluntary arm raise task in healthy individuals. Cross-sectional study. University laboratory. Twenty-three male adults per group (high-velocity and low-velocity). Onset latency of scapular muscles [anterior deltoid (AD), lower trapezius (LT), middle trapezius (MT), upper trapezius (UT), and serratus anterior (SA)] was assessed by surface electromyography. The participants were assigned to one of two groups: low-velocity or high-velocity. Both groups performed a voluntary arm raise task in the scapular plane under two conditions: no-fatigue and fatigue. The UT showed early activation (p < 0.01) in the fatigue condition when performing the arm raise task at a high velocity. At a low velocity and with no muscular fatigue, the recruitment order was MT, LT, SA, AD, and UT. However, the recruitment order changed in the high-velocity condition with muscular fatigue, becoming UT, AD, SA, LT, and MT. The simultaneous presence of fatigue and high velocity in an arm raise task is associated with a decrease in the UT activation latency and a modification of the recruitment order of scapular muscles. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Kavic, Michael; Yancey, Cregg C.; Bear, Brandon E.; Akukwe, Bernadine; Chen, Kevin; Dowell, Jayce; Gough, Jonathan D.; Kanner, Jonah; Obenberger, Kenneth; Shawhan, Peter; Simonetti, John H.; Taylor, Gregory B.; Tsai, Jr-Wei
2016-01-01
We explore opportunities for multi-messenger astronomy using gravitational waves (GWs) and prompt, transient low-frequency radio emission to study highly energetic astrophysical events. We review the literature on possible sources of correlated emission of GWs and radio transients, highlighting proposed mechanisms that lead to a short-duration, high-flux radio pulse originating from the merger of two neutron stars or from a superconducting cosmic string cusp. We discuss the detection prospects for each of these mechanisms by low-frequency dipole array instruments such as LWA1, the Low Frequency Array and the Murchison Widefield Array. We find that a broad range of models may be tested by searching for radio pulses that, when de-dispersed, are temporally and spatially coincident with a LIGO/Virgo GW trigger within a ~30 s time window and ~200-500 deg² sky region. We consider various possible observing strategies and discuss their advantages and disadvantages. Uniquely, for low-frequency radio arrays, dispersion can delay the radio pulse until after low-latency GW data analysis has identified and reported an event candidate, enabling a prompt radio signal to be captured by a deliberately targeted beam. If neutron star mergers do have detectable prompt radio emissions, a coincident search with the GW detector network and low-frequency radio arrays could increase the LIGO/Virgo effective search volume by up to a factor of ~2. For some models, we also map the parameter space that may be constrained by non-detections.
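The dispersion delay mentioned above can be estimated with the standard cold-plasma relation, in which the arrival delay relative to an infinitely high frequency is about 4.149 ms × DM × (ν/GHz)⁻², with the dispersion measure DM in pc cm⁻³. The sketch below applies that relation; the dispersion measure and observing frequency in the usage example are hypothetical values, not figures from the paper.

```python
# Dispersion constant expressed in seconds * MHz^2 per (pc cm^-3),
# equivalent to about 4.149 ms at 1 GHz for DM = 1 pc cm^-3.
DISPERSION_CONSTANT_S_MHZ2 = 4148.808


def dispersion_delay_s(dm_pc_cm3, freq_mhz):
    """Delay in seconds of a radio pulse at freq_mhz relative to infinite frequency."""
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    return DISPERSION_CONSTANT_S_MHZ2 * dm_pc_cm3 / freq_mhz ** 2


if __name__ == "__main__":
    # Hypothetical example: DM = 100 pc cm^-3 observed at 50 MHz.
    print("%.1f s" % dispersion_delay_s(100.0, 50.0))
```

Because the delay scales as the inverse square of frequency, it grows from milliseconds at GHz frequencies to minutes at tens of MHz, which is why the low-frequency arrays named above can be pointed after a GW alert.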
NASA Astrophysics Data System (ADS)
Yancey, Cregg C.; Bear, Brandon E.; Akukwe, Bernadine; Chen, Kevin; Dowell, Jayce; Gough, Jonathan D.; Kanner, Jonah; Kavic, Michael; Obenberger, Kenneth; Shawhan, Peter; Simonetti, John H.; Taylor, Gregory B.; Tsai, Jr-Wei
2015-10-01
We explore opportunities for multi-messenger astronomy using gravitational waves (GWs) and prompt, transient low-frequency radio emission to study highly energetic astrophysical events. We review the literature on possible sources of correlated emission of GWs and radio transients, highlighting proposed mechanisms that lead to a short-duration, high-flux radio pulse originating from the merger of two neutron stars or from a superconducting cosmic string cusp. We discuss the detection prospects for each of these mechanisms by low-frequency dipole array instruments such as LWA1, the Low Frequency Array and the Murchison Widefield Array. We find that a broad range of models may be tested by searching for radio pulses that, when de-dispersed, are temporally and spatially coincident with a LIGO/Virgo GW trigger within a ~30 s time window and ~200-500 deg² sky region. We consider various possible observing strategies and discuss their advantages and disadvantages. Uniquely, for low-frequency radio arrays, dispersion can delay the radio pulse until after low-latency GW data analysis has identified and reported an event candidate, enabling a prompt radio signal to be captured by a deliberately targeted beam. If neutron star mergers do have detectable prompt radio emissions, a coincident search with the GW detector network and low-frequency radio arrays could increase the LIGO/Virgo effective search volume by up to a factor of ~2. For some models, we also map the parameter space that may be constrained by non-detections.
Cueing properties of the decrease of white noise intensity for avoidance conditioning in cats.
Zieliński, K
1979-01-01
In the main experiment two groups of 6 cats each were trained in active bar-pressing avoidance to a CS consisting of either a 10 dB or 20 dB decrease of the background white noise of 70 dB intensity. The two groups did not differ in rapidity of learning; however, cats trained to the greater change in background noise performed avoidance responses with shorter latencies than did cats trained to the smaller change. Within-groups comparisons of cumulative distributions of response latencies for consecutive Vincentized fifths of avoidance acquisition showed the greatest changes in the region of latencies longer than the median latency of instrumental responses. On the other hand, the effects of CS intensity found in between-groups comparisons were located in the region of latencies shorter than the median latency of either group. Comparisons with data obtained in a complementary experiment employing an additional 17 cats showed that subjects trained to stimuli less intense than the background noise level were marked by an exceptionally low level of avoidance responding with latencies shorter than 1.1 s, which was lower than expected from the probability of intertrial responses for this period of time. Due to this property of stimuli less intense than the background, the distributions of response latencies were shifted to the right; in effect, prefrontal lesions influenced a greater part of the latency distributions than in cats trained to stimuli more intense than the background.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gebis, Joseph; Oliker, Leonid; Shalf, John
The disparity between microprocessor clock frequencies and memory latency is a primary reason why many demanding applications run well below peak achievable performance. Software-controlled scratchpad memories, such as the Cell local store, attempt to ameliorate this discrepancy by enabling precise control over memory movement; however, scratchpad technology confronts the programmer and compiler with an unfamiliar and difficult programming model. In this work, we present the Virtual Vector Architecture (ViVA), which combines the memory semantics of vector computers with a software-controlled scratchpad memory in order to provide a more effective and practical approach to latency hiding. ViVA requires minimal changes to the core design and could thus be easily integrated with conventional processor cores. To validate our approach, we implemented ViVA on the Mambo cycle-accurate full system simulator, which was carefully calibrated to match the performance on our underlying PowerPC Apple G5 architecture. Results show that ViVA is able to deliver significant performance benefits over scalar techniques for a variety of memory access patterns as well as two important memory-bound compact kernels, corner turn and sparse matrix-vector multiplication, achieving 2x-13x improvement compared to the scalar version. Overall, our preliminary ViVA exploration points to a promising approach for improving application performance on leading microprocessors with minimal design and complexity costs, in a power-efficient manner.
NASA Technical Reports Server (NTRS)
Pedretti, Kevin T.; Fineberg, Samuel A.; Kutler, Paul (Technical Monitor)
1997-01-01
A variety of different network technologies and topologies are currently being evaluated as part of the Whitney Project. This paper reports on the implementation and performance of a Fast Ethernet network configured in a 4x4 2D torus topology in a testbed cluster of 'commodity' Pentium Pro PCs. Several benchmarks were used for performance evaluation: an MPI point to point message passing benchmark, an MPI collective communication benchmark, and the NAS Parallel Benchmarks version 2.2 (NPB2). Our results show that for point to point communication on an unloaded network, the hub and 1 hop routes on the torus have about the same bandwidth and latency. However, the bandwidth decreases and the latency increases on the torus for each additional route hop. Collective communication benchmarks show that the torus provides roughly four times more aggregate bandwidth and eight times faster MPI barrier synchronizations than a hub based network for 16 processor systems. Finally, the SOAPBOX benchmarks, which simulate real-world CFD applications, generally demonstrated substantially better performance on the torus than on the hub. In the few cases the hub was faster, the difference was negligible. In total, our experimental results lead to the conclusion that for Fast Ethernet networks, the torus topology has better performance and scales better than a hub based network.
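Since bandwidth and latency on the torus are reported to degrade with each additional route hop, it helps to see how hop counts arise in a 4x4 2D torus. The sketch below assumes shortest-path routing with wrap-around links in both dimensions; the function name and coordinate convention are hypothetical, and the actual routing used in the testbed is not described in the abstract.

```python
def torus_hops(src, dst, size=4):
    """Shortest-path hop count between two nodes of a size x size 2D torus.

    Nodes are (x, y) tuples; each dimension wraps around, so the distance
    along one axis is the shorter of the direct and wrap-around paths.
    """
    hops = 0
    for s, d in zip(src, dst):
        delta = abs(s - d) % size
        hops += min(delta, size - delta)
    return hops


if __name__ == "__main__":
    nodes = [(x, y) for x in range(4) for y in range(4)]
    print("neighbor via wrap-around:", torus_hops((0, 0), (3, 0)))
    print("worst case (diameter):", max(torus_hops((0, 0), n) for n in nodes))
```

Under these assumptions no pair of the 16 nodes is more than four hops apart, which bounds the per-hop degradation described above.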
Preterm delivery at low gestational age: risk factors for short latency. A multivariated analysis
Marzano, Sara; Padula, Francesco; Meloni, Paolo; Anceschi, Maurizio Marco
2008-01-01
Objective: The aim of this study is to identify the risk factors for a short latency in preterm delivery at low gestational ages (GA). Study design: A retrospective analysis involving, between January 2004 and May 2006, 204 singleton pregnancies with admission diagnosis of preterm labor and, in particular, 91 pregnant women admitted between 24+0 and 31+6 weeks' gestation. Results: In pregnant women with a diagnosis of preterm labor at 24-31+6 weeks' gestation, ROC curve analysis showed that a score combining WBC and cervical dilatation, (75.237 - (2.290 * "WBC") - (10.787 * "cervical dilatation")) <= 33.101, has a 74.2% sensitivity and a 78.3% specificity in predicting a latency <= 4 days (+LR 3.42 and -LR 0.33) and a 70% sensitivity and an 84.3% specificity in predicting GA at delivery at 24-31 weeks' gestation (+LR 4.46 and -LR 0.36). Conclusion: We suggest stricter monitoring and more aggressive therapy in the presence of prognostic parameters of shorter latency. PMID:22439021
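The score and cutoff quoted in the Results can be written out directly, and the reported likelihood ratios follow from the stated sensitivity and specificity. The sketch below is illustrative only and not a clinical tool; the abstract does not give the units of WBC or cervical dilatation, so the input values in the usage example are hypothetical, as are the function names.

```python
CUTOFF = 33.101  # scores at or below this value are reported to predict short latency


def latency_score(wbc, cervical_dilatation):
    """Combined score from the abstract; input units are not stated in the source."""
    return 75.237 - 2.290 * wbc - 10.787 * cervical_dilatation


def predicts_short_latency(wbc, cervical_dilatation):
    """True when the score falls at or below the reported cutoff."""
    return latency_score(wbc, cervical_dilatation) <= CUTOFF


def likelihood_ratios(sensitivity, specificity):
    """Return (+LR, -LR) computed from sensitivity and specificity as fractions."""
    positive = sensitivity / (1.0 - specificity)
    negative = (1.0 - sensitivity) / specificity
    return positive, negative


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the formula.
    print(predicts_short_latency(15.0, 2.0))
    print([round(v, 2) for v in likelihood_ratios(0.742, 0.783)])
```

Recomputing the likelihood ratios from the stated sensitivity and specificity reproduces the +LR and -LR values given in the abstract to two decimal places.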
Analysis of performance improvements for host and GPU interface of the APENet+ 3D Torus network
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Frezza, O.; Lo Cicero, F.; Lonardo, A.; Paolucci, P. S.; Rossetti, D.; Simula, F.; Tosoratto, L.; Vicini, P.
2014-06-01
APEnet+ is an INFN (Italian Institute for Nuclear Physics) project aiming to develop a custom 3-dimensional torus interconnect network optimized for hybrid CPU-GPU clusters dedicated to high performance scientific computing. The APEnet+ interconnect fabric is built on an FPGA-based PCI Express board with 6 bi-directional off-board links showing 34 Gbps of raw bandwidth per direction, and leverages the peer-to-peer capabilities of Fermi- and Kepler-class NVIDIA GPUs to obtain real zero-copy, GPU-to-GPU low latency transfers. The minimization of APEnet+ transfer latency is achieved through the adoption of an RDMA protocol implemented in FPGA with specialized hardware blocks tightly coupled with an embedded microprocessor. This architecture provides a high performance, low latency offload engine for both the transmit and receive sides of data transactions: preliminary results are encouraging, showing a 50% bandwidth increase for large packet size transfers. In this paper we describe the APEnet+ architecture, detailing the hardware implementation, and discuss the impact of such RDMA specialized hardware on host interface latency and bandwidth.
NASA Technical Reports Server (NTRS)
Kikuchi, Hideaki; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya; Shimojo, Fuyuki; Saini, Subhash
2003-01-01
Scalability of a low-cost, Intel Xeon-based, multi-Teraflop Linux cluster is tested for two high-end scientific applications: Classical atomistic simulation based on the molecular dynamics method and quantum mechanical calculation based on the density functional theory. These scalable parallel applications use space-time multiresolution algorithms and feature computational-space decomposition, wavelet-based adaptive load balancing, and spacefilling-curve-based data compression for scalable I/O. Comparative performance tests are performed on a 1,024-processor Linux cluster and a conventional higher-end parallel supercomputer, 1,184-processor IBM SP4. The results show that the performance of the Linux cluster is comparable to that of the SP4. We also study various effects, such as the sharing of memory and L2 cache among processors, on the performance.
MagPy: A Python toolbox for controlling Magstim transcranial magnetic stimulators.
McNair, Nicolas A
2017-01-30
To date, transcranial magnetic stimulation (TMS) studies manipulating stimulation parameters have largely used blocked paradigms. However, altering these parameters on a trial-by-trial basis in Magstim stimulators is complicated by the need to send regular (1Hz) commands to the stimulator. Additionally, effecting such control interferes with the ability to send TMS pulses or simultaneously present stimuli with high-temporal precision. This manuscript presents the MagPy toolbox, a Python software package that provides full control over Magstim stimulators via the serial port. It is able to maintain this control with no impact on concurrent processing, such as stimulus delivery. In addition, a specially-designed "QuickFire" serial cable is specified that allows MagPy to trigger TMS pulses with very low-latency. In a series of experimental simulations, MagPy was able to maintain uninterrupted remote control over the connected Magstim stimulator across all testing sessions. In addition, having MagPy enabled had no effect on stimulus timing - all stimuli were presented for precisely the duration specified. Finally, using the QuickFire cable, MagPy was able to elicit TMS pulses with sub-millisecond latencies. The MagPy toolbox allows for experiments that require manipulating stimulation parameters from trial to trial. Furthermore, it can achieve this in contexts that require tight control over timing, such as those seeking to combine TMS with fMRI or EEG. Together, the MagPy toolbox and QuickFire serial cable provide an effective means for controlling Magstim stimulators during experiments while ensuring high-precision timing. Copyright © 2016 Elsevier B.V. All rights reserved.
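The abstract describes the need to send regular (1 Hz) commands without disturbing concurrent processing. One generic way to do this is a background keep-alive loop, sketched below. This is not MagPy's actual API or implementation: the `KeepAlive` class and the `send` callback are hypothetical, and a real setup would pass a function that writes the appropriate command to the serial port.

```python
import threading
import time


class KeepAlive:
    """Call `send` at a fixed interval on a background thread (illustrative)."""

    def __init__(self, send, interval_s=1.0):
        self._send = send              # hypothetical callback, e.g. a serial write
        self._interval_s = interval_s  # 1.0 s corresponds to the 1 Hz rate
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called,
        # so the loop sends one command per interval until it is stopped.
        while not self._stop.wait(self._interval_s):
            self._send()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


if __name__ == "__main__":
    sent = []
    keeper = KeepAlive(lambda: sent.append(time.monotonic()), interval_s=0.05)
    keeper.start()
    time.sleep(0.3)  # the main thread stays free, e.g. for stimulus delivery
    keeper.stop()
    print(len(sent), "keep-alive commands sent")
```

A thread is used here only for brevity; timing-critical experiments may prefer a separate process so that the keep-alive loop cannot compete with stimulus presentation.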
Effect of experimental stress in 2 different pain conditions affecting the facial muscles.
Woda, Alain; L'heveder, Gildas; Ouchchane, Lemlih; Bodéré, Céline
2013-05-01
Chronic facial muscle pain is a common feature in both fibromyalgia (FM) and myofascial (MF) pain conditions. In this controlled study, a possible difference in the mode of deregulation of the physiological response to a stressing stimulus was explored by applying an acute mental stress to FM and MF patients and to controls. The effects of the stress test were observed on pain, sympathetic variables, and both tonic and reflex electromyographic activities of masseteric and temporal muscles. The statistical analyses were performed through a generalized linear model including mixed effects. Painful reaction to the stressor was stronger (P < .001) and longer (P = .011) in FM than in MF independently of a higher pain level at baseline. The stress-induced autonomic changes only seen in FM patients did not reach significance. The electromyographic responses to the stress test were strongest for controls and weakest for FM. The stress test had no effect on reflex activity (area under the curve [AUC]) or latency, although AUC was high in FM and latencies were low in both pain groups. It is suggested that FM is characterized by a lower ability to adapt to acute stress than MF. This study showed that an acute psychosocial stress triggered several changes in 2 pain conditions including an increase in pain of larger amplitude in FM than in MF pain. Similar stress-induced changes should be explored as possible mechanisms for differentiation between dysfunctional pain conditions. Copyright © 2013 American Pain Society. Published by Elsevier Inc. All rights reserved.
Richter, Lars; Bruder, Ralf
2013-05-01
Most medical robotic systems require direct interaction or contact with the robot. Force-Torque (FT) sensors can easily be mounted to the robot to control the contact pressure. However, evaluation is often done in software, which leads to latencies. To overcome that, we developed an independent safety system, named FTA sensor, which is based on an FT sensor and an accelerometer. An embedded system (ES) runs a real-time monitoring system that continuously checks the readings. In case of a collision or error, it instantaneously stops the robot via the robot's external emergency stop. We found that the ES implementing the FTA sensor has a maximum latency of [Formula: see text] ms to trigger the robot's emergency stop. For the standard settings in the application of robotized transcranial magnetic stimulation, the robot will stop after at most 4 mm. Therefore, it works as an independent safety layer protecting the patient and/or operator from serious harm.
Getting the “kill” into “shock and kill”: strategies to eliminate latent HIV
Kim, Youry; Anderson, Jenny L.; Lewin, Sharon R.
2018-01-01
Despite the success of antiretroviral therapy (ART), there is currently no HIV cure and treatment is lifelong. HIV persists during ART due to long lived and proliferating latently infected CD4+ T-cells. One strategy to eliminate latency is to activate virus production using latency reversing agents (LRAs) with the goal of triggering cell death through virus-induced cytolysis or immune-mediated clearance. However, multiple studies have demonstrated that activation of viral transcription alone is insufficient to induce cell death and some LRAs may counteract cell death by promoting cell survival. Here, we review new approaches to induce death of latently infected cells through apoptosis and inhibition of pathways critical for cell survival, which are often hijacked by HIV proteins. Given advances in the commercial development of compounds that induce apoptosis in cancer chemotherapy, these agents could move rapidly into clinical trials, either alone or in combination with LRAs, to eliminate latent HIV infection. PMID:29324227
NASA Astrophysics Data System (ADS)
Topchiev, N. P.; Galper, A. M.; Arkhangelskiy, A. I.; Arkhangelskaja, I. V.; Kheymits, M. D.; Suchkov, S. I.; Yurkin, Y. T.
2017-01-01
The GAMMA-400 (Gamma Astronomical Multifunctional Modular Apparatus) scientific project belongs to the new generation of space observatories intended to perform an indirect search for signatures of dark matter in the cosmic-ray fluxes and to measure the characteristics of diffuse gamma-ray emission and gamma-rays from the Sun during periods of solar activity, gamma-ray bursts, extended and point gamma-ray sources, and electron/positron and cosmic-ray nuclei fluxes up to the TeV energy region by means of the GAMMA-400 gamma-ray telescope, which represents the core of the scientific complex. The system of triggers and counting signals formation of the GAMMA-400 gamma-ray telescope constitutes a pipelined processor structure that collects data from the gamma-ray telescope subsystems and produces summary information used in forming the trigger decision for each event. The system design is based on the use of state-of-the-art reconfigurable logic devices and fast data links. The basic structure, logic of operation and distinctive features of the system are presented.
Data latency and the user community
NASA Astrophysics Data System (ADS)
Escobar, V. M.; Brown, M. E.; Carroll, M.
2013-12-01
The community using NASA Earth science observations in applications has grown significantly, with increasing sophistication to serve national interests. The National Research Council's Earth Science Decadal Survey report stated that the planning for applied and operational considerations in the missions should accompany the acquisition of new knowledge about Earth (NRC, 2007). This directive has made product applications at NASA an integral part of converting the data collected into actionable knowledge that can be used to inform policy. However, successfully bridging scientific research with operational decision making in different application areas requires looking into user data requirements and operational needs. This study was conducted to determine how users are incorporating NASA data into applications and operational processes. The approach included a review of published materials, direct interviews with mission representatives, and an online professional review, which was distributed to over 6000 individuals. We provide a complete description of the findings with definitions and explanations of what goes into measuring latency as well as how users and applications utilize NASA data products. We identified 3 classes of users: operational (need data in 3 hours or less), near real time (need data within a day of acquisition), and scientific users (need highest quality data, time independent). We also determined that most users with applications are interested in specific types of products that may come from multiple missions. These users will take the observations when they are available, however the observations may have additional applications value if they are available either by a certain time of day or within a period of time after acquisition. NASA has supported the need for access to low latency data on an ad-hoc basis and more substantively in stand-alone systems such as the MODIS Rapid Response system and more recently with LANCE. 
The increased level of support and advertising of that support has grown the community of individuals and organizations that use low-latency science data to supply decision making processes with updated information. These applications are increasingly high profile and have high societal value, which is of importance to NASA, the US Government and the broader society. The primary conclusions of this report revolve around clarifying NASA's intentions to support operations and near real time needs for quick access to data after acquisition. Specifically, clear statements from NASA Headquarters are needed to indicate the level of support that will be provided for low latency data products and whether that support should come directly from the missions or whether this support will come from separate stand-alone systems or direct broadcast. The results of the analysis clearly show the significant benefit to society for serving the needs of the agricultural, emergency response, environmental monitoring and weather communities who use low latency data today, and to grow the use of low latency NASA science data products into new communities in the future. These benefits can be achieved with a clear and consistent NASA policy on product latency.
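The three user classes identified in the abstract are defined by how quickly data are needed, which can be expressed as a simple lookup. This is a minimal sketch: the 3-hour threshold is from the abstract, "within a day" is taken here to mean 24 hours, and the function name and the use of `None` for "no latency requirement" are illustrative choices.

```python
def classify_user(max_latency_hours):
    """Map a user's latency requirement to one of the three reported classes.

    max_latency_hours: the longest acceptable delay after acquisition,
    or None when the user has no latency requirement.
    """
    if max_latency_hours is None:
        return "scientific"        # highest quality data, time independent
    if max_latency_hours <= 3:
        return "operational"       # needs data in 3 hours or less
    if max_latency_hours <= 24:    # assumption: "within a day" = 24 hours
        return "near real time"
    return "scientific"


if __name__ == "__main__":
    for need in (1, 3, 12, 24, 72, None):
        print(need, "->", classify_user(need))
```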
Accommodation and vergence latencies in human infants
Tondel, Grazyna M.; Candy, T. Rowan
2008-01-01
Purpose Achieving simultaneous single and clear visual experience during postnatal development depends on the temporal relationship between accommodation and vergence, in addition to their accuracies. This study was designed to examine one component of the dynamic relationship, the latencies of the responses. Methods Infants and adults were tested in three conditions i) Binocular viewing of a target moving in depth at 5cm/s (closed loop) ii) monocular viewing of the same target (vergence open loop) iii) binocular viewing of a low spatial frequency Difference of Gaussian target during a prism induced step change in retinal disparity (accommodation open loop). Results There was a significant correlation between accommodation and vergence latencies in binocular conditions for infants from 7 to 23 weeks of age. Some of the infants, as young as 7 or 8 weeks, generated adult-like latencies of less than 0.5 s. Latencies in the vergence open loop and accommodation open loop conditions tended to be shorter for the stimulated system than the open loop system in both cases, and all latencies were typically less than 2 seconds across the infant age range. Conclusions Many infants between 7 and 23 weeks of age were able to generate accommodation and vergence responses with latencies of less than a second in full binocular closed loop conditions. The correlation between the latencies in the two systems suggests that they are limited by related factors from the earliest ages tested. PMID:18199466
Accommodation and vergence latencies in human infants.
Tondel, Grazyna M; Candy, T Rowan
2008-02-01
Achieving simultaneous single and clear visual experience during postnatal development depends on the temporal relationship between accommodation and vergence, in addition to their accuracies. This study was designed to examine one component of the dynamic relationship, the latencies of the responses. Infants and adults were tested in three conditions (i) binocular viewing of a target moving in depth at 5 cm/s (closed loop) (ii) monocular viewing of the same target (vergence open loop) (iii) binocular viewing of a low spatial frequency Difference of Gaussian target during a prism induced step change in retinal disparity (accommodation open loop). There was a significant correlation between accommodation and vergence latencies in binocular conditions for infants from 7 to 23 weeks of age. Some of the infants, as young as 7 or 8 weeks, generated adult-like latencies of less than 0.5 s. Latencies in the vergence open loop and accommodation open loop conditions tended to be shorter for the stimulated system than the open loop system in both cases, and all latencies were typically less than 2 s across the infant age range. Many infants between 7 and 23 weeks of age were able to generate accommodation and vergence responses with latencies of less than a second in full binocular closed loop conditions. The correlation between the latencies in the two systems suggests that they are limited by related factors from the earliest ages tested.
Cooperative use of advanced scanning technology for low-volume hardwood processors
Luis G. Occeña; Timothy J. Rayner; Daniel L. Schmoldt; A. Lynn Abbott
2001-01-01
Of the several hundreds of hardwood lumber sawmills across the country, the majority are small- to medium-sized facilities operated as small businesses in rural communities. Trends of increased log costs and limited availability are forcing wood processors to become more efficient in their operations. Still, small mills are less able to adopt new, more efficient...
The Minerva Multi-Microprocessor.
A multiprocessor system is described which is an experiment in low cost, extensible, multiprocessor architectures. Global issues such as inclusion of a central bus, design of the bus arbiter, and methods of interrupt handling are considered. The system initially includes two processor types, based on microprocessors, and these are discussed. Methods for reducing processor demand for the central bus are described.
A Wireless Headstage for Combined Optogenetics and Multichannel Electrophysiological Recording.
Gagnon-Turcotte, Gabriel; LeChasseur, Yoan; Bories, Cyril; Messaddeq, Younes; De Koninck, Yves; Gosselin, Benoit
2017-02-01
This paper presents a wireless headstage with real-time spike detection and data compression for combined optogenetics and multichannel electrophysiological recording. The proposed headstage, which is intended to perform both optical stimulation and electrophysiological recordings simultaneously in freely moving transgenic rodents, is entirely built with commercial off-the-shelf components, and includes 32 recording channels and 32 optical stimulation channels. It can detect, compress and transmit full action potential waveforms over 32 channels in parallel and in real time using an embedded digital signal processor based on a low-power field programmable gate array and a Microblaze microprocessor softcore. This processor implements a complete digital spike detector featuring a novel adaptive threshold based on a Sigma-delta control loop, and a wavelet data compression module using a new dynamic coefficient re-quantization technique achieving large compression ratios with higher signal quality. Simultaneous optical stimulation and recording have been performed in-vivo using an optrode featuring 8 microelectrodes and 1 implantable fiber coupled to a 465-nm LED, in the somatosensory cortex and the hippocampus of a transgenic mouse expressing Channelrhodopsin (Thy1::ChR2-YFP line 4) under anesthetized conditions. Experimental results show that the proposed headstage can trigger neuron activity while collecting, detecting and compressing single cell microvolt amplitude activity from multiple channels in parallel while achieving overall compression ratios above 500. This is the first reported high-channel count wireless optogenetic device providing simultaneous optical stimulation and recording. Measured characteristics show that the proposed headstage can achieve a true positive detection rate of up to 100% for signal-to-noise ratios (SNR) down to 15 dB, and up to 97.28% at SNR as low as 5 dB.
The implemented prototype features a lifespan of up to 105 minutes, and uses a lightweight (2.8 g) and compact [Formula: see text] rigid-flex printed circuit board.
Method for fast start of a fuel processor
Ahluwalia, Rajesh K [Burr Ridge, IL; Ahmed, Shabbir [Naperville, IL; Lee, Sheldon H. D. [Willowbrook, IL
2008-01-29
An improved fuel processor for fuel cells is provided whereby the startup time of the processor is less than sixty seconds and can be as low as 30 seconds, if not less. A rapid startup time is achieved by either igniting or allowing a small mixture of air and fuel to react over and warm up the catalyst of an autothermal reformer (ATR). The ATR then produces combustible gases to be subsequently oxidized on and simultaneously warm up water-gas shift zone catalysts. After normal operating temperature has been achieved, the proportion of air included with the fuel is greatly diminished.
NASA Astrophysics Data System (ADS)
Chen, Ming-Chih; Hsiao, Shen-Fu
In this paper, we propose an area-efficient design of an Advanced Encryption Standard (AES) processor by applying a new common-subexpression-elimination (CSE) method to the sub-functions of various transformations required in AES. The proposed method reduces the area cost of realizing the sub-functions by extracting the common factors in the bit-level XOR/AND-based sum-of-product expressions of these sub-functions using a new CSE algorithm. Cell-based implementation results show that the AES processor with our proposed CSE method has significant area improvement compared with previous designs.
NASA Astrophysics Data System (ADS)
Zhang, Yuli; Han, Jun; Weng, Xinqian; He, Zhongzhu; Zeng, Xiaoyang
This paper presents an Application Specific Instruction-set Processor (ASIP) for the SHA-3 BLAKE algorithm family, built by instruction set extensions (ISE) to a RISC (reduced instruction set computer) processor. With a design space exploration for this ASIP to increase the performance and reduce the area cost, we accomplish an efficient hardware and software implementation of the BLAKE algorithm. The special instructions and their well-matched hardware function unit improve the calculation of the key section of the algorithm, namely the G-functions. Also, relaxing the time constraint of the special function unit can decrease its hardware cost while keeping the high data throughput of the processor. Evaluation results reveal that the ASIP achieves 335 Mbps and 176 Mbps for BLAKE-256 and BLAKE-512, respectively. The extra area cost is only 8.06k equivalent gates. The proposed ASIP outperforms several software approaches on various platforms in cycles per byte. In fact, both the high throughput and the low hardware cost achieved by this programmable processor are comparable to those of ASIC implementations.
NASA Astrophysics Data System (ADS)
Xie, Yiwei; Geng, Zihan; Zhuang, Leimeng; Burla, Maurizio; Taddei, Caterina; Hoekman, Marcel; Leinse, Arne; Roeloffzen, Chris G. H.; Boller, Klaus-J.; Lowery, Arthur J.
2017-12-01
Integrated optical signal processors have been identified as a powerful engine for optical processing of microwave signals. They enable wideband and stable signal processing operations on miniaturized chips with ultimate control precision. As a promising application, such processors enable photonic implementations of reconfigurable radio frequency (RF) filters with wide design flexibility, large bandwidth, and high-frequency selectivity. This is a key technology for photonic-assisted RF front ends that opens a path to overcoming the bandwidth limitation of current digital electronics. Here, the recent progress of integrated optical signal processors for implementing such RF filters is reviewed. We highlight the use of a low-loss, high-index-contrast stoichiometric silicon nitride waveguide, which promises to serve as a practical material platform for realizing high-performance optical signal processors and points toward photonic RF filters with digital signal processing (DSP)-level flexibility, hundreds-GHz bandwidth, MHz-band frequency selectivity, and full system integration on a chip scale.
Alvarenga, Kátia de Freitas; Alvarez Bernardez-Braga, Gabriela Rosito; Zucki, Fernanda; Duarte, Josilene Luciene; Lopes, Andrea Cintra; Feniman, Mariza Ribeiro
2013-01-01
Summary Introduction: The effects of lead on children's health have been widely studied. Aim: To analyze the correlation between the long latency auditory evoked potential N2 and cognitive P3 with the level of lead poisoning in Brazilian children. Methods: This retrospective study evaluated 20 children ranging in age from 7 to 14 years at the time of audiological and electrophysiological evaluations. We performed periodic surveys of the lead concentration in the blood and basic audiological evaluations. Furthermore, we studied the auditory evoked potential long latency N2 and cognitive P3 by analyzing the absolute latency of the N2 and P3 potentials and the P3 amplitude recorded at Cz. At the time of audiological and electrophysiological evaluations, the average concentration of lead in the blood was less than 10 ug/dL. Results: In conventional audiologic evaluations, all children had hearing thresholds below 20 dBHL for the frequencies tested and normal tympanometry findings; the auditory evoked potential long latency N2 and cognitive P3 were present in 95% of children. No significant correlations were found between the blood lead concentration and latency (p = 0.821) or amplitude (p = 0.411) of the P3 potential. However, the latency of the N2 potential increased with the concentration of lead in the blood, with a significant correlation (p = 0.030). Conclusion: Among Brazilian children with low lead exposure, a significant correlation was found between blood lead levels and the average latency of the auditory evoked potential long latency N2; however, a significant correlation was not observed for the amplitude and latency of the cognitive potential P3. PMID:25991992
DOE Office of Scientific and Technical Information (OSTI.GOV)
Milic, A.
The high luminosities of L > 10^34 cm^-2 s^-1 at the Large Hadron Collider (LHC) at CERN produce an intense radiation environment that the detectors and their electronics must withstand. The ATLAS detector is a multi-purpose apparatus constructed to explore the new particle physics regime opened by the LHC. Of the many decay particles observed by the ATLAS detector, the energy of the created electrons and photons is measured by a sampling calorimeter technique that uses Liquid Argon (LAr) as its active medium. The front end (FE) electronic readout of the ATLAS LAr calorimeter located on the detector itself consists of a combined analog and digital processing system. In order to exploit the higher luminosity while keeping the same trigger bandwidth of 100 kHz, higher transverse granularity, higher resolution and longitudinal shower shape information will be provided from the LAr calorimeter to the Level-1 trigger processors. New trigger readout electronics have been designed for this purpose, which will withstand the radiation dose levels expected for an integrated luminosity of 3000 fb^-1 during the high luminosity LHC (HL-LHC), which is well above the original LHC design qualifications. (authors)
NASA Technical Reports Server (NTRS)
Ellis, S. R.; Adelstein, B. D.; Baumeler, S.; Jense, G. J.; Jacoby, R. H.; Trejo, Leonard (Technical Monitor)
1998-01-01
Several common defects that we have sought to minimize in immersing virtual environments are: static sensor spatial distortion, visual latency, and low update rates. Human performance within our environments during large amplitude 3D tracking was assessed by objective and subjective methods in the presence and absence of these defects. Results show that 1) removal of our relatively small spatial sensor distortion had minor effects on the tracking activity, 2) an Adapted Cooper-Harper controllability scale proved the most sensitive subjective indicator of the degradation of dynamic fidelity caused by increasing latency and decreasing frame rates, and 3) performance, as measured by normalized RMS tracking error or subjective impressions, was more markedly influenced by changing visual latency than by update rate.
Reconfigurable-logic-based fiber channel network card
NASA Astrophysics Data System (ADS)
Casselman, Steve
1996-10-01
Currently all networking hardware must have predefined tradeoffs between latency and bandwidth. In some applications one feature is more important than the other. We present a system where the tradeoff can be made on a case by case basis. To show this we implement an extremely low latency semaphore passing network within a point to point system.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-27
... Latency Network Connections December 20, 2011. I. Introduction On October 31, 2011, NASDAQ OMX BX, Inc... establish a program for offering low latency network connections and to establish the initial fees for such connections. The Exchange also proposed administrative modifications to Exchange Rule 7034. The proposed rule...
Benoist, Jean-Michel; Pincedé, Ivanne; Ballantyne, Kay; Plaghki, Léon; Le Bars, Daniel
2008-09-03
The quantitative end-point for many behavioral tests of nociception is the reaction time, i.e. the time lapse between the beginning of the application of a stimulus, e.g. heat, and the evoked response. Since it is technically impossible to heat the skin instantaneously by conventional means, the question of the significance of the reaction time to radiant heat remains open. We developed a theoretical framework, a related experimental paradigm and a model to analyze in psychophysical terms the "tail-flick" responses of rats to random variations of noxious radiant heat. A CO2 laser was used to avoid the drawbacks associated with standard methods of thermal stimulation. Heating of the skin was recorded with an infrared camera and was stopped by the reaction of the animal. For the first time, we define and determine two key descriptors of the behavioral response, namely the behavioral threshold (Tbeta) and the behavioral latency (Lbeta). By employing more than one site of stimulation, the paradigm allows determination of the conduction velocity of the peripheral fibers that trigger the response (V) and an estimation of the latency (Ld) of the central decision-making process. Ld (approximately 130 ms) is unaffected by ambient or skin temperature changes that affect the behavioral threshold (approximately 42.2-44.9 °C in the 20-30 °C range), behavioral latency (<500 ms), and the conduction velocity of the peripheral fibers that trigger the response (approximately 0.35-0.76 m/s in the 20-30 °C range). We propose a simple model that is verified experimentally and that computes the variations in the so-called "tail-flick latency" (TFL) caused by changes in either the power of the radiant heat source, the initial temperature of the skin, or the site of stimulation along the tail.
This approach enables the behavioral determinations of latent psychophysical (Tbeta, Lbeta, Ld) and neurophysiological (V) variables that have been previously inaccessible with conventional methods. Such an approach satisfies the repeated requests for improving nociceptive tests and offers a potentially heuristic progress for studying nociceptive behavior on more firm physiological and psychophysical grounds. The validity of using a reaction time of a behavioral response to an increasing heat stimulus as a "pain index" is challenged. This is illustrated by the predicted temperature-dependent variations of the behavioral TFL elicited by spontaneous variations of the temperature of the tail for thermoregulation.
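The abstract states that stimulating more than one site allows the conduction velocity V and the decision latency Ld to be determined. The underlying two-site arithmetic can be sketched as follows. This is a simplified illustration and not the authors' model: it assumes the behavioral latency at each site is the sum of a peripheral conduction time (distance / V) and a site-independent Ld, and the distances and latencies in the usage example are hypothetical numbers chosen only to fall within the ranges quoted in the abstract.

```python
def conduction_velocity(d_near_m, d_far_m, lat_near_s, lat_far_s):
    """Estimate V (m/s) from two stimulation sites.

    The extra latency at the far site is attributed entirely to the
    extra conduction distance (an assumption of this sketch).
    """
    return (d_far_m - d_near_m) / (lat_far_s - lat_near_s)


def decision_latency(latency_s, distance_m, velocity_m_s):
    """Estimate Ld (s) by removing the peripheral conduction time."""
    return latency_s - distance_m / velocity_m_s


if __name__ == "__main__":
    # Hypothetical measurements: distances along the tail, behavioral latencies
    v = conduction_velocity(d_near_m=0.05, d_far_m=0.15,
                            lat_near_s=0.23, lat_far_s=0.43)
    ld = decision_latency(latency_s=0.23, distance_m=0.05, velocity_m_s=v)
    print(f"V = {v:.2f} m/s, Ld = {ld * 1000:.0f} ms")
```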
Bechoff, Aurélie; Tomlins, Keith Ian; Chijioke, Ugo; Ilona, Paul; Westby, Andrew; Boy, Erick
2018-01-01
Gari, a fermented and dried semolina made from cassava, is one of the most common foods in West Africa. Recently introduced biofortified yellow cassava containing provitamin A carotenoids could help tackle vitamin A deficiency prevalent in those areas. However there are concerns because of the low retention of carotenoids during gari processing compared to other processes (e.g. boiling). The aim of the study was to assess the levels of true retention in trans–β-carotene during gari processing and investigate the causes of low retention. Influence of processing step, processor (3 commercial processors) and variety (TMS 01/1371; 01/1368 and 01/1412) were assessed. It was shown that low true retention (46% on average) during gari processing may be explained by not only chemical losses (i.e. due to roasting temperature) but also by physical losses (i.e. due to leaching of carotenoids in discarded liquids): true retention in the liquid lost from grating negatively correlated with true retention retained in the mash (R = -0.914). Moreover, true retention followed the same pattern as lost water at the different processing steps (i.e. for the commercial processors). Variety had a significant influence on true retention, carotenoid content, and trans-cis isomerisation but the processor type had little effect. It is the first time that the importance of physical carotenoid losses was demonstrated during processing of biofortified crops. PMID:29561886
Bechoff, Aurélie; Tomlins, Keith Ian; Chijioke, Ugo; Ilona, Paul; Westby, Andrew; Boy, Erick
2018-01-01
Gari, a fermented and dried semolina made from cassava, is one of the most common foods in West Africa. Recently introduced biofortified yellow cassava containing provitamin A carotenoids could help tackle vitamin A deficiency prevalent in those areas. However, there are concerns because of the low retention of carotenoids during gari processing compared to other processes (e.g. boiling). The aim of the study was to assess the levels of true retention in trans-β-carotene during gari processing and investigate the causes of low retention. Influence of processing step, processor (3 commercial processors) and variety (TMS 01/1371; 01/1368 and 01/1412) were assessed. It was shown that low true retention (46% on average) during gari processing may be explained by not only chemical losses (i.e. due to roasting temperature) but also by physical losses (i.e. due to leaching of carotenoids in discarded liquids): true retention in the liquid lost from grating negatively correlated with true retention retained in the mash (R = -0.914). Moreover, true retention followed the same pattern as lost water at the different processing steps (i.e. for the commercial processors). Variety had a significant influence on true retention, carotenoid content, and trans-cis isomerisation but the processor type had little effect. It is the first time that the importance of physical carotenoid losses was demonstrated during processing of biofortified crops.
Seeing the hand while reaching speeds up on-line responses to a sudden change in target position
Reichenbach, Alexandra; Thielscher, Axel; Peer, Angelika; Bülthoff, Heinrich H; Bresciani, Jean-Pierre
2009-01-01
Goal-directed movements are executed under the permanent supervision of the central nervous system, which continuously processes sensory afferents and triggers on-line corrections if movement accuracy seems to be compromised. For arm reaching movements, visual information about the hand plays an important role in this supervision, notably improving reaching accuracy. Here, we tested whether visual feedback of the hand affects the latency of on-line responses to an external perturbation when reaching for a visual target. Two types of perturbation were used: visual perturbation consisted in changing the spatial location of the target and kinesthetic perturbation in applying a force step to the reaching arm. For both types of perturbation, the hand trajectory and the electromyographic (EMG) activity of shoulder muscles were analysed to assess whether visual feedback of the hand speeds up on-line corrections. Without visual feedback of the hand, on-line responses to visual perturbation exhibited the longest latency. This latency was reduced by about 10% when visual feedback of the hand was provided. On the other hand, the latency of on-line responses to kinesthetic perturbation was independent of the availability of visual feedback of the hand. In a control experiment, we tested the effect of visual feedback of the hand on visual and kinesthetic two-choice reaction times – for which coordinate transformation is not critical. Two-choice reaction times were never facilitated by visual feedback of the hand. Taken together, our results suggest that visual feedback of the hand speeds up on-line corrections when the position of the visual target with respect to the body must be re-computed during movement execution. This facilitation probably results from the possibility to map hand- and target-related information in a common visual reference frame. PMID:19675067
Fulton, Jeremy; LeMoine, Christophe M R; Bucking, Carol; Brix, Kevin V; Walsh, Patrick J; McDonald, M Danielle
2017-03-15
The Gulf toadfish (Opsanus beta) has a fully functional ornithine urea cycle (O-UC) that allows it to excrete nitrogenous waste in the form of urea. Interestingly, urea is excreted in a pulse across the gill that lasts 1-3 h and occurs once or twice a day. Both the stress hormone, cortisol, and the neurotransmitter, serotonin (5-HT), are involved in the control of pulsatile urea excretion. This and other evidence suggests that urea pulsing may be linked to toadfish social behavior. The hypothesis of the present study was that toadfish urea pulses can be triggered by waterborne chemical cues from conspecifics. Our findings indicate that exposure to seawater that held a donor conspecific for up to 48 h (pre-conditioned seawater; PC-SW) induced a urea pulse within 7 h in naïve conspecifics compared to a pulse latency of 20 h when exposed to seawater alone. Factors such as PC-SW intensity and donor body mass influenced the pulse latency response of naïve conspecifics. Fractionation and heat treatment of PC-SW to narrow possible signal candidates revealed that the active chemical was both water-soluble and heat-stable. Fish exposed to urea, cortisol or 5-HT in seawater did not have a pulse latency that was significantly different than seawater alone; however, ammonia, perhaps in the form of NH4Cl, was found to be a factor in the pulse latency response of toadfish to PC-SW and could be one component of a multi-component cue used for chemical communication in toadfish. Further studies are needed to fully identify the chemical cue as well as determine its adaptive significance in this marine teleost fish. Copyright © 2016. Published by Elsevier Inc.
NASA Astrophysics Data System (ADS)
Kim, Ronny Yongho; Jung, Inuk; Kim, Young Yong
IEEE 802.16m is an advanced air interface standard which is under development for IMT-Advanced systems, known as 4G systems. IEEE 802.16m is designed to provide a high data rate and a Quality of Service (QoS) level in order to meet user service requirements, and is especially suitable for mobile environments. There are several factors that have great impact on such requirements. As one of the major factors, we mainly focus on latency issues. In IEEE 802.16m, an enhanced layer 2 handover scheme, described as Entry Before Break (EBB), was proposed and adopted to reduce handover latency. EBB provides significant handover interruption time reduction with respect to the legacy IEEE 802.16 handover scheme. Fast handovers for mobile IPv6 (FMIPv6) was standardized by the Internet Engineering Task Force (IETF) in order to provide reduced handover interruption time from the IP layer perspective. Since FMIPv6 utilizes link layer triggers to reduce handover latency, it is critical to jointly design FMIPv6 with its underlying link layer protocol. However, FMIPv6 based on the new EBB handover scheme has not yet been proposed. In this paper, we propose an improved cross-layering design for FMIPv6 based on the IEEE 802.16m EBB handover. In comparison with the conventional FMIPv6 based on the legacy IEEE 802.16 network, the overall handover interruption time can be significantly reduced by employing the proposed design. Benefits of this improvement on latency reduction for mobile user applications are thoroughly investigated with both numerical analysis and simulation on various IP applications.
Wang, Ling; Li, Guangyu; Yao, Zhi Q; Moorman, Jonathan P; Ning, Shunbin
2015-09-01
MicroRNAs (miRNAs) function as key regulators in immune responses and cancer development. In the contexts of infection with oncogenic viruses, miRNAs are engaged in viral persistence, latency establishment and maintenance, and oncogenesis. In this review, we summarize the potential roles and mechanisms of viral and cellular miRNAs in the host-pathogen interactions during infection with selected tumor viruses and HIV, which include (i) repressing viral replication and facilitating latency establishment by targeting viral transcripts, (ii) evading innate and adaptive immune responses via toll-like receptors, RIG-I-like receptors, T-cell receptor, and B-cell receptor pathways by targeting signaling molecules such as TRAF6, IRAK1, IKKε, and MyD88, as well as downstream targets including regulatory cytokines such as tumor necrosis factor α, interferon γ, interleukin 10, and transforming growth factor β, (iii) antagonizing intrinsic and extrinsic apoptosis pathways by targeting pro-apoptotic or anti-apoptotic gene transcripts such as the Bcl-2 family and caspase-3, (iv) modulating cell proliferation and survival through regulation of the Wnt, PI3K/Akt, Erk/MAPK, and Jak/STAT signaling pathways, as well as the signaling pathways triggered by viral oncoproteins such as Epstein-Barr Virus LMP1, by targeting Wnt-inhibiting factor 1, SHIP, pTEN, and SOCSs, and (v) regulating cell cycle progression by targeting cell cycle inhibitors such as p21/WAF1 and p27/KIP1. Further elucidation of the interaction between miRNAs and these key biological events will facilitate our understanding of the pathogenesis of viral latency and oncogenesis and may lead to the identification of miRNAs as novel targets for developing new therapeutic or preventive interventions. Copyright © 2015 John Wiley & Sons, Ltd.
Babjack, Destiny L; Cernicky, Brandon; Sobotka, Andrew J; Basler, Lee; Struthers, Devon; Kisic, Richard; Barone, Kimberly; Zuccolotto, Anthony P
2015-09-01
Using differing computer platforms and audio output devices to deliver audio stimuli often introduces (1) substantial variability across labs and (2) variable time between the intended and actual sound delivery (the sound onset latency). Fast, accurate audio onset latencies are particularly important when audio stimuli need to be delivered precisely as part of studies that depend on accurate timing (e.g., electroencephalographic, event-related potential, or multimodal studies), or in multisite studies in which standardization and strict control over the computer platforms used is not feasible. This research describes the variability introduced by using differing configurations and introduces a novel approach to minimizing audio sound latency and variability. A stimulus presentation and latency assessment approach is presented using E-Prime and Chronos (a new multifunction, USB-based data presentation and collection device). The present approach reliably delivers audio stimuli with low latencies that vary by ≤1 ms, independent of hardware and Windows operating system (OS)/driver combinations. The Chronos audio subsystem adopts a buffering, aborting, querying, and remixing approach to the delivery of audio, to achieve a consistent 1-ms sound onset latency for single-sound delivery, and precise delivery of multiple sounds that achieves standard deviations of 1/10th of a millisecond without the use of advanced scripting. Chronos's sound onset latencies are small, reliable, and consistent across systems. Testing of standard audio delivery devices and configurations highlights the need for careful attention to consistency between labs, experiments, and multiple study sites in their hardware choices, OS selections, and adoption of audio delivery systems designed to sidestep the audio latency variability issue.
White, Cory H.; Moesker, Bastiaan; Beliakova-Bethell, Nadejda; Martins, Laura J.; Richman, Douglas D.; Planelles, Vicente; Woelk, Christopher H.
2016-01-01
The search for an HIV-1 cure has been greatly hindered by the presence of a viral reservoir that persists despite antiretroviral therapy (ART). Studies of HIV-1 latency in vivo are also complicated by the low proportion of latently infected cells in HIV-1 infected individuals. A number of models of HIV-1 latency have been developed to examine the signaling pathways and viral determinants of latency and reactivation. A primary cell model of HIV-1 latency, which incorporates the generation of primary central memory CD4 T cells (TCM), full-length virus infection (HIVNL4-3) and ART to suppress virus replication, was used to investigate the establishment of HIV latency using RNA-Seq. Initially, an investigation of host and viral gene expression in the resting and activated states of this model indicated that the resting condition was reflective of a latent state. Then, a comparison of the host transcriptome between the uninfected and latently infected conditions of this model identified 826 differentially expressed genes, many of which were related to p53 signaling. Inhibition of the transcriptional activity of p53 by pifithrin-α during HIV-1 infection reduced the ability of HIV-1 to be reactivated from its latent state by an unknown mechanism. In conclusion, this model may be used to screen latency reversing agents utilized in shock and kill approaches to cure HIV, to search for cellular markers of latency, and to understand the mechanisms by which HIV-1 establishes latency. PMID:27898737
White, Cory H; Moesker, Bastiaan; Beliakova-Bethell, Nadejda; Martins, Laura J; Spina, Celsa A; Margolis, David M; Richman, Douglas D; Planelles, Vicente; Bosque, Alberto; Woelk, Christopher H
2016-11-01
The search for an HIV-1 cure has been greatly hindered by the presence of a viral reservoir that persists despite antiretroviral therapy (ART). Studies of HIV-1 latency in vivo are also complicated by the low proportion of latently infected cells in HIV-1 infected individuals. A number of models of HIV-1 latency have been developed to examine the signaling pathways and viral determinants of latency and reactivation. A primary cell model of HIV-1 latency, which incorporates the generation of primary central memory CD4 T cells (TCM), full-length virus infection (HIVNL4-3) and ART to suppress virus replication, was used to investigate the establishment of HIV latency using RNA-Seq. Initially, an investigation of host and viral gene expression in the resting and activated states of this model indicated that the resting condition was reflective of a latent state. Then, a comparison of the host transcriptome between the uninfected and latently infected conditions of this model identified 826 differentially expressed genes, many of which were related to p53 signaling. Inhibition of the transcriptional activity of p53 by pifithrin-α during HIV-1 infection reduced the ability of HIV-1 to be reactivated from its latent state by an unknown mechanism. In conclusion, this model may be used to screen latency reversing agents utilized in shock and kill approaches to cure HIV, to search for cellular markers of latency, and to understand the mechanisms by which HIV-1 establishes latency.
What triggers catch-up saccades during visual tracking?
de Brouwer, Sophie; Yuksel, Demet; Blohm, Gunnar; Missal, Marcus; Lefèvre, Philippe
2002-03-01
When tracking moving visual stimuli, primates orient their visual axis by combining two kinds of eye movements, smooth pursuit and saccades, that have very different dynamics. Yet, the mechanisms that govern the decision to switch from one type of eye movement to the other are still poorly understood, even though they could bring a significant contribution to the understanding of how the CNS combines different kinds of control strategies to achieve a common motor and sensory goal. In this study, we investigated the oculomotor responses to a large range of different combinations of position error and velocity error during visual tracking of moving stimuli in humans. We found that the oculomotor system uses a prediction of the time at which the eye trajectory will cross the target, defined as the "eye crossing time" (T(XE)). The eye crossing time, which depends on both position error and velocity error, is the criterion used to switch between smooth and saccadic pursuit, i.e., to trigger catch-up saccades. On average, for T(XE) between 40 and 180 ms, no saccade is triggered and target tracking remains purely smooth. Conversely, when T(XE) becomes smaller than 40 ms or larger than 180 ms, a saccade is triggered after a short latency (around 125 ms).
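The switching rule described above can be sketched in a few lines of Python. Only the 40 ms and 180 ms bounds come from the abstract; the formula T(XE) = -PE/RS (position error divided by retinal slip), the sign convention, and the function names are assumptions made for illustration, and the degenerate case of perfect tracking (zero error and zero slip) is not treated.

```python
import math

SMOOTH_MIN_S = 0.040  # lower bound of the smooth-tracking zone (from the abstract)
SMOOTH_MAX_S = 0.180  # upper bound of the smooth-tracking zone (from the abstract)


def eye_crossing_time(position_error_deg, retinal_slip_deg_s):
    """Assumed definition of T(XE): time until the eye crosses the target.

    position_error = target position - eye position (deg)
    retinal_slip   = target velocity - eye velocity (deg/s)
    Returns math.inf when the slip is zero, since the error never closes.
    """
    if retinal_slip_deg_s == 0:
        return math.inf
    return -position_error_deg / retinal_slip_deg_s


def saccade_triggered(position_error_deg, retinal_slip_deg_s):
    """True when T(XE) falls outside the 40-180 ms smooth zone."""
    t_xe = eye_crossing_time(position_error_deg, retinal_slip_deg_s)
    return not (SMOOTH_MIN_S <= t_xe <= SMOOTH_MAX_S)
```

For example, a 2 deg error closing at 20 deg/s gives a crossing time of 100 ms, which lies inside the smooth zone, so `saccade_triggered(2.0, -20.0)` is false under these assumptions.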
Sasaki, S
1999-10-01
Functional connections of single reticulospinal neurons (RSNs) in the nucleus reticularis gigantocellularis (NRG) with ipsilateral dorsal neck motoneurons were examined with the spike-triggered averaging technique. Extracellular spikes of single NRG-RSNs activated antidromically from the C6, but not from the L1 segment (C-RSNs) were used as the trigger. These neurons were monosynaptically activated from the superior colliculus and the cerebral peduncle. Single-RSN PSPs were recorded in 43 dorsal neck motoneurons [biventer cervicis and complexus (BCC) and splenius (SPL)] for 21 NRG-RSNs and 135 motoneurons tested. All synaptic potentials were EPSPs, and most of their latencies, measured from the triggering spikes, were 0.8-1.5 ms, which is in a monosynaptic range. The amplitudes of single-RSN EPSPs were 10-360 microV. Spike-triggered averaging revealed single-RSN EPSPs in multiple motoneurons of the same species (SPL or BCC), their locations extending up to nearly 1 mm rostrocaudally. Synaptic connections of single RSNs with both SPL and BCC motoneurons were also found with some predominance for one of them. The results provide direct evidence that NRG-RSNs make monosynaptic excitatory connections with SPL and BCC motoneurons. It appears that some NRG-RSNs connect predominantly with SPL motoneurons and others with BCC motoneurons.
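As a rough illustration of the spike-triggered averaging technique named above, the following Python sketch averages fixed-length sweeps of a recorded signal aligned on each trigger spike, then reads off an onset latency. The function names, the sample-index representation of spikes, and the simple threshold criterion for onset are all assumptions; the abstract does not describe how latencies were measured.

```python
def spike_triggered_average(signal, spike_indices, window):
    """Average `window` samples of `signal` starting at each trigger spike.

    Spikes too close to the end of the record to fill a whole window are
    skipped. Returns a list of `window` values, or [] if no sweep fits.
    """
    sweeps = [signal[i:i + window] for i in spike_indices
              if i >= 0 and i + window <= len(signal)]
    if not sweeps:
        return []
    # zip(*sweeps) groups the k-th sample of every sweep together.
    return [sum(column) / len(sweeps) for column in zip(*sweeps)]


def onset_latency_ms(average, sample_interval_ms, threshold):
    """Time from the trigger to the first averaged sample at or above threshold.

    The threshold criterion is a hypothetical choice; returns None if the
    average never reaches it.
    """
    for k, value in enumerate(average):
        if value >= threshold:
            return k * sample_interval_ms
    return None
```

Activity that is not time-locked to the trigger tends to cancel across sweeps, while a potential that follows every spike at a fixed delay survives the averaging.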
NASA Astrophysics Data System (ADS)
Cochran, E. S.; Lawrence, J. F.; Christensen, C. M.; Chung, A. I.; Neighbors, C.; Saltzman, J.
2010-12-01
The Quake-Catcher Network (QCN) involves the community in strong motion data collection by utilizing volunteer computing techniques and low-cost MEMS accelerometers. Volunteer computing provides a mechanism to expand strong-motion seismology with minimal infrastructure costs, while promoting community participation in science. Micro-Electro-Mechanical Systems (MEMS) triaxial accelerometers can be attached to a desktop computer via USB and are internal to many laptops. Preliminary shake table tests show the MEMS accelerometers can record high-quality seismic data with instrument response similar to research-grade strong-motion sensors. QCN began distributing sensors and software to K-12 schools and the general public in April 2008 and has grown to roughly 1500 stations worldwide. We also recently tested whether sensors could be quickly deployed as part of a Rapid Aftershock Mobilization Program (RAMP) following the 2010 M8.8 Maule, Chile earthquake. Volunteers are recruited through media reports, web-based sensor request forms, as well as social networking sites. Using data collected to date, we examine whether a distributed sensing network can provide valuable seismic data for earthquake detection and characterization while promoting community participation in earthquake science. We utilize client-side triggering algorithms to determine when significant ground shaking occurs and this metadata is sent to the main QCN server. On average, trigger metadata are received within 1-10 seconds from the observation of a trigger; the larger data latencies are correlated with greater server-station distances. When triggers are detected, we determine if the triggers correlate to others in the network using spatial and temporal clustering of incoming trigger information. If a minimum number of triggers are detected then a QCN-event is declared and an initial earthquake location and magnitude is estimated. 
Initial analysis suggests that the estimated locations and magnitudes are similar to those reported in regional and global catalogs. As the network expands, it will become increasingly important to provide volunteers access to the data they collect, both to encourage continued participation in the network and to improve community engagement in scientific discourse related to seismic hazard. In the future, we hope to provide access to both images and raw data from seismograms in formats accessible to the general public through existing seismic data archives (e.g. IRIS, SCSN) and/or through the QCN project website. While encouraging community participation in seismic data collection, we can extend the capabilities of existing seismic networks to rapidly detect and characterize strong motion events. In addition, the dense waveform observations may provide high-resolution ground shaking information to improve source imaging and seismic risk assessment.
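The trigger association step described above (spatial and temporal clustering, followed by a minimum-count test) might look roughly like the Python sketch below. The time window, radius, and minimum trigger count are hypothetical defaults, not values used by QCN, and the tuple layout and function names are chosen only for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def find_event(triggers, window_s=10.0, radius_km=100.0, min_triggers=4):
    """Return the first cluster of correlated triggers, or None.

    triggers: list of (time_s, lat_deg, lon_deg) tuples.
    A cluster is every trigger within `window_s` after, and `radius_km`
    of, some seed trigger. All three thresholds are assumed values.
    """
    ordered = sorted(triggers)
    for t0, lat0, lon0 in ordered:
        cluster = [(t, lat, lon) for t, lat, lon in ordered
                   if 0.0 <= t - t0 <= window_s
                   and haversine_km(lat0, lon0, lat, lon) <= radius_km]
        if len(cluster) >= min_triggers:
            return cluster
    return None
```

A returned cluster would then feed the location and magnitude estimate; an isolated trigger, such as a bumped desk, never reaches the minimum count and is ignored.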
NASA Astrophysics Data System (ADS)
Handhika, T.; Bustamam, A.; Ernastuti; Kerami, D.
2017-07-01
Multi-thread programming using OpenMP on a shared-memory architecture with hyperthreading technology allows a resource to be accessed by multiple processors simultaneously. Each processor can execute more than one thread for a certain period of time. However, the speedup depends on the ability of the processor to execute a limited number of threads, especially for a sequential algorithm that contains a nested loop in which the number of outer loop iterations is greater than the maximum number of threads that can be executed by a processor. The thread distribution techniques found previously can only be applied by high-level programmers. This paper develops a parallelization procedure for low-level programmers dealing with 2-level nested loop problems in which the maximum number of threads that can be executed by a processor is smaller than the number of outer loop iterations. Data preprocessing, which relates to the numbers of outer and inner loop iterations, the computational time required to execute each iteration, and the maximum number of threads that can be executed by a processor, is used as a strategy to determine which parallel region will produce the optimal speedup.
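To make the trade-off concrete, here is a minimal Python sketch that estimates the wall time of a 2-level nested loop when either the outer or the inner loop is parallelised, and picks the better region. This is an idealised cost model (uniform iteration time, no scheduling overhead) assumed for illustration; it is not the procedure proposed in the paper, and all names are hypothetical.

```python
import math


def estimated_time(n_outer, n_inner, t_iter, max_threads, parallel_loop):
    """Idealised wall time when one loop level is split across threads.

    Each thread handles whole iterations, so the parallelised loop needs
    ceil(iterations / max_threads) rounds.
    """
    if parallel_loop == "outer":
        return math.ceil(n_outer / max_threads) * n_inner * t_iter
    if parallel_loop == "inner":
        return n_outer * math.ceil(n_inner / max_threads) * t_iter
    raise ValueError("parallel_loop must be 'outer' or 'inner'")


def choose_parallel_region(n_outer, n_inner, t_iter, max_threads):
    """Return (loop, speedup) for the loop level with the lower estimate.

    Ties go to the outer loop, which would usually mean fewer thread
    start-ups in practice (an assumption of this sketch).
    """
    serial = n_outer * n_inner * t_iter
    best = min(("outer", "inner"),
               key=lambda loop: estimated_time(n_outer, n_inner, t_iter,
                                               max_threads, loop))
    return best, serial / estimated_time(n_outer, n_inner, t_iter,
                                         max_threads, best)
```

With 10 outer iterations, 8 inner iterations and 8 threads, parallelising the outer loop needs two rounds of 8 inner iterations (16 time units), whereas parallelising the inner loop needs 10 rounds of one (10 time units), so the inner region wins in this model.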
A Low-Power ASIC Signal Processor for a Vestibular Prosthesis.
Töreyin, Hakan; Bhatti, Pamela T
2016-06-01
A low-power ASIC signal processor for a vestibular prosthesis (VP) is reported. Fabricated with TI 0.35 μm CMOS technology and designed to interface with implanted inertial sensors, the digitally assisted analog signal processor operates extensively in the CMOS subthreshold region. During its operation the ASIC encodes head motion signals captured by the inertial sensors as electrical pulses ultimately targeted for in-vivo stimulation of vestibular nerve fibers. To achieve this, the ASIC implements a coordinate system transformation to correct for misalignment between natural sensors and implanted inertial sensors. It also mimics the frequency response characteristics and frequency encoding mappings of angular and linear head motions observed at the peripheral sense organs, semicircular canals and otolith. Overall the design occupies an area of 6.22 mm² and consumes 1.24 mW when supplied with ± 1.6 V.
A Low-Power ASIC Signal Processor for a Vestibular Prosthesis
Töreyin, Hakan; Bhatti, Pamela T.
2017-01-01
A low-power ASIC signal processor for a vestibular prosthesis (VP) is reported. Fabricated with TI 0.35 μm CMOS technology and designed to interface with implanted inertial sensors, the digitally assisted analog signal processor operates extensively in the CMOS subthreshold region. During its operation the ASIC encodes head motion signals captured by the inertial sensors as electrical pulses ultimately targeted for in-vivo stimulation of vestibular nerve fibers. To achieve this, the ASIC implements a coordinate system transformation to correct for misalignment between natural sensors and implanted inertial sensors. It also mimics the frequency response characteristics and frequency encoding mappings of angular and linear head motions observed at the peripheral sense organs, semicircular canals and otolith. Overall the design occupies an area of 6.22 mm² and consumes 1.24 mW when supplied with ± 1.6 V. PMID:26800546
Color sensor and neural processor on one chip
NASA Astrophysics Data System (ADS)
Fiesler, Emile; Campbell, Shannon R.; Kempem, Lother; Duong, Tuan A.
1998-10-01
Low-cost, compact, and robust color sensors that can operate in real time under various environmental conditions can benefit many applications, including quality control, chemical sensing, food production, medical diagnostics, energy conservation, monitoring of hazardous waste, and recycling. Unfortunately, existing color sensors are either bulky and expensive or do not provide the required speed and accuracy. In this publication we describe the design of an accurate real-time color classification sensor, together with preprocessing and a subsequent neural network processor integrated on a single complementary metal oxide semiconductor (CMOS) integrated circuit. This one-chip sensor and information processor will be low in cost, robust, and mass-producible using standard commercial CMOS processes. The performance of the chip and the feasibility of its manufacturing are demonstrated through computer simulations based on CMOS hardware parameters. Comparisons with competing methodologies show a significantly higher performance for our device.
NASA Technical Reports Server (NTRS)
Felippa, Carlos A.
1989-01-01
This is the fifth of a set of five volumes which describe the software architecture for the Computational Structural Mechanics Testbed. Derived from NICE, an integrated software system developed at Lockheed Palo Alto Research Laboratory, the architecture is composed of the command language (CLAMP), the command language interpreter (CLIP), and the data manager (GAL). Volumes 1, 2, and 3 (NASA CR's 178384, 178385, and 178386, respectively) describe CLAMP and CLIP and the CLIP-processor interface. Volumes 4 and 5 (NASA CR's 178387 and 178388, respectively) describe GAL and its low-level I/O. CLAMP, an acronym for Command Language for Applied Mechanics Processors, is designed to control the flow of execution of processors written for NICE. Volume 5 describes the low-level data management component of the NICE software. It is intended only for advanced programmers involved in maintenance of the software.
Evaluation of MERIS products from Baltic Sea coastal waters rich in CDOM
NASA Astrophysics Data System (ADS)
Beltrán-Abaunza, J. M.; Kratzer, S.; Brockmann, C.
2013-11-01
In this study, retrievals of the medium resolution imaging spectrometer (MERIS) reflectances and water quality products using four different freely available coastal processing algorithms are assessed by comparison against sea-truthing data. The study is based on a pair-wise comparison using processor-dependent quality flags for the retrieval of valid common macro-pixels. This assessment is required in order to ensure the reliability of monitoring systems based on MERIS data, such as the Swedish coastal and lake monitoring system (http://vattenkvalitet.se). The results show that the pre-processing with the Improved Contrast between Ocean and Land (ICOL) processor, correcting for adjacency effects, improves the retrieval of spectral reflectance for all processors. Therefore, it is recommended that the ICOL processor should be applied when Baltic coastal waters are investigated. Chlorophyll was retrieved best using the FUB (Free University of Berlin) processing algorithm, although overestimations in the range 18-26.5%, dependent on the compared pairs, were obtained. At low chlorophyll concentrations (< 2.5 mg m-3), random errors dominated in the retrievals with the MEGS (MERIS ground segment processor) processor. The lowest bias and random errors were obtained with MEGS for suspended particulate matter, for which overestimations in the range of 8-16% were found. Only the FUB retrieved CDOM (Coloured Dissolved Organic Matter) correlate with in situ values. However, a large systematic underestimation appears in the estimates that nevertheless may be corrected for by using a local correction factor. The MEGS has the potential to be used as an operational processing algorithm for the Himmerfjärden bay and adjacent areas, but it requires further improvement of the atmospheric correction for the blue bands and better definition at relatively low chlorophyll concentrations in the presence of high CDOM attenuation.
Evaluation of MERIS products from Baltic Sea coastal waters rich in CDOM
NASA Astrophysics Data System (ADS)
Beltrán-Abaunza, J. M.; Kratzer, S.; Brockmann, C.
2014-05-01
In this study, retrievals of the medium resolution imaging spectrometer (MERIS) reflectances and water quality products using four different coastal processing algorithms freely available are assessed by comparison against sea-truthing data. The study is based on a pair-wise comparison using processor-dependent quality flags for the retrieval of valid common macro-pixels. This assessment is required in order to ensure the reliability of monitoring systems based on MERIS data, such as the Swedish coastal and lake monitoring system (http://vattenkvalitet.se). The results show that the pre-processing with the Improved Contrast between Ocean and Land (ICOL) processor, correcting for adjacency effects, improves the retrieval of spectral reflectance for all processors. Therefore, it is recommended that the ICOL processor should be applied when Baltic coastal waters are investigated. Chlorophyll was retrieved best using the FUB (Free University of Berlin) processing algorithm, although overestimations in the range 18-26.5%, dependent on the compared pairs, were obtained. At low chlorophyll concentrations (< 2.5 mg m-3), data dispersion dominated in the retrievals with the MEGS (MERIS ground segment processor) processor. The lowest bias and data dispersion were obtained with MEGS for suspended particulate matter, for which overestimations in the range of 8-16% were found. Only the FUB retrieved CDOM (coloured dissolved organic matter) correlate with in situ values. However, a large systematic underestimation appears in the estimates that nevertheless may be corrected for by using a local correction factor. The MEGS has the potential to be used as an operational processing algorithm for the Himmerfjärden bay and adjacent areas, but it requires further improvement of the atmospheric correction for the blue bands and better definition at relatively low chlorophyll concentrations in the presence of high CDOM attenuation.
The AMchip04 and the processing unit prototype for the FastTracker
NASA Astrophysics Data System (ADS)
Andreani, A.; Annovi, A.; Beretta, M.; Bogdan, M.; Citterio, M.; Alberti, F.; Giannetti, P.; Lanza, A.; Magalotti, D.; Piendibene, M.; Shochet, M.; Stabile, A.; Tang, J.; Tompkins, L.; Volpi, G.
2012-08-01
Modern experiments search for extremely rare processes hidden in much larger background levels. As the experiment's complexity, the accelerator backgrounds and luminosity increase we need increasingly complex and exclusive event selection. We present the first prototype of a new Processing Unit (PU), the core of the FastTracker processor (FTK). FTK is a real time tracking device for the ATLAS experiment's trigger upgrade. The computing power of the PU is such that a few hundred of them will be able to reconstruct all the tracks with transverse momentum above 1 GeV/c in ATLAS events up to Phase II instantaneous luminosities (3 × 10³⁴ cm⁻² s⁻¹) with an event input rate of 100 kHz and a latency below a hundred microseconds. The PU provides massive computing power to minimize the online execution time of complex tracking algorithms. The time consuming pattern recognition problem, generally referred to as the "combinatorial challenge", is solved by the Associative Memory (AM) technology exploiting parallelism to the maximum extent; it compares the event to all pre-calculated "expectations" or "patterns" (pattern matching) simultaneously, looking for candidate tracks called "roads". This approach reduces to a linear behavior the typical exponential complexity of the CPU based algorithms. Pattern recognition is completed by the time data are loaded into the AM devices. We report on the design of the first Processing Unit prototypes. The design had to address the most challenging aspects of this technology: a huge number of detector clusters ("hits") must be distributed at high rate with very large fan-out to all patterns (10 million patterns will be located on 128 chips placed on a single board) and a huge number of roads must be collected and sent back to the FTK post-pattern-recognition functions. A network of high speed serial links is used to solve the data distribution problem.
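The pattern-matching idea behind the Associative Memory can be emulated in software as a rough Python sketch. In the chip every stored pattern compares itself to each incoming hit at the same time; here a loop over the bank stands in for that parallelism. The data layout (one coarse hit identifier per detector layer), the `min_layers` option, and the function name are assumptions for illustration and do not describe the AMchip04 interface.

```python
def find_roads(pattern_bank, hits, min_layers=None):
    """Return indices of patterns ("roads") matched by the event's hits.

    pattern_bank: list of tuples, one coarse hit id per detector layer.
    hits: iterable of (layer, hit_id) pairs, in arrival order.
    min_layers: layers that must match; defaults to all of them.
    """
    if not pattern_bank:
        return []
    n_layers = len(pattern_bank[0])
    if min_layers is None:
        min_layers = n_layers
    # One set of matched layers per pattern, like per-layer match flags.
    matched = [set() for _ in pattern_bank]
    for layer, hit_id in hits:
        # In hardware all patterns see the hit simultaneously.
        for index, pattern in enumerate(pattern_bank):
            if pattern[layer] == hit_id:
                matched[index].add(layer)
    return [index for index, layers in enumerate(matched)
            if len(layers) >= min_layers]
```

Because the match flags are updated as each hit arrives, the list of roads is known as soon as the last hit has been loaded, which mirrors the abstract's remark that pattern recognition is complete by the time the data are in the AM devices.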
NASA Technical Reports Server (NTRS)
Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.
2013-01-01
In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
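The workflow described here (fit a classifier on the ground, then evaluate it in flight) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' trigger: the two normalised measurements, the noise levels, the 0.5 decision threshold and the function names are all assumptions.

```python
import numpy as np


def fit_logistic(features: np.ndarray, labels: np.ndarray,
                 learning_rate: float = 0.1, steps: int = 2000) -> np.ndarray:
    """Fit logistic-regression weights (bias first) by batch gradient descent."""
    design = np.hstack([np.ones((features.shape[0], 1)), features])
    weights = np.zeros(design.shape[1])
    for _ in range(steps):
        probabilities = 1.0 / (1.0 + np.exp(-design @ weights))
        gradient = design.T @ (probabilities - labels) / len(labels)
        weights -= learning_rate * gradient
    return weights


def trigger(weights: np.ndarray, measurement: np.ndarray, threshold: float = 0.5) -> bool:
    """Evaluate the fitted classifier on one measurement vector."""
    score = weights[0] + measurement @ weights[1:]
    return bool(1.0 / (1.0 + np.exp(-score)) >= threshold)


# Synthetic, made-up training set: two noisy measurements of a hidden state
# and a label saying whether the deployment condition was truly met.
rng = np.random.default_rng(0)
truth = rng.uniform(-1.0, 1.0, size=500)
noisy = np.column_stack([truth + rng.normal(0, 0.3, 500),
                         truth + rng.normal(0, 0.3, 500)])
deploy = (truth > 0.0).astype(float)

WEIGHTS = fit_logistic(noisy, deploy)  # ground processing, pre-flight
ACCURACY = np.mean([trigger(WEIGHTS, m) == bool(d) for m, d in zip(noisy, deploy)])
```

Only the small weight vector would need to be carried on board in such a scheme; the in-flight evaluation is a dot product and one exponential.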
NASA Technical Reports Server (NTRS)
Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.
2013-01-01
In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter. In order to increase overall robustness, the vehicle also has an alternate method of triggering the drogue parachute deployment based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this velocity-based trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers excellent performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
Fast, externally triggered, digital phase controller for an optical lattice
NASA Astrophysics Data System (ADS)
Sadgrove, Mark; Nakagawa, Ken'ichi
2011-11-01
We present a method to control the phase of an optical lattice according to an external trigger signal. The method has a latency of less than 30 μs. Two phase locked digital synthesizers provide the driving signal for two acousto-optic modulators which control the frequency and phase of the counter-propagating beams which form a standing wave (optical lattice). A micro-controller with an external interrupt function is connected to the desired external signal, and updates the phase register of one of the synthesizers when the external signal changes. The standing wave (period λ/2 = 390 nm) can be moved by units of 49 nm with a mean jitter of 28 nm. The phase change is well known due to the digital nature of the synthesizer, and does not need calibration. The uses of the scheme include coherent control of atomic matter-wave dynamics.
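A short consistency check of the quoted step size, assuming (the abstract does not state this) that the 49 nm unit corresponds to a relative phase increment of π/4 between the two counter-propagating beams, with λ = 780 nm taken from the quoted period λ/2 = 390 nm:

```latex
\begin{align}
I(x) &\propto \cos^2\!\left(kx + \tfrac{\Delta\phi}{2}\right), \qquad k = \frac{2\pi}{\lambda},\\
|\Delta x| &= \frac{\Delta\phi}{2k} = \frac{\lambda}{4\pi}\,\Delta\phi,\\
\Delta\phi = \frac{\pi}{4} \;&\Rightarrow\; |\Delta x| = \frac{\lambda}{16} = \frac{780\ \text{nm}}{16} \approx 48.8\ \text{nm}.
\end{align}
```

Under these assumptions a full 2π phase change moves the lattice by one period (390 nm) in eight steps, which is consistent with the quoted 49 nm unit; the actual phase resolution used by the authors is not given here.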
Mid-latency evoked potentials in self-reported impulsive aggression.
Houston, R J; Stanford, M S
2001-02-01
The present study was conducted to examine psychophysiological differences in arousability among individuals who display impulsive aggressive outbursts. Amplitude and latency for the mid-latency evoked potentials (P1, N1 and P2) were obtained at scalp electrode sites. The evoking stimuli were three intensities (low, medium, high) of photic stimulation. Compared to non-aggressive controls, impulsive aggressive subjects showed significantly reduced P1 amplitude, which is indicative of an inefficient sensory gating mechanism. In addition, these subjects exhibited significantly larger N1 amplitude implying an enhanced orienting of attention to stimuli. Impulsive aggressive subjects also exhibited shorter P1, N1 and P2 peak latency. These results suggest that impulsive aggressive individuals may display quicker orienting and processing of stimuli in an attempt to compensate for low resting arousal levels. Finally, impulsive aggressive subjects augmented the P1-N1 component more frequently than controls, which is consistent with previous studies examining impulsivity and sensation seeking. Together, these findings extend previous work concerning the underlying physiology of impulsive aggression. It has been suggested that impulsive aggressive individuals may attempt to compensate for low resting arousal levels by engaging in stimulus seeking behaviors. Accordingly, the present findings imply similar physiological compensatory responses as demonstrated by heightened orienting of attention, processing and arousability. In addition, a compromised sensory gating system in impulsive aggressors may exacerbate such circumstances, and lead to later cognitive processing deficits.
Dragas, Jelena; Jäckel, David; Hierlemann, Andreas; Franke, Felix
2017-01-01
Reliable real-time low-latency spike sorting with large data throughput is essential for studies of neural network dynamics and for brain-machine interfaces (BMIs), in which the stimulation of neural networks is based on the networks' most recent activity. However, the majority of existing multi-electrode spike-sorting algorithms are unsuited for processing high quantities of simultaneously recorded data. Recording from large neuronal networks using large high-density electrode sets (thousands of electrodes) imposes high demands on the data-processing hardware regarding computational complexity and data transmission bandwidth; this, in turn, entails demanding requirements in terms of chip area, memory resources and processing latency. This paper presents computational complexity optimization techniques, which facilitate the use of spike-sorting algorithms in large multi-electrode-based recording systems. The techniques are then applied to a previously published algorithm, on its own, unsuited for large electrode set recordings. Further, a real-time low-latency high-performance VLSI hardware architecture of the modified algorithm is presented, featuring a folded structure capable of processing the activity of hundreds of neurons simultaneously. The hardware is reconfigurable “on-the-fly” and adaptable to the nonstationarities of neuronal recordings. By transmitting exclusively spike time stamps and/or spike waveforms, its real-time processing offers the possibility of data bandwidth and data storage reduction. PMID:25415989
Dragas, Jelena; Jackel, David; Hierlemann, Andreas; Franke, Felix
2015-03-01
Reliable real-time low-latency spike sorting with large data throughput is essential for studies of neural network dynamics and for brain-machine interfaces (BMIs), in which the stimulation of neural networks is based on the networks' most recent activity. However, the majority of existing multi-electrode spike-sorting algorithms are unsuited for processing high quantities of simultaneously recorded data. Recording from large neuronal networks using large high-density electrode sets (thousands of electrodes) imposes high demands on the data-processing hardware regarding computational complexity and data transmission bandwidth; this, in turn, entails demanding requirements in terms of chip area, memory resources and processing latency. This paper presents computational complexity optimization techniques, which facilitate the use of spike-sorting algorithms in large multi-electrode-based recording systems. The techniques are then applied to a previously published algorithm, on its own, unsuited for large electrode set recordings. Further, a real-time low-latency high-performance VLSI hardware architecture of the modified algorithm is presented, featuring a folded structure capable of processing the activity of hundreds of neurons simultaneously. The hardware is reconfigurable “on-the-fly” and adaptable to the nonstationarities of neuronal recordings. By transmitting exclusively spike time stamps and/or spike waveforms, its real-time processing offers the possibility of data bandwidth and data storage reduction.
Begum, Tahamina; Reza, Faruque; Ahmed, Izmer; Abdullah, Jafri Malin
2014-03-01
Simple geometric and organic shapes and their arrangements are used in different neuropsychological tests for the assessment of cognitive function and special memory, and also for therapy in different patient groups. Until now there has been no electrophysiological evidence on cognitive function for simple geometric and organic shapes and their arrangements. The main objective of this study is therefore to characterize the cortical processing, amplitude and latency of the visually induced N170 and P300 event-related potential (ERP) components for different geometric and organic shapes and their arrangements, and the influence of education on them, which is worthwhile to know for the early and better treatment of those patient groups. While education has been shown to influence cognitive function in the auditory oddball task, little is known about its influence on cognitive function induced by a visual attention task involving the choice of geometric and organic shapes and their arrangements. Using a 128-electrode sensor net, we studied the responses to the choice of different geometric and organic shapes presented randomly in experiment 1, and of their arrangements in experiment 2, in high, medium and low education groups. In both experiments, subjects pushed button "1" or "2" to indicate like or dislike, respectively. A total of 45 healthy subjects (15 in each group) were recruited. ERPs were measured from 11 electrode sites and analyzed for the evoked N170/N240 and P300 ERP components. There were no differences between like and dislike in amplitude or latency for any stimulus in either experiment. We therefore considered only the stimulus type (geometric or organic shape), not like and dislike. For these stimulus types, the N170 ERP component was found instead of N240 at occipito-temporal (T5, T6, O1 and O2) locations, with the highest amplitude at O2, and P300 was distributed over the central (Cz and Pz) locations in both experiments in all groups.
In experiment 1, a significantly lower amplitude and a non-significantly longer latency of the N170 component were found at the O1 location for both stimuli in the low education group compared with the medium education group, but in experiment 2 there was no significant difference among groups in amplitude or latency for either stimulus. In both experiments, the P300 component was found at the Cz and Pz locations, with higher amplitudes at Cz than at Pz. In experiment 1, the medium education group evoked a significantly higher P300 amplitude (geometric shape stimuli, P = 0.05; organic shape stimuli, P = 0.02) than the low education group at the Cz location, whereas in experiment 2 there was no significant difference in amplitude among groups across stimuli at the Cz and Pz locations. Latencies also showed no significant differences among groups in either experiment, although a longer latency was found in the low education group at the Cz location compared with the medium education group. We conclude that simple geometric shapes, organic shapes and their arrangements evoke the visual N170 component at temporo-occipital areas with right lateralization and the P300 ERP component at centro-parietal areas. The significantly lower amplitudes of the N170 and P300 ERP components and the longer latencies for the different shape stimuli in the low education group show that low education significantly influences visual cognitive function.
NASA Astrophysics Data System (ADS)
Krivda, M.; NA62 Collaboration
2013-08-01
The main aim of the NA62 experiment (NA62 Technical Design Report,
Performance Enhancement Strategies for Multi-Block Overset Grid CFD Applications
NASA Technical Reports Server (NTRS)
Djomehri, M. Jahed; Biswas, Rupak
2003-01-01
The overset grid methodology has significantly reduced time-to-solution of high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement strategies on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machine. Specifically, the roles of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Details of a sophisticated graph partitioning technique for grid grouping are also provided. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.
Optimizing CMS build infrastructure via Apache Mesos
NASA Astrophysics Data System (ADS)
Abdurachmanov, David; Degano, Alessandro; Elmer, Peter; Eulisse, Giulio; Mendez, David; Muzaffar, Shahzad
2015-12-01
The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC) at CERN consists of 6M lines of in-house code, developed over a decade by nearly 1000 physicists, as well as a comparable amount of general use open-source code. A critical ingredient to the success of the construction and early operation of the WLCG was the convergence, around the year 2000, on the use of a homogeneous environment of commodity x86-64 processors and Linux. Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of nodes. We present how we migrated our continuous integration system to schedule jobs on a relatively small Apache Mesos enabled cluster and how this resulted in better resource usage, higher peak performance and lower latency thanks to the dynamic scheduling capabilities of Mesos.
A Latency-Tolerant Partitioner for Distributed Computing on the Information Power Grid
NASA Technical Reports Server (NTRS)
Das, Sajal K.; Harvey, Daniel J.; Biswas, Rupak; Kwak, Dochan (Technical Monitor)
2001-01-01
NASA's Information Power Grid (IPG) is an infrastructure designed to harness the power of geographically distributed computers, databases, and human expertise, in order to solve large-scale realistic computational problems. This type of meta-computing environment is necessary to present a unified virtual machine to application developers that hides the intricacies of a highly heterogeneous environment and yet maintains adequate security. In this paper, we present a novel partitioning scheme, called MinEX, that dynamically balances processor workloads while minimizing data movement and runtime communication, for applications that are executed in a parallel distributed fashion on the IPG. We also analyze the conditions that are required for the IPG to be an effective tool for such distributed computations. Our results show that MinEX is a viable load balancer provided the nodes of the IPG are connected by a high-speed asynchronous interconnection network.
FPGA implementation of image dehazing algorithm for real time applications
NASA Astrophysics Data System (ADS)
Kumar, Rahul; Kaushik, Brajesh Kumar; Balasubramanian, R.
2017-09-01
Weather degradation such as haze, fog and mist severely reduces the effective range of visual surveillance. This degradation is a spatially varying phenomenon, which makes the problem non-trivial. Dehazing is an essential preprocessing stage in applications such as long-range imaging, border security and intelligent transportation systems. However, these applications require low latency from the preprocessing block. In this work, the single-image dark channel prior algorithm is modified and implemented for fast processing with comparable visual quality of the restored image/video. Although the conventional single-image dark channel prior algorithm is computationally expensive, it yields impressive results. Moreover, a two-stage image dehazing architecture is introduced, wherein the dark channel and airlight are estimated in the first stage, whereas the transmission map and intensity restoration are computed in the next stage. The algorithm is implemented using Xilinx Vivado software and validated using a Xilinx zc702 development board, which contains an Artix-7-equivalent Field Programmable Gate Array (FPGA) and an ARM Cortex-A9 dual-core processor. Additionally, a high definition multimedia interface (HDMI) has been incorporated for video feed and display purposes. The results show that the dehazing algorithm attains 29 frames per second at an image resolution of 1920x1080, which is suitable for real-time applications. The design utilizes 9 18K_BRAM, 97 DSP_48, 6508 FFs and 8159 LUTs.
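For readers unfamiliar with the dark channel prior, the two stages named in this abstract can be sketched in software as below. This is a simplified reference version of the conventional algorithm, not the modified FPGA design: the patch radius, `omega`, `t_min` and the single-pixel airlight estimate are illustrative assumptions.

```python
import numpy as np


def min_filter(image: np.ndarray, radius: int) -> np.ndarray:
    """Sliding-window minimum over a (2*radius+1)^2 patch, edge-padded."""
    padded = np.pad(image, radius, mode="edge")
    height, width = image.shape
    result = np.full(image.shape, np.inf)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            result = np.minimum(result, padded[dy:dy + height, dx:dx + width])
    return result


def dehaze(hazy: np.ndarray, radius: int = 1, omega: float = 0.95,
           t_min: float = 0.1) -> np.ndarray:
    """Dark channel prior dehazing for an RGB image with values in [0, 1]."""
    # Stage 1: dark channel, then airlight from the pixel where it is largest.
    dark = min_filter(hazy.min(axis=2), radius)
    brightest = np.unravel_index(np.argmax(dark), dark.shape)
    airlight = np.maximum(hazy[brightest], 1e-6)
    # Stage 2: transmission map, then intensity restoration.
    transmission = 1.0 - omega * min_filter((hazy / airlight).min(axis=2), radius)
    transmission = np.maximum(transmission, t_min)
    restored = (hazy - airlight) / transmission[..., None] + airlight
    return np.clip(restored, 0.0, 1.0)


# Made-up test image: a random scene seen through uniform haze.
rng = np.random.default_rng(1)
SCENE = rng.uniform(0.0, 1.0, size=(8, 8, 3))
HAZY = 0.6 * SCENE + 0.4 * 0.9
RESTORED = dehaze(HAZY)
```

The nested minimum filter is the costly part in software, which is one reason a streaming hardware implementation is attractive for real-time video.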
Kamiya, Kazunobu; Suzuki, Noboru
2016-12-01
Some aluminium complexes are excellent catalysts of cationic polymerisation and are used in low-temperature, fast-curing adhesives for electronic part mounting. Microencapsulation is a suitable technique for obtaining high latency of the catalysts and long shelf life of the adhesives. To achieve higher latency in a cycloaliphatic epoxy compound, the microcapsule surface, which retained a small amount of aluminium complex, was coated with epoxy polymer and the effect was examined. From the X-ray photoelectron spectroscopic results, the surface was recognised to be sufficiently coated, and the differential scanning calorimetric analyses showed that the coating did not significantly affect the low-temperature and fast-curing properties of the adhesive. After storing the mixture of cycloaliphatic epoxy compound, coated microcapsules, triphenylsilanol and silane coupling agent for 48 h at room temperature, the increase in viscosity was only 0.01 Pa s, indicating excellent shelf life.
Cvijetic, Neda; Tanaka, Akihiro; Kanonakis, Konstantinos; Wang, Ting
2014-08-25
We demonstrate the first SDN-controlled optical topology-reconfigurable mobile fronthaul (MFH) architecture for bidirectional coordinated multipoint (CoMP) and low latency inter-cell device-to-device (D2D) connectivity in the 5G mobile networking era. SDN-based OpenFlow control is used to dynamically instantiate the CoMP and inter-cell D2D features as match/action combinations in control plane flow tables of software-defined optical and electrical switching elements. Dynamic re-configurability is thereby introduced into the optical MFH topology, while maintaining back-compatibility with legacy fiber deployments. 10 Gb/s peak rates with <7 μs back-to-back transmission latency and 29.6 dB total power budget are experimentally demonstrated, confirming the attractiveness of the new approach for optical MFH of future 5G mobile systems.
Latency Determination and Compensation in Real-Time GNSS/INS Integrated Navigation Systems
NASA Astrophysics Data System (ADS)
Solomon, P. D.; Wang, J.; Rizos, C.
2011-09-01
Unmanned Aerial Vehicle (UAV) technology is now commonplace in many defence and civilian environments. However, the high cost of owning and operating a sophisticated UAV has slowed their adoption in many commercial markets. Universities and research groups are actively experimenting with UAVs to further develop the technology, particularly for automated flying operations. The two main UAV platforms used are fixed-wing and helicopter. Helicopter-based UAVs offer many attractive features over fixed-wing UAVs, including vertical take-off, the ability to loiter, and highly dynamic flight. However the control and navigation of helicopters are significantly more demanding than those of fixed-wing UAVs and as such require a high bandwidth real-time Position, Velocity, Attitude (PVA) navigation system. In practical Real-Time Navigation Systems (RTNS) there are delays in the processing of the GNSS data prior to the fusion of the GNSS data with the INS measurements. This latency must be compensated for otherwise it degrades the solution of the navigation filter. This paper investigates the effect of latency in the arrival time of the GNSS data in a RTNS. Several test drives and flights were conducted with a low-cost RTNS, and compared with a high quality GNSS/INS solution. A technique for the real-time, automated and accurate estimation of the GNSS latency in low-cost systems was developed and tested. The latency estimates were then verified through cross-correlation with the time-stamped measurements from the reference system. A delayed measurement Extended Kalman Filter was then used to allow for the real-time fusing of the delayed measurements, and then a final system developed for on-the-fly measurement and compensation of GNSS latency in a RTNS.
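The cross-correlation check mentioned in this abstract can be illustrated with a small sketch. It assumes two equally sampled, equal-length traces of the same quantity, one from a reference system and one arriving late; the 100 Hz rate, the 7-sample delay and the synthetic signal are made up, and this is not the authors' estimator.

```python
import numpy as np


def estimate_latency(reference: np.ndarray, delayed: np.ndarray,
                     sample_period: float) -> float:
    """Estimate how far `delayed` lags `reference`, in seconds.

    Both signals must share one sample rate and have equal length.
    """
    ref = reference - reference.mean()
    dly = delayed - delayed.mean()
    correlation = np.correlate(dly, ref, mode="full")
    # Index len(ref) - 1 of the full correlation corresponds to zero lag.
    lag_samples = int(np.argmax(correlation)) - (len(ref) - 1)
    return lag_samples * sample_period


rng = np.random.default_rng(2)
SAMPLE_PERIOD = 0.01                     # 100 Hz, hypothetical
signal = rng.normal(size=1007)           # made-up broadband trace
REFERENCE = signal[7:]                   # reference system, no delay
DELAYED = signal[:-7]                    # same trace arriving 7 samples late
LATENCY = estimate_latency(REFERENCE, DELAYED, SAMPLE_PERIOD)
```

The resolution of such an estimate is one sample period; finer estimates would need interpolation around the correlation peak.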
ERIC Educational Resources Information Center
Delattre, Marie; Bonin, Patrick; Barry, Christopher
2006-01-01
The authors examined the effect of sound-to-spelling regularity on written spelling latencies and writing durations in a dictation task in which participants had to write each target word 3 times in succession. The authors found that irregular words (i.e., those containing low-probability phoneme-to-grapheme mappings) were slower both to…
Federal Register 2010, 2011, 2012, 2013, 2014
2013-08-13
... latency fiber connection option, and provide a waiver of installation fees for subscriptions through..., including a 40Gb fiber connection, a 10Gb fiber connection, a 1Gb fiber connection, and a 1Gb copper... fiber connection offering, which uses new ultra-low latency switches. A switch is a type of network...
The development of a general purpose ARM-based processing unit for the ATLAS TileCal sROD
NASA Astrophysics Data System (ADS)
Cox, M. A.; Reed, R.; Mellado, B.
2015-01-01
After Phase-II upgrades in 2022, the data output from the LHC ATLAS Tile Calorimeter will increase significantly. ARM processors are common in mobile devices due to their low cost, low energy consumption and high performance. It is proposed that a cost-effective, high data throughput Processing Unit (PU) can be developed by using several consumer ARM processors in a cluster configuration to allow aggregated processing performance and data throughput while maintaining minimal software design difficulty for the end-user. This PU could be used for a variety of high-level functions on the high-throughput raw data such as spectral analysis and histograms to detect possible issues in the detector at a low level. High-throughput I/O interfaces are not typical in consumer ARM System on Chips but high data throughput capabilities are feasible via the novel use of PCI-Express as the I/O interface to the ARM processors. An overview of the PU is given and the results for performance and throughput testing of four different ARM Cortex System on Chips are presented.
Castillo, Encarnación; López-Ramos, Juan A.; Morales, Diego P.
2018-01-01
Security is a critical challenge for the effective expansion of all new emerging applications in the Internet of Things paradigm. Therefore, it is necessary to define and implement different mechanisms for guaranteeing security and privacy of data interchanged within the multiple wireless sensor networks that are part of the Internet of Things. However, in this context, low power and low area are required, limiting the resources available for security and thus hindering the implementation of adequate security protocols. Group keys can save resources and communications bandwidth, but should be combined with public key cryptography to be really secure. In this paper, a compact and unified co-processor for enabling Elliptic Curve Cryptography along with the Advanced Encryption Standard, with low area requirements and Group-Key support, is presented. The designed co-processor allows securing wireless sensor networks independently of the communications protocols used. With an area occupancy of only 2101 LUTs on Spartan 6 devices from Xilinx, it requires 15% less area while achieving nearly 490% better performance when compared to cryptoprocessors with similar features in the literature. PMID:29337921
Low-voltage analog front-end processor design for ISFET-based sensor and H+ sensing applications
NASA Astrophysics Data System (ADS)
Chung, Wen-Yaw; Yang, Chung-Huang; Peng, Kang-Chu; Yeh, M. H.
2003-04-01
This paper presents a modular low-voltage analog-front-end processor design in a 0.5 μm double-poly double-metal CMOS technology for Ion Sensitive Field Effect Transistor (ISFET)-based sensor and H+ sensing applications. To match the potentiometric response of the ISFET, which is proportional to the H+ concentration, the constant-voltage and constant current (CVCS) testing configuration has been used. Low-voltage design techniques such as a bulk-driven input pair, a folded-cascode amplifier and bootstrap switch control circuits have been designed and integrated for a 1.5 V supply and nearly rail-to-rail analog-to-digital signal processing. Core modules consisting of an 8-bit two-step analog-digital converter and bulk-driven pre-amplifiers have been developed in this research. The experimental results show that the proposed circuitry has an acceptable linearity for 0.1 pH-H+ sensing conversions with buffer solutions in the range of pH2 to pH12. The processor has potential use in battery-operated and portable healthcare devices and environmental monitoring applications.
Jiang, Hua; Lu, Wenke; Zhang, Guoan
2013-07-01
In this paper, we propose a low-insertion-loss, miniaturized wavelet transform and inverse transform processor using surface acoustic wave (SAW) devices. The new SAW wavelet transform devices (WTDs) use a structure with two electrode-width-controlled (EWC) single phase unidirectional transducers (SPUDT-SPUDT). This structure consists of an input withdrawal-weighted interdigital transducer (IDT) and an output overlap-weighted IDT. Three experimental devices for the scales 2^-1, 2^-2, and 2^-3 are designed and measured. The minimum insertion losses of the three devices reach 5.49 dB, 4.81 dB, and 5.38 dB respectively, which are lower than earlier results. Both the electrode width and the number of electrode pairs are reduced, making the three devices much smaller than earlier devices. Therefore, the method described in this paper is suitable for implementing an arbitrary multi-scale, low-insertion-loss, miniaturized wavelet transform and inverse transform processor using SAW devices. Copyright © 2013 Elsevier B.V. All rights reserved.
Parrilla, Luis; Castillo, Encarnación; López-Ramos, Juan A; Álvarez-Bermejo, José A; García, Antonio; Morales, Diego P
2018-01-16
Security is a critical challenge for the effective expansion of all new emerging applications in the Internet of Things paradigm. Therefore, it is necessary to define and implement different mechanisms for guaranteeing security and privacy of data interchanged within the multiple wireless sensor networks that are part of the Internet of Things. However, in this context, low power and low area are required, limiting the resources available for security and thus hindering the implementation of adequate security protocols. Group keys can save resources and communications bandwidth, but should be combined with public key cryptography to be really secure. In this paper, a compact and unified co-processor for enabling Elliptic Curve Cryptography along with the Advanced Encryption Standard, with low area requirements and Group-Key support, is presented. The designed co-processor allows securing wireless sensor networks independently of the communications protocols used. With an area occupancy of only 2101 LUTs on Spartan 6 devices from Xilinx, it requires 15% less area while achieving nearly 490% better performance when compared to cryptoprocessors with similar features in the literature.
The Zadko Telescope: Exploring the Transient Universe
NASA Astrophysics Data System (ADS)
Coward, D. M.; Gendre, B.; Tanga, P.; Turpin, D.; Zadko, J.; Dodson, R.; Devogéle, M.; Howell, E. J.; Kennewell, J. A.; Boër, M.; Klotz, A.; Dornic, D.; Moore, J. A.; Heary, A.
2017-01-01
The Zadko telescope is a 1 m f/4 Cassegrain telescope, situated in the state of Western Australia about 80 km north of Perth. The facility plays a niche role in Australian astronomy, as it is the only meter class facility in Australia dedicated to automated follow-up imaging of alerts or triggers received from different external instruments/detectors spanning the entire electromagnetic spectrum. Furthermore, the location of the facility at a longitude not covered by other meter class facilities provides an important resource for time critical projects. This paper reviews the status of the Zadko facility and science projects since it began robotic operations in March 2010. We report on major upgrades to the infrastructure and equipment (2012-2014) that have resulted in significantly improved robotic operations. Second, we review the core science projects, which include automated rapid follow-up of gamma ray burst (GRB) optical afterglows, imaging of neutrino counterpart candidates from the ANTARES neutrino observatory, photometry of rare (Barbarian) asteroids, and supernova searches in nearby galaxies. Finally, we discuss participation in newly commencing international projects, including the optical follow-up of gravitational wave (GW) candidates from the United States and European GW observatory network, and present first tests for very low latency follow-up of fast radio bursts. In the context of these projects, we outline plans for a future upgrade that will optimise the facility for alert triggered imaging from the radio, optical, high-energy, neutrino, and GW bands.
Pasman, J W; Rotteveel, J J; de Graaf, R; Stegeman, D F; Visco, Y M
1992-12-01
Recent studies on the maturation of auditory brainstem evoked responses (ABRs) present conflicting results, whereas only sparse reports exist with respect to the maturation of middle latency auditory evoked responses (MLRs) and auditory cortical evoked responses (ACRs). The present study reports the effect of preterm birth on the maturation of auditory evoked responses in low risk preterm infants (27-34 weeks conceptional age). The ABRs indicate a consistent trend towards longer latencies for all individual ABR components and towards longer interpeak latencies in preterm infants. The MLR shows longer latencies for early component P0 in preterm infants. The ACRs show a remarkable difference between preterm and term infants. At 40 weeks CA the latencies of ACR components Na and P2 are significantly longer in term infants, whereas at 52 weeks CA the latencies of the same ACR components are shorter in term infants. The results support the hypothesis that retarded myelination of the central auditory pathway is partially responsible for differences found between preterm infants and term infants with respect to late ABR components and early MLR component P0. Furthermore, mild conductive hearing loss in preterm infants may also play its role. A more complex mechanism is implicated to account for the findings noted with respect to MLR component Na and ACR components Na and P2.
Implementing wavelet inverse-transform processor with surface acoustic wave device.
Lu, Wenke; Zhu, Changchun; Liu, Qinghong; Zhang, Jingduan
2013-02-01
The objective of this research was to investigate implementation schemes for a wavelet inverse-transform processor using a surface acoustic wave (SAW) device, the length function that defines the electrodes, and ways of solving the load-resistance and internal-resistance problems of such a processor. In the implementation scheme in which the input interdigital transducer (IDT) and output IDT stand in a line, the electrode-overlap envelope of the input IDT is identical to that of the output IDT (i.e. the two transducers are identical), so the product of the input IDT's frequency response and the output IDT's frequency response can be implemented and the wavelet inverse-transform processor can be fabricated. X-112°Y LiTaO3 is used as the substrate material. The processor built with this scheme is small, so its cost is low. The design proceeds in three steps: first, the length function of the electrodes is defined from the envelope function of the wavelet function; then, the electrode lengths are calculated from the length function; finally, the input IDT and output IDT are designed from the electrode lengths and widths. We also identify the load resistance and the internal resistance as two problems of the wavelet inverse-transform processor using SAW devices and present solutions to them: when amplifiers are connected to the input and output ends of the processor, they eliminate the influence of the load resistance and the internal resistance on its output voltage. Copyright © 2012 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yancey, Cregg C.; Shawhan, Peter; Bear, Brandon E.
We explore opportunities for multi-messenger astronomy using gravitational waves (GWs) and prompt, transient low-frequency radio emission to study highly energetic astrophysical events. We review the literature on possible sources of correlated emission of GWs and radio transients, highlighting proposed mechanisms that lead to a short-duration, high-flux radio pulse originating from the merger of two neutron stars or from a superconducting cosmic string cusp. We discuss the detection prospects for each of these mechanisms by low-frequency dipole array instruments such as LWA1, the Low Frequency Array and the Murchison Widefield Array. We find that a broad range of models may be tested by searching for radio pulses that, when de-dispersed, are temporally and spatially coincident with a LIGO/Virgo GW trigger within a ∼30 s time window and ∼200–500 deg² sky region. We consider various possible observing strategies and discuss their advantages and disadvantages. Uniquely, for low-frequency radio arrays, dispersion can delay the radio pulse until after low-latency GW data analysis has identified and reported an event candidate, enabling a prompt radio signal to be captured by a deliberately targeted beam. If neutron star mergers do have detectable prompt radio emissions, a coincident search with the GW detector network and low-frequency radio arrays could increase the LIGO/Virgo effective search volume by up to a factor of ∼2. For some models, we also map the parameter space that may be constrained by non-detections.
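The timing argument in the abstract above rests on the cold-plasma dispersion delay, which scales as DM/ν². The following is a minimal sketch, assuming the standard dispersion constant of about 4.149×10³ s MHz² pc⁻¹ cm³; the dispersion measure and observing frequencies are illustrative values chosen here, not figures from the paper.

```python
# Cold-plasma dispersion delay relative to an infinitely high frequency.
# K_DM_S is the standard dispersion constant in s * MHz^2 / (pc cm^-3).
K_DM_S = 4.148808e3


def dispersion_delay_s(dm_pc_cm3: float, freq_mhz: float) -> float:
    """Return the arrival delay in seconds at freq_mhz for a given dispersion measure."""
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    return K_DM_S * dm_pc_cm3 / freq_mhz ** 2


if __name__ == "__main__":
    # Hypothetical example: DM = 100 pc cm^-3 observed at 60 MHz and 150 MHz.
    for freq in (60.0, 150.0):
        print(f"{freq:.0f} MHz: {dispersion_delay_s(100.0, freq):.1f} s")
```

Because the delay grows as the inverse square of frequency, arrays observing at tens of MHz see the pulse much later than higher-frequency instruments do, which is the window the abstract refers to.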
Discrete sensitivity derivatives of the Navier-Stokes equations with a parallel Krylov solver
NASA Technical Reports Server (NTRS)
Ajmani, Kumud; Taylor, Arthur C., III
1994-01-01
This paper solves an 'incremental' form of the sensitivity equations derived by differentiating the discretized thin-layer Navier Stokes equations with respect to certain design variables of interest. The equations are solved with a parallel, preconditioned Generalized Minimal RESidual (GMRES) solver on a distributed-memory architecture. The 'serial' sensitivity analysis code is parallelized by using the Single Program Multiple Data (SPMD) programming model, domain decomposition techniques, and message-passing tools. Sensitivity derivatives are computed for low and high Reynolds number flows over a NACA 1406 airfoil on a 32-processor Intel Hypercube, and found to be identical to those computed on a single-processor Cray Y-MP. It is estimated that the parallel sensitivity analysis code has to be run on 40-50 processors of the Intel Hypercube in order to match the single-processor processing time of a Cray Y-MP.
Scalable architecture for a room temperature solid-state quantum information processor.
Yao, N Y; Jiang, L; Gorshkov, A V; Maurer, P C; Giedke, G; Cirac, J I; Lukin, M D
2012-04-24
The realization of a scalable quantum information processor has emerged over the past decade as one of the central challenges at the interface of fundamental science and engineering. Here we propose and analyse an architecture for a scalable, solid-state quantum information processor capable of operating at room temperature. Our approach is based on recent experimental advances involving nitrogen-vacancy colour centres in diamond. In particular, we demonstrate that the multiple challenges associated with operation at ambient temperature, individual addressing at the nanoscale, strong qubit coupling, robustness against disorder and low decoherence rates can be simultaneously achieved under realistic, experimentally relevant conditions. The architecture uses a novel approach to quantum information transfer and includes a hierarchy of control at successive length scales. Moreover, it alleviates the stringent constraints currently limiting the realization of scalable quantum processors and will provide fundamental insights into the physics of non-equilibrium many-body quantum systems.
A performance study of the time-varying cache behavior: a study on APEX, Mantevo, NAS, and PARSEC
Siddique, Nafiul A.; Grubel, Patricia A.; Badawy, Abdel-Hameed A.; ...
2017-09-20
Cache has long been used to minimize the latency of main memory accesses by storing frequently used data near the processor. Processor performance depends on the underlying cache performance. Therefore, significant research has been done to identify the most crucial metrics of cache performance. Although the majority of research focuses on measuring cache hit rates and data movement as the primary cache performance metrics, cache utilization is significantly important. We investigate the application’s locality using cache utilization metrics. In addition, we present cache utilization and traditional cache performance metrics as the program progresses providing detailed insights into the dynamic application behavior on parallel applications from four benchmark suites running on multiple cores. We explore cache utilization for APEX, Mantevo, NAS, and PARSEC, mostly scientific benchmark suites. Our results indicate that 40% of the data bytes in a cache line are accessed at least once before line eviction. Also, on average a byte is accessed two times before the cache line is evicted for these applications. Moreover, we present runtime cache utilization, as well as conventional performance metrics, that illustrate a holistic understanding of cache behavior. To facilitate this research, we build a memory simulator incorporated into the Structural Simulation Toolkit (Rodrigues et al. in SIGMETRICS Perform Eval Rev 38(4):37–42, 2011). Finally, our results suggest that variable cache line size can result in better performance and can also conserve power.
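The utilization metric described above (the fraction of bytes in a cache line touched before the line is evicted) can be illustrated with a toy model. This is a minimal sketch assuming a direct-mapped cache; the function name, line size and cache size are hypothetical and unrelated to the authors' Structural Simulation Toolkit simulator.

```python
from typing import Dict, Iterable, List, Set, Tuple


def line_utilization(accesses: Iterable[Tuple[int, int]],
                     line_size: int = 64,
                     num_lines: int = 4) -> List[float]:
    """Return, for each evicted line, the fraction of its bytes touched at least once.

    accesses holds (address, size_in_bytes) pairs. The cache is direct-mapped
    (an assumption of this sketch). Lines still resident at the end of the
    trace are not reported, because they were never evicted.
    """
    resident: Dict[int, Tuple[int, Set[int]]] = {}  # slot -> (line number, touched offsets)
    evicted: List[float] = []
    for addr, size in accesses:
        for byte in range(addr, addr + size):
            line_no, offset = divmod(byte, line_size)
            slot = line_no % num_lines
            entry = resident.get(slot)
            if entry is None or entry[0] != line_no:
                if entry is not None:
                    # A different line occupies this slot: evict it and record its utilization.
                    evicted.append(len(entry[1]) / line_size)
                entry = (line_no, set())
                resident[slot] = entry
            entry[1].add(offset)
    return evicted


if __name__ == "__main__":
    # Lines 0 and 4 map to the same slot, so each access below evicts the previous line.
    print(line_utilization([(0, 16), (256, 8), (0, 4)]))
```

Averaging the returned fractions over a whole trace gives a single utilization figure comparable in spirit to the 40% value quoted in the abstract.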
A performance study of the time-varying cache behavior: a study on APEX, Mantevo, NAS, and PARSEC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Siddique, Nafiul A.; Grubel, Patricia A.; Badawy, Abdel-Hameed A.
Cache has long been used to minimize the latency of main memory accesses by storing frequently used data near the processor. Processor performance depends on the underlying cache performance. Therefore, significant research has been done to identify the most crucial metrics of cache performance. Although the majority of research focuses on measuring cache hit rates and data movement as the primary cache performance metrics, cache utilization is significantly important. We investigate the application’s locality using cache utilization metrics. In addition, we present cache utilization and traditional cache performance metrics as the program progresses providing detailed insights into the dynamic application behavior on parallel applications from four benchmark suites running on multiple cores. We explore cache utilization for APEX, Mantevo, NAS, and PARSEC, mostly scientific benchmark suites. Our results indicate that 40% of the data bytes in a cache line are accessed at least once before line eviction. Also, on average a byte is accessed two times before the cache line is evicted for these applications. Moreover, we present runtime cache utilization, as well as conventional performance metrics, that illustrate a holistic understanding of cache behavior. To facilitate this research, we build a memory simulator incorporated into the Structural Simulation Toolkit (Rodrigues et al. in SIGMETRICS Perform Eval Rev 38(4):37–42, 2011). Finally, our results suggest that variable cache line size can result in better performance and can also conserve power.
Generic Software Architecture for Launchers
NASA Astrophysics Data System (ADS)
Carre, Emilien; Gast, Philippe; Hiron, Emmanuel; Leblanc, Alain; Lesens, David; Mescam, Emmanuelle; Moro, Pierre
2015-09-01
The definition and reuse of a generic software architecture for launchers is uncommon for several reasons: the number of European launcher families is very small (Ariane 5 and Vega in recent decades); the real-time constraints (reactivity and determinism needs) are very hard; and low levels of versatility are required (often implying an ad hoc development for each launcher mission). In comparison, satellites are often built on a generic platform made up of reusable hardware building blocks (processors, star-trackers, gyroscopes, etc.) and reusable software building blocks (middleware, TM/TC, On Board Control Procedure, etc.). Although some of these reasons are still valid (e.g. the limited number of developments), the increase in available CPU power now makes achievable an approach based on a generic time-triggered middleware (ensuring the full determinism of the system) and a centralised mission and vehicle management (offering more flexibility in the design and facilitating long-term maintenance). This paper presents an example of a generic software architecture which could be envisaged for future launchers, based on the previously described principles and supported by model driven engineering and automatic code generation.
Polito, Letizia; Davin, Annalisa; Vaccaro, Roberta; Abbondanza, Simona; Govoni, Stefano; Racchi, Marco; Guaita, Antonio
2015-04-01
Previous studies have documented the involvement of the central nervous system serotonin in promoting wakefulness. There are few and conflicting results over whether there is an actual association between bearing the short allele of serotonin transporter promoter polymorphism (5-HTTLPR) and worse sleep quality. This study examined whether sleep onset latency complaint is associated with the 5-HTTLPR triallelic polymorphism in the SLC6A4 gene promoter and whether this polymorphism influences the relationship between sleep onset latency complaint and depressive symptoms in elderly people. A total of 1321 community-dwelling individuals aged 70-74 years were interviewed for sleep onset latency complaint and for sleep medication consumption. Participants' genomic DNA was typed for 5-HTTLPR and rs25531 polymorphisms. Depressive symptoms were evaluated with the Geriatric Depression Scale Short form and general medical comorbidity was assessed by the Cumulative Illness Rating Scale. The presence of a past history of depression was recorded. The S' allele of the 5-HTTLPR triallelic polymorphism was associated with sleep onset latency complaint. This association was maintained after adjusting for depressive symptoms, sex, age, history of depression and medical comorbidity. After stratification for 5-HTTLPR/rs25531, only in S'S' individuals high depressive symptoms were actually associated with sleep onset latency complaint. These data indicate that the low-expressing 5-HTTLPR triallelic polymorphism is an independent risk factor for sleep onset latency disturbance. Furthermore, the 5-HTTLPR genotype influences the association between depressive symptoms and sleep onset latency complaint. © 2014 European Sleep Research Society.
Analysis of a hardware and software fault tolerant processor for critical applications
NASA Technical Reports Server (NTRS)
Dugan, Joanne B.
1993-01-01
Computer systems for critical applications must be designed to tolerate software faults as well as hardware faults. A unified approach to tolerating hardware and software faults is characterized by classifying faults in terms of duration (transient or permanent) rather than source (hardware or software). Errors arising from transient faults can be handled through masking or voting, but errors arising from permanent faults require system reconfiguration to bypass the failed component. Most errors which are caused by software faults can be considered transient, in that they are input-dependent. Software faults are triggered by a particular set of inputs. Quantitative dependability analysis of systems which exhibit a unified approach to fault tolerance can be performed by a hierarchical combination of fault tree and Markov models. A methodology for analyzing hardware and software fault tolerant systems is applied to the analysis of a hypothetical system, loosely based on the Fault Tolerant Parallel Processor. The models consider both transient and permanent faults, hardware and software faults, independent and related software faults, automatic recovery, and reconfiguration.
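The masking-by-voting idea mentioned above can be sketched as a strict-majority voter over replicated outputs. This is an illustrative example only; the function name and the choice to return None when no majority exists are assumptions, not details of the Fault Tolerant Parallel Processor.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by a strict majority of replicas, or None if there is none."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # A strict majority is needed to mask a faulty replica; a tie is reported as no agreement.
    return value if count > len(outputs) // 2 else None


if __name__ == "__main__":
    print(majority_vote([42, 42, 7]))   # one replica disagrees and is outvoted
    print(majority_vote([1, 2, 3]))     # no two replicas agree
```

A single transient error in one of three replicas is masked by the vote, whereas a permanent fault would eventually call for the reconfiguration step the abstract describes.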
NASA Technical Reports Server (NTRS)
Pordes, Ruth (Editor)
1989-01-01
Papers on real-time computer applications in nuclear, particle, and plasma physics are presented, covering topics such as expert systems tactics in testing FASTBUS segment interconnect modules, trigger control in a high energy physics experiment, the FASTBUS read-out system for the Aleph time projection chamber, a multiprocessor data acquisition system, DAQ software architecture for Aleph, a VME multiprocessor system for plasma control at the JT-60 upgrade, and a multitasking, multisinked, multiprocessor data acquisition front end. Other topics include real-time data reduction using a microVAX processor, a transputer based coprocessor for VEDAS, simulation of a macropipelined multi-CPU event processor for use in FASTBUS, a distributed VME control system for the LISA superconducting Linac, and a distributed system for laboratory process automation. Additional topics include a structure macro assembler for the event handler, a data acquisition and control system for Thomson scattering on ATF, remote procedure execution software for distributed systems, and a PC-based graphic display real-time particle beam uniformity.
A FPGA embedded web server for remote monitoring and control of smart sensors networks.
Magdaleno, Eduardo; Rodríguez, Manuel; Pérez, Fernando; Hernández, David; García, Enrique
2013-12-27
This article describes the implementation of a web server using an embedded Altera NIOS II IP core, a general purpose and configurable RISC processor which is embedded in a Cyclone FPGA. The processor uses the μCLinux operating system to support a Boa web server of dynamic pages using Common Gateway Interface (CGI). The FPGA is configured to act like the master node of a network, and also to control and monitor a network of smart sensors or instruments. In order to develop a totally functional system, the FPGA also includes an implementation of the time-triggered protocol (TTP/A). Thus, the implemented master node has two interfaces: the web server, which acts as an Internet interface, and the other to control the network. This protocol is widely used to connect smart sensors, actuators and microsystems in embedded real-time systems in different application domains, e.g., industrial, automotive, domotic, etc., although it can be easily replaced by any other because of the inherent characteristics of the FPGA-based technology.
A FPGA Embedded Web Server for Remote Monitoring and Control of Smart Sensors Networks
Magdaleno, Eduardo; Rodríguez, Manuel; Pérez, Fernando; Hernández, David; García, Enrique
2014-01-01
This article describes the implementation of a web server using an embedded Altera NIOS II IP core, a general purpose and configurable RISC processor which is embedded in a Cyclone FPGA. The processor uses the μCLinux operating system to support a Boa web server of dynamic pages using Common Gateway Interface (CGI). The FPGA is configured to act like the master node of a network, and also to control and monitor a network of smart sensors or instruments. In order to develop a totally functional system, the FPGA also includes an implementation of the time-triggered protocol (TTP/A). Thus, the implemented master node has two interfaces: the web server, which acts as an Internet interface, and the other to control the network. This protocol is widely used to connect smart sensors, actuators and microsystems in embedded real-time systems in different application domains, e.g., industrial, automotive, domotic, etc., although it can be easily replaced by any other because of the inherent characteristics of the FPGA-based technology. PMID:24379047
A distributed control system for the lower-hybrid current drive system on the Tokamak de Varennes
NASA Astrophysics Data System (ADS)
Bagdoo, J.; Guay, J. M.; Chaudron, G.-A.; Decoste, R.; Demers, Y.; Hubbard, A.
1990-08-01
An rf current drive system with an output power of 1 MW at 3.7 GHz is under development for the Tokamak de Varennes. The control system is based on an Ethernet local-area network of programmable logic controllers as front end, personal computers as consoles, and CAMAC-based DSP processors. The DSP processors ensure the PID control of the phase and rf power of each klystron, and the fast protection of high-power rf hardware, all within a 40 μs loop. Slower control and protection, event sequencing and the run-time database are provided by the programmable logic controllers, which communicate, via the LAN, with the consoles. The latter run a commercial process-control console software. The LAN protocol respects the first four layers of the ISO/OSI 802.3 standard. Synchronization with the tokamak control system is provided by commercially available CAMAC timing modules which trigger shot-related events and reference waveform generators. A detailed description of each subsystem and a performance evaluation of the system will be presented.
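The PID regulation of klystron phase and power mentioned above follows the usual discrete-time form. Below is a minimal sketch that takes the 40 μs loop period from the abstract as the sample time; the class name and the gains are hypothetical, and nothing here reflects the actual DSP firmware.

```python
from typing import Optional


class PID:
    """Textbook discrete PID controller with a fixed sample period."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float) -> None:
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error: Optional[float] = None

    def update(self, setpoint: float, measured: float) -> float:
        """Return the control output for one sample."""
        error = setpoint - measured
        self.integral += error * self.dt
        # No derivative term on the first sample, to avoid a start-up kick.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


if __name__ == "__main__":
    # Hypothetical gains; dt matches the 40 microsecond loop quoted in the abstract.
    loop = PID(kp=2.0, ki=500.0, kd=0.0, dt=40e-6)
    print(loop.update(setpoint=1.0, measured=0.5))
```

In a fixed-period loop of this kind, the controller computation must finish well inside the sample time, which is one reason such loops are assigned to dedicated DSP hardware rather than to the slower programmable logic controllers.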
Multiple channel data acquisition system
Crawley, H. Bert; Rosenberg, Eli I.; Meyer, W. Thomas; Gorbics, Mark S.; Thomas, William D.; McKay, Roy L.; Homer, Jr., John F.
1990-05-22
A multiple channel data acquisition system for the transfer of large amounts of data from a multiplicity of data channels has a plurality of modules which operate in parallel to convert analog signals to digital data and transfer that data to a communications host via a FASTBUS. Each module has a plurality of submodules which include a front end buffer (FEB) connected to input circuitry having an analog to digital converter with cache memory for each of a plurality of channels. The submodules are interfaced with the FASTBUS via a FASTBUS coupler which controls a module bus and a module memory. The system is triggered to effect rapid parallel data samplings which are stored to the cache memories. The cache memories are uploaded to the FEBs during which zero suppression occurs. The data in the FEBs is reformatted and compressed by a local processor during transfer to the module memory. The FASTBUS coupler is used by the communications host to upload the compressed and formatted data from the module memory. The local processor executes programs which are downloaded to the module memory through the FASTBUS coupler.
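The zero suppression step mentioned above is not specified in detail in the abstract. A common approach, assumed here purely for illustration, is to subtract a pedestal and keep only the samples that exceed a threshold, together with their channel indices.

```python
from typing import List, Sequence, Tuple


def zero_suppress(samples: Sequence[int],
                  pedestal: int = 0,
                  threshold: int = 0) -> List[Tuple[int, int]]:
    """Keep (channel_index, pedestal-subtracted value) pairs whose value exceeds threshold."""
    return [(i, s - pedestal) for i, s in enumerate(samples) if s - pedestal > threshold]


if __name__ == "__main__":
    # Hypothetical readout of six channels, most of them empty.
    print(zero_suppress([0, 5, 0, 0, 12, 1], threshold=1))
```

Storing the index alongside each surviving value is what allows empty channels to be dropped without losing track of which channel fired.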
Multiple channel data acquisition system
Crawley, H.B.; Rosenberg, E.I.; Meyer, W.T.; Gorbics, M.S.; Thomas, W.D.; McKay, R.L.; Homer, J.F. Jr.
1990-05-22
A multiple channel data acquisition system for the transfer of large amounts of data from a multiplicity of data channels has a plurality of modules which operate in parallel to convert analog signals to digital data and transfer that data to a communications host via a FASTBUS. Each module has a plurality of submodules which include a front end buffer (FEB) connected to input circuitry having an analog to digital converter with cache memory for each of a plurality of channels. The submodules are interfaced with the FASTBUS via a FASTBUS coupler which controls a module bus and a module memory. The system is triggered to effect rapid parallel data samplings which are stored to the cache memories. The cache memories are uploaded to the FEBs during which zero suppression occurs. The data in the FEBs is reformatted and compressed by a local processor during transfer to the module memory. The FASTBUS coupler is used by the communications host to upload the compressed and formatted data from the module memory. The local processor executes programs which are downloaded to the module memory through the FASTBUS coupler. 25 figs.
NASA Astrophysics Data System (ADS)
Giusi, Giovanni; Liu, Scige J.; Di Giorgio, Anna M.; Galli, Emanuele; Pezzuto, Stefano; Farina, Maria; Spinoglio, Luigi
2014-08-01
SAFARI (SpicA FAR infrared Instrument) is a far-infrared imaging Fourier Transform Spectrometer for the SPICA mission. The Digital Processing Unit (DPU) of the instrument controls the overall instrument and implements the science data compression and packing. The DPU design is based on the use of a LEON family processor. In SAFARI, all instrument components are connected to the central DPU via SpaceWire links. On these links the science data, housekeeping and command flows are in some cases multiplexed; therefore the interface control shall be able to cope with variable throughput needs. The effective data transfer workload can be an issue for the overall system performance and becomes a critical parameter for the on-board software design, both at the application layer level and at lower, more HW related, levels. To analyze the system behavior in the presence of the expected demanding SAFARI science data flow, we carried out a series of performance tests using the standard GR-CPCI-UT699 LEON3-FT Development Board, provided by Aeroflex/Gaisler, connected to the emulator of the SAFARI science data links, in a point-to-point topology. Two different communication protocols have been used in the tests, the ECSS-E-ST-50-52C RMAP protocol and an internally defined one, the SAFARI internal data handling protocol. An incremental approach has been adopted to measure the system performance at different levels of communication protocol complexity. In all cases the performance has been evaluated by measuring the CPU workload and the bus latencies. The tests have been executed initially in a custom low level execution environment and finally using the Real-Time Executive for Multiprocessor Systems (RTEMS), which has been selected as the operating system to be used onboard SAFARI.
The preliminary results of the performance analysis confirmed the possibility of using a LEON3 CPU processor in the SAFARI DPU, but pointed out, in agreement with previous similar studies, the need to carefully design the overall architecture so that some of the DPU functionalities are implemented on additional processing devices.
Low-cost real-time infrared scene generation for image projection and signal injection
NASA Astrophysics Data System (ADS)
Buford, James A., Jr.; King, David E.; Bowden, Mark H.
1998-07-01
As cost becomes an increasingly important factor in the development and testing of Infrared sensors and flight computer/processors, the need for accurate hardware-in-the-loop (HWIL) simulations is critical. In the past, expensive and complex dedicated scene generation hardware was needed to attain the fidelity necessary for accurate testing. Recent technological advances and innovative applications of established technologies are beginning to allow development of cost-effective replacements for dedicated scene generators. These new scene generators are mainly constructed from commercial-off-the-shelf (COTS) hardware and software components. At the U.S. Army Aviation and Missile Command (AMCOM) Missile Research, Development, and Engineering Center (MRDEC), researchers have developed such a dynamic IR scene generator (IRSG) built around COTS hardware and software. The IRSG is used to provide dynamic inputs to an IR scene projector for in-band seeker testing and for direct signal injection into the seeker or processor electronics. AMCOM MRDEC has developed a second generation IRSG, namely IRSG2, using the latest Silicon Graphics Incorporated (SGI) Onyx2 with Infinite Reality graphics. As reported in previous papers, the SGI Onyx Reality Engine 2 is the platform of the original IRSG that is now referred to as IRSG1. IRSG1 has been in operation and used daily for the past three years on several IR projection and signal injection HWIL programs. Using this second generation IRSG, frame rates have increased from 120 Hz to 400 Hz and intensity resolution from 12 bits to 16 bits. The key features of the IRSGs are real time missile frame rates and frame sizes, dynamic missile-to-target(s) viewpoint updated each frame in real-time by a six-degree-of-freedom (6DOF) system under test (SUT) simulation, multiple dynamic objects (e.g. targets, terrain/background, countermeasures, and atmospheric effects), latency compensation, point-to-extended source anti-aliased targets, and sensor modeling effects. This paper provides a comparison between the IRSG1 and IRSG2 systems and focuses on the IRSG software, real time features, and database development tools.
LCTS on ALPHASAT and Sentinel 1a: in orbit status of the LEO to geo data relay system
NASA Astrophysics Data System (ADS)
Zech, H.; Heine, F.; Troendle, D.; Pimentel, P. M.; Panzlaff, K.; Motzigemba, M.; Meyer, R.; Philipp-May, S.
2017-11-01
The performance of sensors for Earth Observation Missions is constantly improving. This drives the need for a reliable, high-speed data transfer capability from a Low Earth Orbit (LEO) spacecraft (S/C) to ground. In addition, for the transfer of time-critical data to ground, a low latency between data generation in orbit and data reception at the respective mission control center is of high importance. Laser communication between satellites for high-rate data transmission, in combination with a GEO data relay system for reducing the latency, addresses these requirements.
Umriukhin, P E; Grigorchuk, O S
2015-12-01
In the present study we investigated whether open field behavior data can be used to predict the corticosterone level in rat blood plasma before and after stress. It is shown that the most reliable open field behavior parameters, reflecting a high probability of significant upregulation of corticosterone after 3 hours of immobilization, are a short latency of the first movement and low locomotor activity during the test. Rats with high corticosterone under normal, non-stress conditions are characterized by low locomotor activity and, in contrast, a long latency to enter the center of the open field.
Readout and trigger for the AFP detector at ATLAS experiment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kocian, M.
AFP, the ATLAS Forward Proton detector, consists of silicon detectors at 205 m and 217 m on each side of ATLAS. In 2016 two detectors on one side were installed. The FEI4 chips are read out at 160 Mbps over optical fibers. The DAQ system uses an FPGA board with an Artix chip and a mezzanine card with an RCE data processing module based on a Zynq chip with an ARM processor running ArchLinux. In this paper we give an overview of the AFP detector and of the commissioning steps taken to integrate it with the ATLAS TDAQ. Furthermore, first performance results are presented.
Readout and trigger for the AFP detector at ATLAS experiment
Kocian, M.
2017-01-25
AFP, the ATLAS Forward Proton detector, consists of silicon detectors at 205 m and 217 m on each side of ATLAS. In 2016 two detectors on one side were installed. The FEI4 chips are read out at 160 Mbps over optical fibers. The DAQ system uses an FPGA board with an Artix chip and a mezzanine card with an RCE data processing module based on a Zynq chip with an ARM processor running ArchLinux. In this paper we give an overview of the AFP detector and of the commissioning steps taken to integrate it with the ATLAS TDAQ. Furthermore, first performance results are presented.
Nanomagnet Logic: Architectures, design, and benchmarking
NASA Astrophysics Data System (ADS)
Kurtz, Steven J.
Nanomagnet Logic (NML) is an emerging technology being studied as a possible replacement or supplementary device for Complementary Metal-Oxide-Semiconductor (CMOS) Field-Effect Transistors (FET) by the year 2020. NML devices offer numerous potential advantages including: low energy operation, steady state non-volatility, radiation hardness and a clear path to fabrication and integration with CMOS. However, maintaining both low-energy operation and non-volatility while scaling from the device to the architectural level is non-trivial as (i) nearest neighbor interactions within NML circuits complicate the modeling of ensemble nanomagnet behavior and (ii) the energy intensive clock structures required for re-evaluation and NML's relatively high latency challenge its ability to offer system-level performance wins against other emerging nanotechnologies. Thus, further research efforts are required to model more complex circuits while also identifying circuit design techniques that balance low-energy operation with steady state non-volatility. In addition, further work is needed to design and model low-power on-chip clocks while simultaneously identifying application spaces where NML systems (including clock overhead) offer sufficient energy savings to merit their inclusion in future processors. This dissertation presents research advancing the understanding and modeling of NML at all levels including devices, circuits, and line clock structures while also benchmarking NML against both scaled CMOS and tunneling FET (TFET) devices. This is accomplished through the development of design tools and methodologies for (i) quantifying both energy and stability in NML circuits and (ii) evaluating line-clocked NML system performance. The application of these newly developed tools improves the understanding of ideal design criteria (i.e., magnet size, clock wire geometry, etc.) for NML architectures.
Finally, the system-level performance evaluation tool offers the ability to project what advancements are required for NML to realize performance improvements over scaled-CMOS hardware equivalents at the functional unit and/or application-level.
Albattat, Ali; Gruenwald, Benjamin C.; Yucelen, Tansel
2016-01-01
The last decade has witnessed an increased interest in physical systems controlled over wireless networks (networked control systems). These systems allow the computation of control signals via processors that are not attached to the physical systems, and the feedback loops are closed over wireless networks. The contribution of this paper is to design and analyze event-triggered decentralized and distributed adaptive control architectures for uncertain networked large-scale modular systems; that is, systems that consist of physically-interconnected modules controlled over wireless networks. Specifically, the proposed adaptive architectures guarantee overall system stability while reducing wireless network utilization and achieving a given system performance in the presence of system uncertainties that can result from modeling and degraded modes of operation of the modules and their interconnections with each other. In addition to the theoretical findings including rigorous system stability and the boundedness analysis of the closed-loop dynamical system, as well as the characterization of the effect of user-defined event-triggering thresholds and the design parameters of the proposed adaptive architectures on the overall system performance, an illustrative numerical example is further provided to demonstrate the efficacy of the proposed decentralized and distributed control approaches. PMID:27537894
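The way an event-triggering threshold reduces network utilization can be illustrated with a generic send-on-delta rule: a sample is transmitted only when it differs from the last transmitted value by more than a user-defined threshold. This is a simplified sketch with hypothetical names; the triggering conditions in the paper are defined on the adaptive control signals and are more involved.

```python
from typing import List, Optional, Sequence


def event_triggered_transmissions(signal: Sequence[float], threshold: float) -> List[int]:
    """Return the sample indices at which a value would be sent over the network."""
    sent: List[int] = []
    last_sent: Optional[float] = None
    for k, x in enumerate(signal):
        # Always send the first sample; afterwards send only on a large enough change.
        if last_sent is None or abs(x - last_sent) > threshold:
            sent.append(k)
            last_sent = x
    return sent


if __name__ == "__main__":
    samples = [0.0, 0.05, 0.2, 0.22, 0.5]
    print(event_triggered_transmissions(samples, threshold=0.1))
```

A larger threshold sends fewer samples at the price of a coarser view of the state at the receiver, which is the utilization versus performance trade-off the abstract characterizes.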
Reducing the latency of the Fractal Iterative Method to half an iteration
NASA Astrophysics Data System (ADS)
Béchet, Clémentine; Tallon, Michel
2013-12-01
The fractal iterative method for atmospheric tomography (FRiM-3D) has been introduced to solve the wavefront reconstruction at the dimensions of an ELT with a low computational cost. Previous studies reported the requirement of only 3 iterations of the algorithm in order to provide the best adaptive optics (AO) performance. Nevertheless, any iterative method in adaptive optics suffers from the intrinsic latency induced by the fact that one iteration can start only once the previous one is completed. Iterations hardly match the low-latency requirement of the AO real-time computer. We present here a new approach to avoid iterations in the computation of the commands with FRiM-3D, thus allowing low-latency AO response even at the scale of the European ELT (E-ELT). The method highlights the importance of the "warm-start" strategy in adaptive optics. To our knowledge, this particular way to use the "warm-start" has not been reported before. Furthermore, by removing the requirement of iterating to compute the commands, the computational cost of the reconstruction with FRiM-3D can be simplified and reduced to at most half the computational cost of a classical iteration. Thanks to simulations of both single-conjugate and multi-conjugate AO for the E-ELT, with FRiM-3D on the Octopus ESO simulator, we demonstrate the benefit of this approach. We finally enhance the robustness of this new implementation with respect to increasing measurement noise, wind speed and even modeling errors.
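The warm-start idea in this abstract can be illustrated on a toy problem. The sketch below is not FRiM-3D: it swaps in plain Jacobi iteration on a made-up 2x2 system, and the matrix, right-hand sides, and tolerance are assumptions chosen only for illustration. It shows that when consecutive right-hand sides differ slightly, starting from the previous solution needs fewer iterations than starting from zero.

```python
def jacobi(a, b, x0, tol=1e-10, max_iter=1000):
    """Solve the 2x2 system a @ x = b by Jacobi iteration.

    Returns (x, number_of_iterations_performed).
    """
    x = list(x0)
    for k in range(max_iter):
        # Residual r = b - a @ x for the current estimate.
        r0 = b[0] - (a[0][0] * x[0] + a[0][1] * x[1])
        r1 = b[1] - (a[1][0] * x[0] + a[1][1] * x[1])
        if max(abs(r0), abs(r1)) < tol:
            return x, k
        # Jacobi update: x_i <- x_i + r_i / a_ii.
        x = [x[0] + r0 / a[0][0], x[1] + r1 / a[1][1]]
    return x, max_iter

# Hypothetical, diagonally dominant system (illustrative values only).
A = [[4.0, 1.0], [1.0, 3.0]]
b_prev = [1.00, 2.00]   # "previous frame" measurements
b_now = [1.01, 2.01]    # "current frame", slightly changed

x_prev, _ = jacobi(A, b_prev, [0.0, 0.0])
_, cold_iters = jacobi(A, b_now, [0.0, 0.0])   # cold start from zero
x_now, warm_iters = jacobi(A, b_now, x_prev)   # warm start from previous solution

print("cold start iterations:", cold_iters)
print("warm start iterations:", warm_iters)
```

The gain grows as the change between consecutive frames shrinks, which is the regime a closed-loop AO system operates in.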
Latency in Visionic Systems: Test Methods and Requirements
NASA Technical Reports Server (NTRS)
Bailey, Randall E.; Arthur, J. J., III; Williams, Steven P.; Kramer, Lynda J.
2005-01-01
A visionics device creates a pictorial representation of the external scene for the pilot. The ultimate objective of these systems may be to electronically generate a form of Visual Meteorological Conditions (VMC) to eliminate weather or time-of-day as an operational constraint and provide enhancement over actual visual conditions where eye-limiting resolution may be a limiting factor. Empirical evidence has shown that the total system delays or latencies, including those of the imaging sensors and display systems, can critically degrade their utility, usability, and acceptability. Definitions and measurement techniques are offered herein as common test and evaluation methods for latency testing in visionics device applications. Based upon available data, very different latency requirements are indicated depending on the piloting task, the role in which the visionics device is used in this task, and the characteristics of the visionics cockpit display device including its resolution, field-of-regard, and field-of-view. The least stringent latency requirements will involve Head-Up Display (HUD) applications, where the visionics imagery provides situational information as a supplement to symbology guidance and command information. Conversely, the visionics system latency requirement for a large field-of-view Head-Worn Display application, providing a Virtual-VMC capability from which the pilot will derive visual guidance, will be the most stringent, having a value as low as 20 msec.
NASA Technical Reports Server (NTRS)
Smith, Kelly; Gay, Robert; Stachowiak, Susan
2013-01-01
In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
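The two-stage workflow described above (fit a classifier on the ground, then evaluate it cheaply in flight) can be sketched as follows. This is a minimal illustration and not the authors' method or data: the single noisy-altitude feature, the 8 km deployment boundary, the noise level, and all function names are assumptions made up for the example.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(samples, labels, lr=1.0, epochs=2000):
    """Fit p(deploy | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def scale(measured_alt_km):
    """Map altitude in km to roughly [-1, 1] (hypothetical scaling)."""
    return (measured_alt_km - 10.0) / 10.0

# "Ground processing": synthetic training data, all values invented.
random.seed(0)
DEPLOY_BELOW_KM = 8.0
true_alt = [random.uniform(0.0, 20.0) for _ in range(400)]
measured = [h + random.gauss(0.0, 1.0) for h in true_alt]   # noisy sensor
labels = [1.0 if h < DEPLOY_BELOW_KM else 0.0 for h in true_alt]
w, b = train([scale(m) for m in measured], labels)

def trigger(measured_alt_km, threshold=0.5):
    """"In-flight" step: one multiply-add and one sigmoid per evaluation."""
    return sigmoid(w * scale(measured_alt_km) + b) > threshold

accuracy = sum(
    trigger(m) == (y == 1.0) for m, y in zip(measured, labels)
) / len(labels)
print("training accuracy:", accuracy)
```

The expensive part (training) happens offline, so the flight-side cost is only the evaluation of the fitted coefficients; raising `threshold` would trade missed deployments against premature ones.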
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.
Zierke, Stephanie; Bakos, Jason D
2010-04-12
Maximum Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
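To make the "loop with no dependencies between iterations" concrete, here is a small software sketch of the per-site PLF step at one internal tree node (Felsenstein's pruning recursion). It is an illustration only, not the paper's FPGA design: the Jukes-Cantor substitution model, the branch length, and the function names are assumptions chosen to keep the example short.

```python
import math

STATES = "ACGT"

def jc69(t):
    """Jukes-Cantor transition probabilities for branch length t
    (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def tip(seq):
    """One-hot conditional likelihood vectors for an observed sequence."""
    return [[1.0 if s == base else 0.0 for s in STATES] for base in seq]

def plf(left, p_left, right, p_right):
    """Conditional likelihoods at a parent node.

    Each site is computed independently of every other site, which is
    what makes the kernel easy to pipeline or parallelise.
    """
    parent = []
    for l_site, r_site in zip(left, right):
        parent.append([
            sum(p_left[i][j] * l_site[j] for j in range(4))
            * sum(p_right[i][j] * r_site[j] for j in range(4))
            for i in range(4)
        ])
    return parent

P = jc69(0.1)                                # same branch length on both sides
parent = plf(tip("AC"), P, tip("AG"), P)     # two sites, two child sequences
print(parent[0])
```

Every site performs the same fixed sequence of multiplies and adds, so the loop body maps naturally onto a deep hardware pipeline.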
NASA Technical Reports Server (NTRS)
Phyne, J. R.; Nelson, M. D.
1975-01-01
The design and implementation of hardware and software systems involved in using a 40,000 bit/second communication line as the connecting link between an IMLAC PDS 1-D display computer and a Univac 1108 computer system were described. The IMLAC consists of two independent processors sharing a common memory. The display processor generates the deflection and beam control currents as it interprets a program contained in the memory; the minicomputer has a general instruction set and is responsible for starting and stopping the display processor and for communicating with the outside world through the keyboard, teletype, light pen, and communication line. The processing time associated with each data byte was minimized by designing the input and output processes as finite state machines which automatically sequence from each state to the next. Several tests of the communication link and the IMLAC software were made using a special low capacity computer grade cable between the IMLAC and the Univac.
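The per-byte finite state machine idea mentioned in this abstract can be sketched in modern terms as below. The frame layout (a start marker, a length byte, then the payload) and all names are hypothetical; the report's actual protocol is not described here. The point is only that each incoming byte costs one dispatch, and each state names its own successor.

```python
START = 0x02  # hypothetical start-of-frame marker

class Receiver:
    """Byte-at-a-time receiver; the current state is a bound method."""

    def __init__(self):
        self.state = self.wait_start
        self.remaining = 0
        self.payload = bytearray()
        self.frames = []

    def feed(self, byte):
        # One dispatch per byte: the handler returns the next state.
        self.state = self.state(byte)

    def wait_start(self, byte):
        # Ignore everything until the start marker is seen.
        return self.read_length if byte == START else self.wait_start

    def read_length(self, byte):
        self.remaining = byte
        self.payload = bytearray()
        return self.read_payload if byte else self.finish()

    def read_payload(self, byte):
        self.payload.append(byte)
        self.remaining -= 1
        return self.read_payload if self.remaining else self.finish()

    def finish(self):
        # Store the completed frame and go back to hunting for a start marker.
        self.frames.append(bytes(self.payload))
        return self.wait_start

rx = Receiver()
for value in bytes([0x00, 0x02, 0x03, 0x41, 0x42, 0x43, 0x02, 0x01, 0x5A]):
    rx.feed(value)
print(rx.frames)
```

Because no handler loops or searches, the work done per byte is bounded and small, which is the property the abstract attributes to the state-machine design.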
On-board computational efficiency in real time UAV embedded terrain reconstruction
NASA Astrophysics Data System (ADS)
Partsinevelos, Panagiotis; Agadakos, Ioannis; Athanasiou, Vasilis; Papaefstathiou, Ioannis; Mertikas, Stylianos; Kyritsis, Sarantis; Tripolitsiotis, Achilles; Zervos, Panagiotis
2014-05-01
In the last few years, there has been a surge of applications for object recognition, interpretation and mapping using unmanned aerial vehicles (UAV). Specifications in constructing those UAVs are highly diverse with contradictory characteristics including cost-efficiency, carrying weight, flight time, mapping precision, real time processing capabilities, etc. In this work, a hexacopter UAV is employed for near real time terrain mapping. The main challenge addressed is to retain a low cost flying platform with real time processing capabilities. The UAV weight limitation, which affects the overall flight time, makes the selection of the on-board processing components particularly critical. On the other hand, surface reconstruction, as a computationally demanding task, calls for a high-performance processing unit on board. To merge these two contradicting aspects along with customized development, a System on a Chip (SoC) integrated circuit is proposed as a low-power, low-cost processor, which natively supports camera sensors and positioning and navigation systems. Modern SoCs, such as Omap3530 or Zynq, are classified as heterogeneous devices and provide a versatile platform, allowing access to both general purpose processors, such as the ARM11, as well as specialized processors, such as a digital signal processor and floating field-programmable gate array. A UAV equipped with the proposed embedded processors allows on-board terrain reconstruction using stereo vision in near real time. Furthermore, according to the frame rate required, additional image processing may concurrently take place, such as image rectification and object detection. Lastly, the onboard positioning and navigation (e.g., GNSS) chip may further improve the quality of the generated map. The resulting terrain maps are compared to ground truth geodetic measurements in order to assess the accuracy limitations of the overall process. It is shown that with our proposed novel system, there is much potential in computational efficiency on board and in optimized time constraints.
A z-Vertex Trigger for Belle II
NASA Astrophysics Data System (ADS)
Skambraks, S.; Abudinén, F.; Chen, Y.; Feindt, M.; Frühwirth, R.; Heck, M.; Kiesling, C.; Knoll, A.; Neuhaus, S.; Paul, S.; Schieck, J.
2015-08-01
The Belle II experiment will go into operation at the upgraded SuperKEKB collider in 2016. SuperKEKB is designed to deliver an instantaneous luminosity L = 8 × 10^35 cm^-2 s^-1. The experiment will therefore have to cope with a much larger machine background than its predecessor Belle, in particular from events outside of the interaction region. We present the concept of a track trigger, based on a neural network approach, that is able to suppress a large fraction of this background by reconstructing the z (longitudinal) position of the event vertex within the latency of the first level trigger. The trigger uses the hit information from the Central Drift Chamber (CDC) of Belle II within narrow cones in polar and azimuthal angle as well as in transverse momentum (“sectors”), and estimates the z-vertex without explicit track reconstruction. The preprocessing for the track trigger is based on the track information provided by the standard CDC trigger. It takes input from the 2D track finder, adds information from the stereo wires of the CDC, and finds the appropriate sectors in the CDC for each track. Within the sector, the z-vertex is estimated by a specialized neural network, with the drift times from the CDC as input and a continuous output corresponding to the scaled z-vertex. The neural algorithm will be implemented in programmable hardware. To this end a Virtex 7 FPGA board will be used, which provides at present the most promising solution for a fully parallelized implementation of neural networks or alternative multivariate methods. A high speed interface for external memory will be integrated into the platform, to be able to store the O(10^9) parameters required. The contribution presents the results of our feasibility studies and discusses the details of the envisaged hardware solution.
Arbitration in crossbar interconnect for low latency
Ohmacht, Martin; Sugavanam, Krishnan
2013-02-05
A system and method and computer program product for reducing the latency of signals communicated through a crossbar switch, the method including using at slave arbitration logic devices associated with Slave devices for which access is requested from one or more Master devices, two or more priority vector signals cycled among their use every clock cycle for selecting one of the requesting Master devices and updates the respective priority vector signal used every clock cycle. Similarly, each Master for which access is requested from one or more Slave devices, can have two or more priority vectors and can cycle among their use every clock cycle to further reduce latency and increase throughput performance via the crossbar.
Flow of a Gas Turbine Engine Low-Pressure Subsystem Simulated
NASA Technical Reports Server (NTRS)
Veres, Joseph P.
1997-01-01
The NASA Lewis Research Center is managing a task to numerically simulate overnight, on a parallel computing testbed, the aerodynamic flow in the complete low-pressure subsystem (LPS) of a gas turbine engine. The model solves the three-dimensional Navier-Stokes flow equations through all the components within the LPS, as well as the external flow around the engine nacelle. The LPS modeling task is being performed by Allison Engine Company under the Small Engine Technology contract. The large computer simulation was evaluated on networked computer systems using 8, 16, and 32 processors, with the parallel computing efficiency reaching 75 percent when 16 processors were used.
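As a quick worked reading of the efficiency figure, assuming the usual definition (parallel efficiency = speedup divided by processor count, which the abstract does not spell out), 75 percent on 16 processors corresponds to a speedup of about 12 over a single processor. The helper names below are illustrative.

```python
def parallel_efficiency(speedup, n_processors):
    """Efficiency under the standard definition E = S / N."""
    return speedup / n_processors

def implied_speedup(efficiency, n_processors):
    """Speedup implied by a reported efficiency, S = E * N."""
    return efficiency * n_processors

speedup_16 = implied_speedup(0.75, 16)
print(speedup_16)
```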
Park, Jeong Mee; Yong, Sang Yeol; Kim, Jong Heon; Kim, Hee; Park, Sang-Yoo
2014-01-01
Objective: To compare the diagnostic rates of the two widely used test positions for measuring vestibular evoked myogenic potentials (VEMP) and to select the most appropriate analytical method for diagnostic criteria in patients with vertigo. Methods: Thirty-two patients with vertigo were tested in two test positions: turning the head away from the side being evaluated and bowing while seated, and bowing while supine. Abnormalities were determined by prolonged latency of p13 or n23, shortening of the interpeak latency, and absence of VEMP formation. Results: Using the three criteria above for determining abnormalities, the seated and supine positions showed no significant difference in diagnostic rates; however, the concordance correlation of the two positions was low. When using only the prolonged latency of p13 or n23 in the two positions, diagnostic rates were not significantly different and their concordance correlation was high. On the other hand, using only the shortened interpeak latency in both positions showed no significant difference in diagnostic rates, and the degree of agreement between the two positions was low. Conclusion: Bowing while seated with the head turned away from the side being evaluated is found to be the best VEMP test position due to the consistent level of sternocleidomastoid muscle tension and the high level of compliance. Also, among the diagnostic analysis methods, using prolonged latency of p13 or n23 as the criterion is found to be the most appropriate method of analysis for the VEMP test. PMID:24855617
A Type of Low-Latency Data Gathering Method with Multi-Sink for Sensor Networks
Sha, Chao; Qiu, Jian-mei; Li, Shu-yan; Qiang, Meng-ye; Wang, Ru-chuan
2016-01-01
To balance energy consumption and reduce data transmission latency in Wireless Sensor Networks (WSNs), a low-latency data gathering method with multiple Sinks (LDGM for short) is proposed in this paper. The network is divided into several virtual regions consisting of three or fewer data gathering units, and the leader of each region is selected according to its residual energy as well as its distance to all of the other nodes. Only the leader of each region needs to communicate with the mobile Sinks, which effectively reduces energy consumption and end-to-end delay. Moreover, with the help of sleep scheduling and sensing radius adjustment strategies, redundancy in network coverage can also be effectively reduced. Simulation results show that LDGM is energy efficient in comparison with MST as well as MWST, and its time efficiency on data collection is higher than that of single-Sink data gathering methods. PMID:27338401
Low-Latency Telerobotics from Mars Orbit: The Case for Synergy Between Science and Human Exploration
NASA Technical Reports Server (NTRS)
Valinia, A.; Garvin, J. B.; Vondrak, R.; Thronson, H.; Lester, D.; Schmidt, G.; Fong, T.; Wilcox, B.; Sellers, P.; White, N.
2012-01-01
Initial, science-directed human exploration of Mars will benefit from capabilities in which human explorers remain in orbit to control telerobotic systems on the surface (Figure 1). Low-latency, high-bandwidth telerobotics (LLT) from Mars orbit offers opportunities for what the terrestrial robotics community considers to be high-quality telepresence. Such telepresence would provide high quality sensory perception and situation awareness, and even capabilities for dexterous manipulation as required for adaptive, informed selection of scientific samples [1]. Astronauts on orbit in close communication proximity to a surface exploration site (in order to minimize communication latency) represent a capability that would extend human cognition to Mars (and potentially to other bodies such as asteroids, Venus, the Moon, etc.) without the challenges, expense, and risk of putting those humans on hazardous surfaces or within deep gravity wells. Such a strategy may be consistent with goals for a human space flight program that are currently being developed within NASA.
Monitoring data transfer latency in CMS computing operations
Bonacorsi, Daniele; Diotalevi, Tommaso; Magini, Nicolo; ...
2015-12-23
During the first LHC run, the CMS experiment collected tens of Petabytes of collision and simulated data, which need to be distributed among dozens of computing centres with low latency in order to make efficient use of the resources. While the desired level of throughput has been successfully achieved, it is still common to observe transfer workflows that cannot reach full completion in a timely manner due to a small fraction of stuck files which require operator intervention. For this reason, in 2012 the CMS transfer management system, PhEDEx, was instrumented with a monitoring system to measure file transfer latencies, and to predict the completion time for the transfer of a data set. The operators can detect abnormal patterns in transfer latencies while the transfer is still in progress, and monitor the long-term performance of the transfer infrastructure to plan the data placement strategy. Based on the data collected for one year with the latency monitoring system, we present a study on the different factors that contribute to transfer completion time. As case studies, we analyze several typical CMS transfer workflows, such as distribution of collision event data from CERN or upload of simulated event data from the Tier-2 centres to the archival Tier-1 centres. For each workflow, we present the typical patterns of transfer latencies that have been identified with the latency monitor. We identify the areas in PhEDEx where a development effort can reduce the latency, and we show how we are able to detect stuck transfers which need operator intervention. Lastly, we propose a set of metrics to alert about stuck subscriptions and prompt for manual intervention, with the aim of improving transfer completion times.
Communication-Driven Codesign for Multiprocessor Systems
2004-01-01
processors, FPGA or ASIC subsystems, microprocessors, and microcontrollers. When a processor is embedded within a SLOT architecture, one or more...Broderson, Low-power CMOS digital design, IEEE Journal of Solid-State Circuits 27 (1992), no. 4, 473–484. [25] L. Chao and E. Sha, Scheduling data-flow...1997), 239–256. [82] P. K. Murthy, E. G. Cohen, and S. Rowland, System Canvas: A new design environment for embedded DSP and telecommunications
Fault-Tolerant, Real-Time, Multi-Core Computer System
NASA Technical Reports Server (NTRS)
Gostelow, Kim P.
2012-01-01
A document discusses a fault-tolerant, self-aware, low-power, multi-core computer for space missions with thousands of simple cores, achieving speed through concurrency. The proposed machine decides how to achieve concurrency in real time, rather than depending on programmers. The driving features of the system are simple hardware that is modular in the extreme, with no shared memory, and software with significant runtime reorganizing capability. The document describes a mechanism for moving ongoing computations and data that is based on a functional model of execution. Because there is no shared memory, the processor connects to its neighbors through a high-speed data link. Messages are sent to a neighbor switch, which in turn forwards that message on to its neighbor until reaching the intended destination. Except for the neighbor connections, processors are isolated and independent of each other. The processors on the periphery also connect chip-to-chip, thus building up a large processor net. There is no particular topology to the larger net, as a function at each processor allows it to forward a message in the correct direction. Some chip-to-chip connections are not necessarily nearest neighbors, providing short cuts for some of the longer physical distances. The peripheral processors also provide the connections to sensors, actuators, radios, science instruments, and other devices with which the computer system interacts.
Developing infrared array controller with software real time operating system
NASA Astrophysics Data System (ADS)
Sako, Shigeyuki; Miyata, Takashi; Nakamura, Tomohiko; Motohara, Kentaro; Uchimoto, Yuka Katsuno; Onaka, Takashi; Kataza, Hirokazu
2008-07-01
Real-time capabilities are required for a controller of a large format array to reduce the dead time caused by readout and data transfer. Real-time processing has been achieved with dedicated processors including DSP, CPLD, and FPGA devices. However, the dedicated processors have problems with memory resources, inflexibility, and high cost. Meanwhile, a recent PC has sufficient CPU and memory resources to control the infrared array and to process a large amount of frame data in real time. In this study, we have developed an infrared array controller with a software real-time operating system (RTOS) instead of the dedicated processors. A Linux PC equipped with an RTAI extension and a dual-core CPU is used as the main computer, and one of the CPU cores is allocated to the real-time processing. A digital I/O board with DMA functions is used as the I/O interface. The signal-processing cores are integrated in the OS kernel as a real-time driver module, which is composed of two virtual devices: the clock processor and the frame processor tasks. The array controller with the RTOS realizes complicated operations easily, flexibly, and at a low cost.