Sample records for multi-tasking multi-processor telemetry

  1. Telemetry Technology

    NASA Technical Reports Server (NTRS)

    1997-01-01

    In 1990, Avtec Systems, Inc. developed its first telemetry boards for Goddard Space Flight Center. Avtec products now include PC/AT, PCI and VME-based high speed I/O boards and turn-key systems. The most recent and most successful technology transfer from NASA to Avtec is the Programmable Telemetry Processor (PTP), a personal computer-based, multi-channel telemetry front-end processing system originally developed to support the NASA communication (NASCOM) network. The PTP performs data acquisition, real-time network transfer, and store-and-forward operations. There are over 100 PTP systems located in NASA facilities and throughout the world.

  2. Neurovision processor for designing intelligent sensors

    NASA Astrophysics Data System (ADS)

    Gupta, Madan M.; Knopf, George K.

    1992-03-01

    A programmable multi-task neuro-vision processor, called the Positive-Negative (PN) neural processor, is proposed as a plausible hardware mechanism for constructing robust multi-task vision sensors. The computational operations performed by the PN neural processor are loosely based on the neural activity fields exhibited by certain nervous tissue layers situated in the brain. The neuro-vision processor can be programmed to generate diverse dynamic behavior that may be used for spatio-temporal stabilization (STS), short-term visual memory (STVM), spatio-temporal filtering (STF) and pulse frequency modulation (PFM). A multi-functional vision sensor that performs a variety of information processing operations on time-varying two-dimensional sensory images can be constructed from a parallel and hierarchical structure of numerous individually programmed PN neural processors.

  3. Multi-heuristic dynamic task allocation using genetic algorithms in a heterogeneous distributed system

    PubMed Central

    Page, Andrew J.; Keane, Thomas M.; Naughton, Thomas J.

    2010-01-01

    We present a multi-heuristic evolutionary task allocation algorithm to dynamically map tasks to processors in a heterogeneous distributed system. It utilizes a genetic algorithm, combined with eight common heuristics, in an effort to minimize the total execution time. It operates on batches of unmapped tasks and can preemptively remap tasks to processors. The algorithm has been implemented on a Java distributed system and evaluated with a set of six problems from the areas of bioinformatics, biomedical engineering, computer science and cryptography. Experiments using up to 150 heterogeneous processors show that the algorithm achieves better efficiency than other state-of-the-art heuristic algorithms. PMID:20862190
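
    To make the encoding concrete, the sketch below shows one common way a genetic algorithm can represent a task-to-processor mapping and score it by makespan. It is illustrative only: the class and field names are invented here, and the paper's actual algorithm combines eight heuristics and preemptive remapping that this toy (1+1)-style loop omits.

      import java.util.Random;

      // Chromosome: task t runs on processor chrom[t]. Fitness: makespan,
      // i.e. the finish time of the most-loaded processor (lower is better).
      public class TaskAllocGA {
          static final Random RNG = new Random(42);

          // cost[t][p] = execution time of task t on heterogeneous processor p
          static double makespan(int[] chrom, double[][] cost, int procs) {
              double[] load = new double[procs];
              for (int t = 0; t < chrom.length; t++)
                  load[chrom[t]] += cost[t][chrom[t]];
              double max = 0;
              for (double l : load) max = Math.max(max, l);
              return max;
          }

          // Point mutation: reassign one random task to a random processor.
          static void mutate(int[] chrom, int procs) {
              chrom[RNG.nextInt(chrom.length)] = RNG.nextInt(procs);
          }

          public static void main(String[] args) {
              int tasks = 50, procs = 8;
              double[][] cost = new double[tasks][procs];
              for (double[] row : cost)
                  for (int p = 0; p < procs; p++) row[p] = 1 + 9 * RNG.nextDouble();

              int[] best = new int[tasks];
              for (int t = 0; t < tasks; t++) best[t] = RNG.nextInt(procs);
              double bestFit = makespan(best, cost, procs);

              for (int g = 0; g < 1000; g++) {       // toy (1+1) evolution loop
                  int[] child = best.clone();
                  mutate(child, procs);
                  double fit = makespan(child, cost, procs);
                  if (fit < bestFit) { bestFit = fit; best = child; }
              }
              System.out.printf("best makespan: %.2f%n", bestFit);
          }
      }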

  4. Safe and Efficient Support for Embedded Multi-Processors in Ada

    NASA Astrophysics Data System (ADS)

    Ruiz, Jose F.

    2010-08-01

    New software demands increasing processing power, and multi-processor platforms are spreading as the answer to achieve the required performance. Embedded real-time systems are also subject to this trend, but in the case of real-time mission-critical systems, the properties of reliability, predictability and analyzability are also paramount. The Ada 2005 language defined a subset of its tasking model, the Ravenscar profile, that provides the basis for the implementation of deterministic and time analyzable applications on top of a streamlined run-time system. This Ravenscar tasking profile, originally designed for single processors, has proven remarkably useful for modelling verifiable real-time single-processor systems. This paper proposes a simple extension to the Ravenscar profile to support multi-processor systems using a fully partitioned approach. The implementation of this scheme is simple, and it can be used to develop applications amenable to schedulability analysis.

  5. Tactical Operations Analysis Support Facility.

    DTIC Science & Technology

    1981-05-01

    Fragmentary excerpts from the report: an equipment list (punch/reader, 2 DMC-11AR DDCMP microprocessors, 2 DMC-11DA network link line units, 2 DL-11E async serial line interfaces, 4 Intel IN-1670 448K-word MOS memories); contents entries for "Virtual Processors - VAX-11/750" and "A Relational Data Management System - ORACLE"; and a passage noting that the Central Processing Unit (CPU) is a 16-bit processor for high-speed, real-time applications, and for large multi-user, multi-task, time-shared ...

  6. Multi-Mission Simulation and Visualization for Real-Time Telemetry Display, Playback and EDL Event Reconstruction

    NASA Technical Reports Server (NTRS)

    Pomerantz, M. I.; Lim, C.; Myint, S.; Woodward, G.; Balaram, J.; Kuo, C.

    2012-01-01

    The Jet Propulsion Laboratory's Entry, Descent and Landing (EDL) Reconstruction Task has developed a software system that provides mission operations personnel and analysts with a real-time telemetry-based live display, playback and post-EDL reconstruction capability that leverages the existing high-fidelity, physics-based simulation framework and modern game engine-derived 3D visualization system developed in the JPL Dynamics and Real Time Simulation (DARTS) Lab. Developed as a multi-mission solution, the EDL Telemetry Visualization (ETV) system has been used for a variety of projects including NASA's Mars Science Laboratory (MSL), NASA's Low Density Supersonic Decelerator (LDSD) and JPL's MoonRise Lunar sample return proposal.

  7. Parallelising a molecular dynamics algorithm on a multi-processor workstation

    NASA Astrophysics Data System (ADS)

    Müller-Plathe, Florian

    1990-12-01

    The Verlet neighbour-list algorithm is parallelised for a multi-processor Hewlett-Packard/Apollo DN10000 workstation. The implementation makes use of memory shared between the processors. It is a genuine master-slave approach by which most of the computational tasks are kept in the master process and the slaves are only called to do part of the nonbonded forces calculation. The implementation features elements of both fine-grain and coarse-grain parallelism. Apart from three calls to library routines, two of which are standard UNIX calls, and two machine-specific language extensions, the whole code is written in standard Fortran 77. Hence, it may be expected that this parallelisation concept can be transferred in parts or as a whole to other multi-processor shared-memory computers. The parallel code is routinely used in production work.

  8. Next Generation Space Telescope Integrated Science Module Data System

    NASA Technical Reports Server (NTRS)

    Schnurr, Richard G.; Greenhouse, Matthew A.; Jurotich, Matthew M.; Whitley, Raymond; Kalinowski, Keith J.; Love, Bruce W.; Travis, Jeffrey W.; Long, Knox S.

    1999-01-01

    The data system for the Next Generation Space Telescope (NGST) Integrated Science Module (ISIM) is the primary data interface between the spacecraft, telescope, and science instrument systems. This poster includes block diagrams of the ISIM data system and its components derived during the pre-phase A Yardstick feasibility study. The poster details the hardware and software components used to acquire and process science data for the Yardstick instrument complement, and depicts the baseline external interfaces to science instruments and other systems. This baseline data system is a fully redundant, high-performance computing system. Each redundant computer contains three 150 MHz PowerPC processors. All processors execute a commercially available real-time multi-tasking operating system supporting preemptive multi-tasking, file management and network interfaces. The six processors in the system are networked together. The spacecraft interface baseline is an extension of the network which links the six processors. The final selection of processor busses, processor chips, network interfaces, and high-speed data interfaces will be made during mid-2002.

  9. Efficiency of static core turn-off in a system-on-a-chip with variation

    DOEpatents

    Cher, Chen-Yong; Coteus, Paul W; Gara, Alan; Kursun, Eren; Paulsen, David P; Schuelke, Brian A; Sheets, II, John E; Tian, Shurong

    2013-10-29

    A processor-implemented method for improving efficiency of a static core turn-off in a multi-core processor with variation, the method comprising: conducting via a simulation a turn-off analysis of the multi-core processor at the multi-core processor's design stage, wherein the turn-off analysis of the multi-core processor at the multi-core processor's design stage includes a first output corresponding to a first multi-core processor core to turn off; conducting a turn-off analysis of the multi-core processor at the multi-core processor's testing stage, wherein the turn-off analysis of the multi-core processor at the multi-core processor's testing stage includes a second output corresponding to a second multi-core processor core to turn off; comparing the first output and the second output to determine if the first output is referring to the same core to turn off as the second output; outputting a third output corresponding to the first multi-core processor core if the first output and the second output are both referring to the same core to turn off.

  10. CQPSO scheduling algorithm for heterogeneous multi-core DAG task model

    NASA Astrophysics Data System (ADS)

    Zhai, Wenzheng; Hu, Yue-Li; Ran, Feng

    2017-07-01

    Efficient task scheduling is critical to achieving high performance in a heterogeneous multi-core computing environment. The paper focuses on the heterogeneous multi-core directed acyclic graph (DAG) task model and proposes a novel task scheduling method based on an improved chaotic quantum-behaved particle swarm optimization (CQPSO) algorithm. A task priority scheduling list was built, and the processor with the minimum cumulative earliest finish time (EFT) was chosen as the target of the first task assignment. The task precedence relationships were satisfied and the total execution time of all tasks was minimized. The experimental results show that the proposed algorithm has strong optimization ability, is simple and feasible, converges quickly, and can be applied to task scheduling optimization in other heterogeneous and distributed environments.
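
    The earliest-finish-time rule at the heart of such list schedulers is easy to state in code. The sketch below is a generic EFT computation for a DAG task on a candidate processor, assuming per-task costs and inter-processor communication delays; it is not taken from the paper, and all names are illustrative.

      // Generic earliest-finish-time (EFT) computation for one task on one
      // candidate processor; all arrays and names are illustrative.
      public class EftRule {
          // finish[d] = finish time of predecessor d; onProc[d] = where it ran;
          // comm[d][t] = transfer delay from d to t when on different processors.
          static double eft(int task, int proc, int[] preds, double[] finish,
                            int[] onProc, double[][] comm, double procFree,
                            double[][] cost) {
              double ready = 0;
              for (int d : preds) {
                  // communication cost applies only across different processors
                  double arrive = finish[d] + (onProc[d] == proc ? 0 : comm[d][task]);
                  ready = Math.max(ready, arrive);
              }
              return Math.max(ready, procFree) + cost[task][proc];
          }

          public static void main(String[] args) {
              double[][] cost = { { 3, 5 }, { 4, 2 } };  // cost[task][proc]
              double[][] comm = { { 0, 1 }, { 0, 0 } };  // comm[pred][task]
              // task 1 on proc 1; its predecessor 0 finished at t=3 on proc 0
              System.out.println(eft(1, 1, new int[] { 0 }, new double[] { 3, 0 },
                                     new int[] { 0 }, comm, 0.0, cost)); // 6.0
          }
      }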

  11. Energy Efficient Real-Time Scheduling Using DPM on Mobile Sensors with a Uniform Multi-Cores

    PubMed Central

    Kim, Youngmin; Lee, Chan-Gun

    2017-01-01

    In wireless sensor networks (WSNs), sensor nodes are deployed for collecting and analyzing data. These nodes use limited-energy batteries for easy deployment and low cost. The use of limited-energy batteries is closely related to the lifetime of the sensor nodes, so efficient energy management is important to extending that lifetime. Most effort to improve power efficiency in tiny sensor nodes has focused mainly on reducing the power consumed during data transmission. However, the recent emergence of sensor nodes equipped with multi-cores strongly requires attention to the problem of reducing power consumption in multi-cores. In this paper, we propose an energy-efficient scheduling method for sensor nodes with uniform multi-cores. We extend T-Ler plane-based scheduling, a global optimal scheduling technique for uniform multi-cores and multi-processors, to enable power management using dynamic power management. In the proposed approach, a processor selection and task-to-processor mapping method is proposed to efficiently utilize dynamic power management. Experiments show the effectiveness of the proposed approach compared to other existing methods. PMID:29240695

  12. Using all of your CPU's in HIPE

    NASA Astrophysics Data System (ADS)

    Jacobson, J. D.; Fadda, D.

    2012-09-01

    Modern computer architectures increasingly feature multi-core CPUs. For example, the MacBook Pro features an Intel quad-core i7 processor. Through the use of hyper-threading, where each core can execute two threads simultaneously, the quad-core i7 can support eight simultaneous processing threads. All this on your laptop! This CPU power can now be put into service by scientists to perform data reduction tasks, but only if the software has been designed to take advantage of the multiple-processor architectures. Up to now, software written for Herschel data reduction (HIPE), written in Jython and Java, has been single-threaded and can only utilize a single processor. Users of HIPE do not get any advantage from the additional processors. Why not put all of the CPU resources to work reducing your data? We present a multi-threaded software application that corrects long-term transients in the signal from the PACS unchopped spectroscopy line scan mode. In this poster, we present a multi-threaded software framework to achieve performance improvements from parallel execution. We show how a task to correct transients in the PACS spectroscopy pipeline for the un-chopped line scan mode has been threaded. This computation-intensive task uses either a one-parameter or a three-parameter exponential function to characterize the transient. The task uses a Java implementation of Minpack, translated from the C (Moshier) and IDL (Markwardt) versions by the authors, to optimize the correction parameters. We also explain how to determine if a task can benefit from threading (Amdahl's Law), and if it is safe to thread. The design and implementation, using the Java concurrency package's CompletionService, is described. Pitfalls, timing bugs, thread safety, resource control, testing and performance improvements are described and plotted.
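
    The threading pattern the poster describes maps naturally onto java.util.concurrent. The sketch below shows the CompletionService idiom for farming independent per-pixel fits out to a thread pool and collecting results as they finish; the fit itself is a placeholder, and the pixel count and method names are assumptions, not the HIPE task's actual code.

      import java.util.concurrent.*;

      // One independent transient fit per detector pixel, farmed out to a
      // pool and collected in completion order.
      public class TransientCorrection {
          public static void main(String[] args) throws Exception {
              int cores = Runtime.getRuntime().availableProcessors();
              ExecutorService pool = Executors.newFixedThreadPool(cores);
              CompletionService<double[]> cs = new ExecutorCompletionService<>(pool);

              int nPixels = 25;                         // 5x5 spaxels (illustrative)
              for (int i = 0; i < nPixels; i++) {
                  final int pixel = i;
                  cs.submit(() -> fitTransient(pixel)); // fits are independent
              }
              for (int i = 0; i < nPixels; i++) {
                  double[] params = cs.take().get();    // next finished fit
                  // ... apply the correction with the fitted parameters ...
              }
              pool.shutdown();
          }

          // Placeholder for the Minpack-based exponential fit in the abstract.
          static double[] fitTransient(int pixel) {
              return new double[] { 1.0, 0.1, 0.01 };   // dummy parameters
          }
      }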

  13. Phoenix Telemetry Processor

    NASA Technical Reports Server (NTRS)

    Stanboli, Alice

    2013-01-01

    Phxtelemproc is a C/C++ based telemetry processing program that processes SFDU telemetry packets from the Telemetry Data System (TDS). It generates Experiment Data Records (EDRs) for several instruments including surface stereo imager (SSI); robotic arm camera (RAC); robotic arm (RA); microscopy, electrochemistry, and conductivity analyzer (MECA); and the optical microscope (OM). It processes both uncompressed and compressed telemetry, and incorporates unique subroutines for the following compression algorithms: JPEG Arithmetic, JPEG Huffman, Rice, LUT3, RA, and SX4. This program was in the critical path for the daily command cycle of the Phoenix mission. The products generated by this program were part of the RA commanding process, as well as the SSI, RAC, OM, and MECA image and science analysis process. Its output products were used to advance science of the near polar regions of Mars, and were used to prove that water is found in abundance there. Phxtelemproc is part of the MIPL (Multi-mission Image Processing Laboratory) system. This software produced Level 1 products used to analyze images returned by in situ spacecraft. It ultimately assisted in operations, planning, commanding, science, and outreach.

  14. Multi-Core Processor Memory Contention Benchmark Analysis Case Study

    NASA Technical Reports Server (NTRS)

    Simon, Tyler; McGalliard, James

    2009-01-01

    Multi-core processors dominate current mainframe, server, and high performance computing (HPC) systems. This paper provides synthetic kernel and natural benchmark results from an HPC system at the NASA Goddard Space Flight Center that illustrate the performance impacts of multi-core (dual- and quad-core) vs. single core processor systems. Analysis of processor design, application source code, and synthetic and natural test results all indicate that multi-core processors can suffer from significant memory subsystem contention compared to similar single-core processors.
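
    A synthetic kernel of the kind used in such studies can be very small. The hedged sketch below streams large private arrays from several threads at once; if memory bandwidth is the shared bottleneck, elapsed time grows with the thread count instead of staying flat. Array sizes and thread counts are placeholders, not the benchmark used in the paper.

      // Each thread sweeps its own large array; contention shows up as elapsed
      // time rising with the thread count on a shared memory subsystem.
      public class StreamContention {
          public static void main(String[] args) throws InterruptedException {
              int threads = args.length > 0 ? Integer.parseInt(args[0]) : 4;
              final int n = 1 << 23;                    // 8M doubles (~64 MB) per thread
              Thread[] workers = new Thread[threads];
              long start = System.nanoTime();
              for (int t = 0; t < threads; t++) {
                  workers[t] = new Thread(() -> {
                      double[] a = new double[n];
                      double sum = 0;
                      for (int i = 0; i < n; i++) sum += a[i]; // bandwidth-bound sweep
                      if (sum != 0) System.out.println(sum);   // defeat dead-code removal
                  });
                  workers[t].start();
              }
              for (Thread w : workers) w.join();
              System.out.printf("%d threads: %.2f s%n",
                                threads, (System.nanoTime() - start) / 1e9);
          }
      }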

  15. Implementing the PM Programming Language using MPI and OpenMP - a New Tool for Programming Geophysical Models on Parallel Systems

    NASA Astrophysics Data System (ADS)

    Bellerby, Tim

    2015-04-01

    PM (Parallel Models) is a new parallel programming language specifically designed for writing environmental and geophysical models. The language is intended to enable implementers to concentrate on the science behind the model rather than the details of running on parallel hardware. At the same time PM leaves the programmer in control - all parallelisation is explicit and the parallel structure of any given program may be deduced directly from the code. This paper describes a PM implementation based on the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) standards, looking at issues involved with translating the PM parallelisation model to MPI/OpenMP protocols and considering performance in terms of the competing factors of finer-grained parallelisation and increased communication overhead. In order to maximise portability, the implementation stays within the MPI 1.3 standard as much as possible, with MPI-2 MPI-IO file handling the only significant exception. Moreover, it does not assume a thread-safe implementation of MPI. PM adopts a two-tier abstract representation of parallel hardware. A PM processor is a conceptual unit capable of efficiently executing a set of language tasks, with a complete parallel system consisting of an abstract N-dimensional array of such processors. PM processors may map to single cores executing tasks using cooperative multi-tasking, to multiple cores or even to separate processing nodes, efficiently sharing tasks using algorithms such as work stealing. While tasks may move between hardware elements within a PM processor, they may not move between processors without specific programmer intervention. Tasks are assigned to processors using a nested parallelism approach, building on ideas from Reyes et al. (2009). The main program owns all available processors. When the program enters a parallel statement then either processors are divided out among the newly generated tasks (number of new tasks < number of processors) or tasks are divided out among the available processors (number of tasks > number of processors). Nested parallel statements may further subdivide the processor set owned by a given task. Tasks or processors are distributed evenly by default, but uneven distributions are possible under programmer control. It is also possible to explicitly enable child tasks to migrate within the processor set owned by their parent task, reducing load imbalance at the potential cost of increased inter-processor message traffic. PM incorporates some programming structures from the earlier MIST language presented at a previous EGU General Assembly, while adopting a significantly different underlying parallelisation model and type system. PM code is available at www.pm-lang.org under an unrestrictive MIT license. Reference: Ruymán Reyes, Antonio J. Dorta, Francisco Almeida, Francisco de Sande, 2009. Automatic Hybrid MPI+OpenMP Code Generation with llc, Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science Volume 5759, 185-195
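
    A minimal sketch of the dividing rule described above (not the PM implementation): when a parallel statement creates fewer tasks than the processors its parent owns, the processors are shared out evenly; otherwise each task gets a cooperative share of one processor. All names are invented for illustration.

      // procsFor[i] = processors owned by task i under the rule sketched above.
      public class ProcessorSplit {
          static int[] divide(int numProcs, int numTasks) {
              int[] procsFor = new int[numTasks];
              if (numTasks <= numProcs) {
                  // fewer tasks than processors: share the processors out evenly
                  for (int i = 0; i < numTasks; i++)
                      procsFor[i] = numProcs / numTasks
                                  + (i < numProcs % numTasks ? 1 : 0);
              } else {
                  // more tasks than processors: tasks share processors cooperatively
                  for (int i = 0; i < numTasks; i++) procsFor[i] = 1;
              }
              return procsFor;
          }

          public static void main(String[] args) {
              System.out.println(java.util.Arrays.toString(divide(8, 3))); // [3, 3, 2]
          }
      }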

  16. Interactive high-resolution isosurface ray casting on multicore processors.

    PubMed

    Wang, Qin; JaJa, Joseph

    2008-01-01

    We present a new method for the interactive rendering of isosurfaces using ray casting on multi-core processors. This method consists of a combination of an object-order traversal that coarsely identifies possible candidate 3D data blocks for each small set of contiguous pixels, and an isosurface ray casting strategy tailored for the resulting limited-size lists of candidate 3D data blocks. While static screen partitioning is widely used in the literature, our scheme performs dynamic allocation of groups of ray casting tasks to ensure almost equal loads among the different threads running on multi-cores while maintaining spatial locality. We also make careful use of the memory management environment commonly present in multi-core processors. We test our system on a two-processor Clovertown platform, each processor consisting of a quad-core 1.86-GHz Intel Xeon, for a number of widely different benchmarks. The detailed experimental results show that our system is efficient and scalable, and achieves high cache performance and excellent load balancing, resulting in an overall performance that is superior to any of the previous algorithms. In fact, we achieve interactive isosurface rendering on a 1024×1024 screen for all the datasets tested up to the maximum size of the main memory of our platform.
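
    The dynamic allocation of ray-casting task groups can be pictured as a shared work counter from which idle threads pull the next tile of contiguous pixels, in contrast to a static screen partition. The sketch below is a generic version of that idea, not the authors' code; tile counts and names are illustrative.

      import java.util.concurrent.atomic.AtomicInteger;

      // Threads pull the next pixel tile from a shared counter, so threads
      // that finish cheap tiles automatically pick up more work.
      public class TileScheduler {
          public static void main(String[] args) throws InterruptedException {
              final int tiles = 1024;                  // screen split into small tiles
              final AtomicInteger next = new AtomicInteger(0);
              int threads = Runtime.getRuntime().availableProcessors();
              Thread[] pool = new Thread[threads];
              for (int t = 0; t < threads; t++) {
                  pool[t] = new Thread(() -> {
                      int tile;
                      while ((tile = next.getAndIncrement()) < tiles)
                          renderTile(tile);            // per-tile cost varies widely
                  });
                  pool[t].start();
              }
              for (Thread t : pool) t.join();
          }

          static void renderTile(int tile) { /* cast rays for this tile's pixels */ }
      }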

  17. CMS Readiness for Multi-Core Workload Scheduling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perez-Calero Yzquierdo, A.; Balcas, J.; Hernandez, J.

    In the present run of the LHC, CMS data reconstruction and simulation algorithms benefit greatly from being executed as multiple threads running on several processor cores. The complexity of the Run 2 events requires parallelization of the code to reduce the memory-per-core footprint constraining serial execution programs, thus optimizing the exploitation of present multi-core processor architectures. The allocation of computing resources for multi-core tasks, however, becomes a complex problem in itself. The CMS workload submission infrastructure employs multi-slot partitionable pilots, built on HTCondor and GlideinWMS native features, to enable scheduling of single and multi-core jobs simultaneously. This provides a solution for the scheduling problem in a uniform way across grid sites running a diversity of gateways to compute resources and batch system technologies. This paper presents this strategy and the tools on which it has been implemented. The experience of managing multi-core resources at the Tier-0 and Tier-1 sites during 2015, along with the deployment phase to Tier-2 sites during early 2016, is reported. The process of performance monitoring and optimization to achieve efficient and flexible use of the resources is also described.

  18. CMS readiness for multi-core workload scheduling

    NASA Astrophysics Data System (ADS)

    Perez-Calero Yzquierdo, A.; Balcas, J.; Hernandez, J.; Aftab Khan, F.; Letts, J.; Mason, D.; Verguilov, V.

    2017-10-01

    In the present run of the LHC, CMS data reconstruction and simulation algorithms benefit greatly from being executed as multiple threads running on several processor cores. The complexity of the Run 2 events requires parallelization of the code to reduce the memory-per-core footprint constraining serial execution programs, thus optimizing the exploitation of present multi-core processor architectures. The allocation of computing resources for multi-core tasks, however, becomes a complex problem in itself. The CMS workload submission infrastructure employs multi-slot partitionable pilots, built on HTCondor and GlideinWMS native features, to enable scheduling of single and multi-core jobs simultaneously. This provides a solution for the scheduling problem in a uniform way across grid sites running a diversity of gateways to compute resources and batch system technologies. This paper presents this strategy and the tools on which it has been implemented. The experience of managing multi-core resources at the Tier-0 and Tier-1 sites during 2015, along with the deployment phase to Tier-2 sites during early 2016, is reported. The process of performance monitoring and optimization to achieve efficient and flexible use of the resources is also described.

  19. Using Multi-Core Systems for Rover Autonomy

    NASA Technical Reports Server (NTRS)

    Clement, Brad; Estlin, Tara; Bornstein, Benjamin; Springer, Paul; Anderson, Robert C.

    2010-01-01

    Task objectives are: (1) Develop and demonstrate key capabilities for rover long-range science operations using multi-core computing: (a) adapt three rover technologies to execute on a SOA multi-core processor, (b) illustrate the performance improvements achieved, (c) demonstrate the adapted capabilities with rover hardware; (2) Target three high-level autonomy technologies: (a) two for onboard data analysis, (b) one for onboard command sequencing/planning; (3) Technologies identified as enabling for future missions; (4) Benefits will be measured along several metrics: (a) execution time / power requirements, (b) number of data products processed per unit time, (c) solution quality.

  20. Application of Advanced Multi-Core Processor Technologies to Oceanographic Research

    DTIC Science & Technology

    2013-09-30

    Fragmentary excerpts from the report: a survey table of candidate processors (STM32, NXP LPC series, Microchip PIC32/dsPIC, ARM Cortex, TI OMAP, TI Sitara, Broadcom BCM2835, FPGAs; power classes such as > 500 mW and < 5 W) ... state-of-the-art information processing architectures. OBJECTIVES: Next-generation processor architectures (multi-core, multi-threaded) hold the ...

  1. System, methods and apparatus for program optimization for multi-threaded processor architectures

    DOEpatents

    Bastoul, Cedric; Lethin, Richard A; Leung, Allen K; Meister, Benoit J; Szilagyi, Peter; Vasilache, Nicolas T; Wohlford, David E

    2015-01-06

    Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for parallelism, locality of operations and contiguity of memory accesses on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.

  2. Methods and Apparatus for Aggregation of Multiple Pulse Code Modulation Channels into a Signal Time Division Multiplexing Stream

    NASA Technical Reports Server (NTRS)

    Chang, Chen J. (Inventor); Liaghati, Jr., Amir L. (Inventor); Liaghati, Mahsa L. (Inventor)

    2018-01-01

    Methods and apparatus are provided for telemetry processing using a telemetry processor. The telemetry processor can include a plurality of communications interfaces, a computer processor, and data storage. The telemetry processor can buffer sensor data by: receiving a frame of sensor data using a first communications interface and clock data using a second communications interface, receiving an end of frame signal using a third communications interface, and storing the received frame of sensor data in the data storage. After buffering the sensor data, the telemetry processor can generate an encapsulated data packet including a single encapsulated data packet header, the buffered sensor data, and identifiers identifying telemetry devices that provided the sensor data. A format of the encapsulated data packet can comply with a Consultative Committee for Space Data Systems (CCSDS) standard. The telemetry processor can send the encapsulated data packet using a fourth and a fifth communications interface.
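
    The aggregation step can be pictured as follows: buffered frames from several telemetry devices leave under a single encapsulation header with identifying fields. The sketch below invents a byte layout purely for illustration; it is not the CCSDS-specified format, and the field names are assumptions.

      import java.io.ByteArrayOutputStream;
      import java.io.IOException;
      import java.util.List;

      // Buffered frames leave under one header; this byte layout is invented
      // for illustration and is NOT the CCSDS format.
      public class Encapsulator {
          static byte[] encapsulate(int deviceId, List<byte[]> frames)
                  throws IOException {
              ByteArrayOutputStream out = new ByteArrayOutputStream();
              out.write(0xE0);                        // hypothetical version/type octet
              out.write(deviceId & 0xFF);             // identifies the telemetry device
              out.write(frames.size() & 0xFF);        // number of aggregated frames
              for (byte[] f : frames) {
                  out.write((f.length >> 8) & 0xFF);  // 16-bit length prefix per frame
                  out.write(f.length & 0xFF);
                  out.write(f);
              }
              return out.toByteArray();
          }
      }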

  3. LANDSAT-D flight segment operations manual. Appendix B: OBC software operations

    NASA Technical Reports Server (NTRS)

    Talipsky, R.

    1981-01-01

    The LANDSAT 4 satellite contains two NASA standard spacecraft computers and 65,536 words of memory. Onboard computer software is divided into flight executive and applications processors. Both applications processors and the flight executive use one or more of 67 system tables to obtain variables, constants, and software flags. Output from the software for monitoring operation is via 49 OBC telemetry reports subcommutated in the spacecraft telemetry. Information is provided about the flight software as it is used to control the various spacecraft operations and interpret operational OBC telemetry. Processor function descriptions, processor operation, software constraints, processor system tables, processor telemetry, and processor flow charts are presented.

  4. Printed Multi-Turn Loop Antenna for RF Bio-Telemetry

    NASA Technical Reports Server (NTRS)

    Simons, Rainee N.; Hall, David G.; Miranda, Felix A.

    2004-01-01

    In this paper, a novel printed multi-turn loop antenna for contact-less powering and RF telemetry from implantable bio-MEMS sensors at a design frequency of 300 MHz is demonstrated. In addition, computed values of input reactance, radiation resistance, skin effect resistance, and radiation efficiency for the printed multi-turn loop antenna are presented. The computed input reactance is compared with the measured values and shown to be in fair agreement. The computed radiation efficiency at the design frequency is about 24 percent.

  5. Configurable Multi-Purpose Processor

    NASA Technical Reports Server (NTRS)

    Valencia, J. Emilio; Forney, Chirstopher; Morrison, Robert; Birr, Richard

    2010-01-01

    Advancements in technology have allowed the miniaturization of systems used in aerospace vehicles. This technology is driven by the need for next-generation systems that provide reliable, responsive, and cost-effective range operations while providing increased capabilities such as simultaneous mission support, increased launch trajectories, improved launch and landing opportunities, etc. Leveraging the newest technologies, the command and telemetry processor (CTP) concept provides a compact, flexible, and integrated solution for flight command and telemetry systems and range systems. The CTP is a relatively small circuit board that serves as a processing platform for high-dynamic, high-vibration environments. The CTP can be reconfigured and reprogrammed, allowing it to be adapted for many different applications. The design is centered around a configurable field-programmable gate array (FPGA) device that contains numerous logic cells that can be used to implement traditional integrated circuits. The FPGA contains two PowerPC processors that run the VxWorks real-time operating system and execute software programs specific to each application. The CTP was designed and developed specifically to provide telemetry functions; namely, the command processing, telemetry processing, and GPS metric tracking of a flight vehicle. However, it can be used as a general-purpose processor board to perform numerous functions implemented in either hardware or software using the FPGA's processors and/or logic cells. Functionally, the CTP was designed for range safety applications where it would ultimately become part of a vehicle's flight termination system. Consequently, the major functions of the CTP are to perform the forward link command processing, GPS metric tracking, return link telemetry data processing, error detection and correction, data encryption/decryption, and initiation of flight termination action commands. Also, the CTP had to be designed to survive and operate in a launch environment. Additionally, the CTP was designed to interface with the WFF (Wallops Flight Facility) custom-designed transceiver board which is used in the Low Cost TDRSS Transceiver (LCT2), also developed by WFF. The LCT2's transceiver board demodulates commands received from the ground via the forward link and sends them to the CTP, where they are processed. The CTP inputs and processes data from the inertial measurement unit (IMU) and the GPS receiver board, generates status data, and then sends the data to the transceiver board where it is modulated and sent to the ground via the return link. Overall, the CTP combines processing with the ability to interface to a GPS receiver, an IMU, and a pulse code modulation (PCM) communication link, while providing the capability to support common interfaces including Ethernet and serial interfaces, in a relatively small, lightweight package.

  6. Performance evaluation of throughput computing workloads using multi-core processors and graphics processors

    NASA Astrophysics Data System (ADS)

    Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.

    2017-11-01

    The current trend in processor manufacturing focuses on multi-core architectures rather than increasing the clock speed for performance improvement. Graphics processors have become commodity hardware for providing fast co-processing in computer systems. Developments in IoT, social networking web applications, and big data have created huge demand for data processing activities, and such throughput-intensive applications inherently contain data-level parallelism, which is better suited to SIMD-architecture-based GPUs. This paper reviews the architectural aspects of multi/many-core processors and graphics processors. Different case studies are taken to compare the performance of throughput computing applications using shared-memory programming in OpenMP and CUDA API based programming.

  7. Network Coding on Heterogeneous Multi-Core Processors for Wireless Sensor Networks

    PubMed Central

    Kim, Deokho; Park, Karam; Ro, Won W.

    2011-01-01

    While network coding is well known for its efficiency and usefulness in wireless sensor networks, the excessive costs associated with decoding computation and complexity still hinder its adoption into practical use. On the other hand, high-performance microprocessors with heterogeneous multi-cores would be used as processing nodes of the wireless sensor networks in the near future. To this end, this paper introduces an efficient network coding algorithm developed for heterogeneous multi-core processors. The proposed idea is fully tested on one of the currently available heterogeneous multi-core processors referred to as the Cell Broadband Engine. PMID:22164053

  8. Data Telemetry and Acquisition System for Acoustic Signal Processing Investigations.

    DTIC Science & Technology

    1996-02-20

    ... were VME-based computer systems operating under the VxWorks real-time operating system. Each system shared a common hardware and software ... real-time operating system. It interfaces to the Berg PCM Decommutator board, which searches for the embedded synchronization word in the data and re... software were built on top of this architecture. The multi-tasking, message queue and memory management facilities of the VxWorks real-time operating system are ...

  9. Implementing TCP/IP and a socket interface as a server in a message-passing operating system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hipp, E.; Wiltzius, D.

    1990-03-01

    The UNICOS 4.3BSD network code and socket transport interface are the basis of an explicit network server for NLTSS, a message passing operating system on the Cray YMP. A BSD socket user library provides access to the network server using an RPC mechanism. The advantages of this server methodology are its modularity and extensibility to migrate to future protocol suites (e.g. OSI) and transport interfaces. In addition, the network server is implemented in an explicit multi-tasking environment to take advantage of the Cray YMP multi-processor platform. 19 refs., 5 figs.

  10. T-L Plane Abstraction-Based Energy-Efficient Real-Time Scheduling for Multi-Core Wireless Sensors.

    PubMed

    Kim, Youngmin; Lee, Ki-Seong; Pham, Ngoc-Son; Lee, Sun-Ro; Lee, Chan-Gun

    2016-07-08

    Energy efficiency is considered as a critical requirement for wireless sensor networks. As more wireless sensor nodes are equipped with multi-cores, there are emerging needs for energy-efficient real-time scheduling algorithms. The T-L plane-based scheme is known to be an optimal global scheduling technique for periodic real-time tasks on multi-cores. Unfortunately, there has been a scarcity of studies on extending T-L plane-based scheduling algorithms to exploit energy-saving techniques. In this paper, we propose a new T-L plane-based algorithm enabling energy-efficient real-time scheduling on multi-core sensor nodes with dynamic power management (DPM). Our approach addresses the overhead of processor mode transitions and reduces fragmentations of the idle time, which are inherent in T-L plane-based algorithms. Our experimental results show the effectiveness of the proposed algorithm compared to other energy-aware scheduling methods on T-L plane abstraction.
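
    DPM schedulers of this kind typically rest on a break-even test: a core should enter a sleep state only when the predicted idle interval is long enough to amortize the mode-transition overhead the abstract mentions. The sketch below states that standard rule with illustrative parameters; it is not the paper's algorithm.

      // Break-even test: sleep only if the energy saved while idle exceeds
      // the energy cost of switching modes. Values are illustrative.
      public class DpmPolicy {
          static boolean shouldSleep(double idleMs, double transitionMs,
                                     double activeMw, double sleepMw) {
              double saved = (idleMs - transitionMs) * (activeMw - sleepMw);
              double cost  = transitionMs * activeMw;
              return idleMs > transitionMs && saved > cost;
          }

          public static void main(String[] args) {
              System.out.println(shouldSleep(50, 5, 300, 30)); // long gap  -> true
              System.out.println(shouldSleep(6, 5, 300, 30));  // short gap -> false
          }
      }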

  11. Multitask neurovision processor with extensive feedback and feedforward connections

    NASA Astrophysics Data System (ADS)

    Gupta, Madan M.; Knopf, George K.

    1991-11-01

    A multi-task neuro-vision processor which performs a variety of information processing operations associated with the early stages of biological vision is presented. The network architecture of this neuro-vision processor, called the positive-negative (PN) neural processor, is loosely based on the neural activity fields exhibited by thalamic and cortical nervous tissue layers. The computational operation performed by the processor arises from the strength of the recurrent feedback among the numerous positive and negative neural computing units. By adjusting the feedback connections it is possible to generate diverse dynamic behavior that may be used for short-term visual memory (STVM), spatio-temporal filtering (STF), and pulse frequency modulation (PFM). The information attributes that are to be processed may be regulated by modifying the feedforward connections from the signal space to the neural processor.

  12. Multi-level Hierarchical Poly Tree computer architectures

    NASA Technical Reports Server (NTRS)

    Padovan, Joe; Gute, Doug

    1990-01-01

    Based on the concept of hierarchical substructuring, this paper develops an optimal multi-level Hierarchical Poly Tree (HPT) parallel computer architecture scheme which is applicable to the solution of finite element and difference simulations. Emphasis is given to minimizing computational effort, in-core/out-of-core memory requirements, and the data transfer between processors. In addition, a simplified communications network that reduces the number of I/O channels between processors is presented. HPT configurations that yield optimal superlinearities are also demonstrated. Moreover, to generalize the scope of applicability, special attention is given to developing: (1) multi-level reduction trees which provide an orderly/optimal procedure by which model densification/simplification can be achieved, as well as (2) methodologies enabling processor grading that yields architectures with varying types of multi-level granularity.

  13. Parallel evolution of image processing tools for multispectral imagery

    NASA Astrophysics Data System (ADS)

    Harvey, Neal R.; Brumby, Steven P.; Perkins, Simon J.; Porter, Reid B.; Theiler, James P.; Young, Aaron C.; Szymanski, John J.; Bloch, Jeffrey J.

    2000-11-01

    We describe the implementation and performance of a parallel, hybrid evolutionary-algorithm-based system, which optimizes image processing tools for feature-finding tasks in multi-spectral imagery (MSI) data sets. Our system uses an integrated spatio-spectral approach and is capable of combining suitably-registered data from different sensors. We investigate the speed-up obtained by parallelization of the evolutionary process via multiple processors (a workstation cluster) and develop a model for prediction of run-times for different numbers of processors. We demonstrate our system on Landsat Thematic Mapper MSI, covering the recent Cerro Grande fire at Los Alamos, NM, USA.
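
    A run-time prediction model of the kind mentioned above is often an Amdahl-style fit to measured timings. The sketch below shows one such form, with a serial term, a perfectly parallel term, and a per-processor communication overhead; both the functional form and the coefficients are assumptions for illustration, not the authors' fitted model.

      // T(p) = serial part + parallel part / p + per-processor overhead * p.
      public class RuntimeModel {
          static double predict(double tSerial, double tParallel, double cComm, int p) {
              return tSerial + tParallel / p + cComm * p;
          }

          public static void main(String[] args) {
              for (int p = 1; p <= 16; p *= 2)         // workstation-cluster sizes
                  System.out.printf("p=%2d  T=%6.1f s%n", p, predict(10, 300, 0.5, p));
          }
      }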

  14. Equalizer: a scalable parallel rendering framework.

    PubMed

    Eilemann, Stefan; Makhinya, Maxim; Pajarola, Renato

    2009-01-01

    Continuing improvements in CPU and GPU performances as well as increasing multi-core processor and cluster-based parallelism demand flexible and scalable parallel rendering solutions that can exploit multipipe hardware accelerated graphics. In fact, to achieve interactive visualization, scalable rendering systems are essential to cope with the rapid growth of data sets. However, parallel rendering systems are non-trivial to develop and often only application-specific implementations have been proposed. The task of developing a scalable parallel rendering framework is even more difficult if it should be generic to support various types of data and visualization applications, and at the same time work efficiently on a cluster with distributed graphics cards. In this paper we introduce a novel system called Equalizer, a toolkit for scalable parallel rendering based on OpenGL which provides an application programming interface (API) to develop scalable graphics applications for a wide range of systems ranging from large distributed visualization clusters and multi-processor multipipe graphics systems to single-processor single-pipe desktop machines. We describe the system architecture and the basic API, discuss its advantages over previous approaches, and present example configurations and usage scenarios as well as scalability results.

  15. Dynamic Voltage-Frequency and Workload Joint Scaling Power Management for Energy Harvesting Multi-Core WSN Node SoC

    PubMed Central

    Li, Xiangyu; Xie, Nijie; Tian, Xinyue

    2017-01-01

    This paper proposes a scheduling and power management solution for an energy harvesting heterogeneous multi-core WSN node SoC such that the system continues to operate perennially and uses the harvested energy efficiently. The solution consists of a heterogeneous multi-core system oriented task scheduling algorithm and a low-complexity dynamic workload scaling and configuration optimization algorithm suitable for light-weight platforms. Moreover, considering that the power consumption of most WSN applications has the characteristic of data-dependent behavior, we introduce a branch-handling mechanism into the solution as well. The experimental results show that the proposed algorithm can operate in real-time on a lightweight embedded processor (MSP430), and that it can make a system do more valuable work and use more than 99.9% of the power budget. PMID:28208730

  16. Dynamic Voltage-Frequency and Workload Joint Scaling Power Management for Energy Harvesting Multi-Core WSN Node SoC.

    PubMed

    Li, Xiangyu; Xie, Nijie; Tian, Xinyue

    2017-02-08

    This paper proposes a scheduling and power management solution for an energy harvesting heterogeneous multi-core WSN node SoC such that the system continues to operate perennially and uses the harvested energy efficiently. The solution consists of a heterogeneous multi-core system oriented task scheduling algorithm and a low-complexity dynamic workload scaling and configuration optimization algorithm suitable for light-weight platforms. Moreover, considering that the power consumption of most WSN applications has the characteristic of data-dependent behavior, we introduce a branch-handling mechanism into the solution as well. The experimental results show that the proposed algorithm can operate in real-time on a lightweight embedded processor (MSP430), and that it can make a system do more valuable work and use more than 99.9% of the power budget.

  17. T-L Plane Abstraction-Based Energy-Efficient Real-Time Scheduling for Multi-Core Wireless Sensors

    PubMed Central

    Kim, Youngmin; Lee, Ki-Seong; Pham, Ngoc-Son; Lee, Sun-Ro; Lee, Chan-Gun

    2016-01-01

    Energy efficiency is considered as a critical requirement for wireless sensor networks. As more wireless sensor nodes are equipped with multi-cores, there are emerging needs for energy-efficient real-time scheduling algorithms. The T-L plane-based scheme is known to be an optimal global scheduling technique for periodic real-time tasks on multi-cores. Unfortunately, there has been a scarcity of studies on extending T-L plane-based scheduling algorithms to exploit energy-saving techniques. In this paper, we propose a new T-L plane-based algorithm enabling energy-efficient real-time scheduling on multi-core sensor nodes with dynamic power management (DPM). Our approach addresses the overhead of processor mode transitions and reduces fragmentations of the idle time, which are inherent in T-L plane-based algorithms. Our experimental results show the effectiveness of the proposed algorithm compared to other energy-aware scheduling methods on T-L plane abstraction. PMID:27399722

  18. Multi-Objective Optimization for Trustworthy Tactical Networks: A Survey and Insights

    DTIC Science & Technology

    2013-06-01

    Fragmentary excerpts from the report: ... problems: using repeated cooperative games [12], hedonic games [25], and nontransferable utility cooperative games [27]. It should be noted that trust ... examined an optimal task allocation problem in a distributed computing system where program modules need to be allocated to different processors to ...

  19. Options for Parallelizing a Planning and Scheduling Algorithm

    NASA Technical Reports Server (NTRS)

    Clement, Bradley J.; Estlin, Tara A.; Bornstein, Benjamin D.

    2011-01-01

    Space missions have a growing interest in putting multi-core processors onboard spacecraft. For many missions, limited processing power significantly slows operations. We investigate how continual planning and scheduling algorithms can exploit multi-core processing and outline different potential design decisions for a parallelized planning architecture. This organization of choices and challenges helps us with an initial design for parallelizing the CASPER planning system for a mesh multi-core processor. This work extends that presented at another workshop with some preliminary results.

  20. Switch for serial or parallel communication networks

    DOEpatents

    Crosette, D.B.

    1994-07-19

    A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high-density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination. 9 figs.
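
    The phase-driven behavior described above can be summarized as a three-state configuration routine, paraphrased below from the abstract rather than taken from the patent's claims; the class and method names are invented for illustration.

      // Switch couplings per operational phase, paraphrased from the abstract.
      public class CommSwitch {
          enum Phase { DATA_ACQUISITION, DATA_PROCESSING, DATA_EXCHANGE }

          void configure(Phase phase) {
              switch (phase) {
                  case DATA_ACQUISITION:
                      coupleSensorsTo("processorNode", "downlink"); // either or both
                      break;
                  case DATA_PROCESSING:
                      // switch idle: the processor node works on locally held data
                      break;
                  case DATA_EXCHANGE:
                      couple("uplinkOrProcessorNode", "downlinkDestination");
                      break;
              }
          }

          void coupleSensorsTo(String... sinks) { /* set internal multi-state switches */ }
          void couple(String src, String dst)   { /* set internal multi-state switches */ }
      }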

  1. Switch for serial or parallel communication networks

    DOEpatents

    Crosette, Dario B.

    1994-01-01

    A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high-density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination.

  2. Virtual Machine Language 2.1

    NASA Technical Reports Server (NTRS)

    Riedel, Joseph E.; Grasso, Christopher A.

    2012-01-01

    VML (Virtual Machine Language) is an advanced computing environment that allows spacecraft to operate using mechanisms ranging from simple, time-oriented sequencing to advanced, multicomponent reactive systems. VML has developed in four evolutionary stages. VML 0 is a core execution capability providing multi-threaded command execution, integer data types, and rudimentary branching. VML 1 added named parameterized procedures, extensive polymorphism, data typing, branching, looping, issuance of commands using run-time parameters, and named global variables. VML 2 added for loops, data verification, telemetry reaction, and an open flight adaptation architecture. VML 2.1 contains major advances in control flow capabilities for executable state machines. On the resource requirements front, VML 2.1 features a reduced memory footprint in order to fit more capability into modestly sized flight processors, and endian-neutral data access for compatibility with Intel little-endian processors. Sequence packaging has been improved with object-oriented programming constructs and the use of implicit (rather than explicit) time tags on statements. Sequence event detection has been significantly enhanced with multi-variable waiting, which allows a sequence to detect and react to conditions defined by complex expressions with multiple global variables. This multi-variable waiting serves as the basis for implementing parallel rule checking, which, in turn, makes possible executable state machines. The new state machine feature in VML 2.1 allows the creation of sophisticated autonomous reactive systems without the need to develop expensive flight software. Users specify named states and transitions, along with the truth conditions required, before taking transitions. Transitions with the same signal name allow separate state machines to coordinate actions: the conditions distributed across all state machines necessary to arm a particular signal are evaluated, and once found true, that signal is raised. The selected signal then causes all identically named transitions in all present state machines to be taken simultaneously. VML 2.1 has relevance to all potential space missions, both manned and unmanned. It was under consideration for use on Orion.
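
    The multi-variable waiting idea can be sketched as a transition guarded by a predicate over several global variables, as below. This mimics the concept only; it is Java rather than VML, and the state names and variables are invented for illustration.

      import java.util.HashMap;
      import java.util.Map;
      import java.util.function.Predicate;

      // A transition fires only when its guard over several globals is true.
      public class GuardedMachine {
          String state = "ARMED";
          final Map<String, Double> globals = new HashMap<>();

          void step(Predicate<Map<String, Double>> guard, String target) {
              if (guard.test(globals)) state = target;
          }

          public static void main(String[] args) {
              GuardedMachine m = new GuardedMachine();
              m.globals.put("altitude", 120.0);
              m.globals.put("fuelPct", 35.0);
              // wait on a complex expression over multiple global variables
              m.step(g -> g.get("altitude") < 150.0 && g.get("fuelPct") > 20.0,
                     "DESCENT");
              System.out.println(m.state);             // DESCENT
          }
      }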

  3. Method and structure for skewed block-cyclic distribution of lower-dimensional data arrays in higher-dimensional processor grids

    DOEpatents

    Chatterjee, Siddhartha [Yorktown Heights, NY; Gunnels, John A [Brewster, NY

    2011-11-08

    A method and structure of distributing elements of an array of data in a computer memory to a specific processor of a multi-dimensional mesh of parallel processors includes designating a distribution of elements of at least a portion of the array to be executed by specific processors in the multi-dimensional mesh of parallel processors. The pattern of the designating includes a cyclical repetitive pattern of the parallel processor mesh, as modified to have a skew in at least one dimension so that both a row of data in the array and a column of data in the array map to respective contiguous groupings of the processors such that a dimension of the contiguous groupings is greater than one.
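
    The flavor of the mapping can be sketched in a few lines: a plain block-cyclic distribution sends column j to processor column j mod Q, and adding a row-dependent skew makes both rows and columns of the array land on contiguous processor groups. The formula below is an illustrative skew, not necessarily the one claimed in the patent.

      // (pi, pj) = processor grid coordinates for array element (i, j) on a
      // P x Q processor mesh; the "+ i" term is the skew.
      public class SkewedCyclic {
          static int[] map(int i, int j, int P, int Q) {
              return new int[] { i % P, (j + i) % Q };
          }

          public static void main(String[] args) {
              // row 2 of an array on a 4x4 grid: columns spread across the mesh
              for (int j = 0; j < 4; j++) {
                  int[] p = map(2, j, 4, 4);
                  System.out.println("(2," + j + ") -> P(" + p[0] + "," + p[1] + ")");
              }
          }
      }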

  4. A Hybrid Task Graph Scheduler for High Performance Image Processing Workflows.

    PubMed

    Blattner, Timothy; Keyrouz, Walid; Bhattacharyya, Shuvra S; Halem, Milton; Brady, Mary

    2017-12-01

    Designing applications for scalability is key to improving their performance in hybrid and cluster computing. Scheduling code to utilize parallelism is difficult, particularly when dealing with data dependencies, memory management, data motion, and processor occupancy. The Hybrid Task Graph Scheduler (HTGS) is an abstract execution model, framework, and API that increases programmer productivity when implementing hybrid workflows for multi-core and multi-GPU systems. HTGS manages dependencies between tasks, represents CPU and GPU memories independently, overlaps computations with disk I/O and memory transfers, keeps multiple GPUs occupied, and uses all available compute resources. Through these abstractions, data motion and memory are explicit; this makes data locality decisions more accessible. To demonstrate the HTGS application program interface (API), we present implementations of two example algorithms: (1) a matrix multiplication that shows how easily task graphs can be used; and (2) a hybrid implementation of microscopy image stitching that reduces code size by ≈ 43% compared to a manually coded hybrid workflow implementation and showcases the minimal overhead of task graphs in HTGS. Both of the HTGS-based implementations show good performance. In image stitching the HTGS implementation achieves similar performance to the hybrid workflow implementation. Matrix multiplication with HTGS achieves 1.3× and 1.8× speedup over the multi-threaded OpenBLAS library for 16k × 16k and 32k × 32k size matrices, respectively.
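
    The task-graph style HTGS exposes can be pictured as independent tasks linked by queues, so disk I/O, compute, and output writing overlap. The sketch below is a generic three-stage pipeline in plain java.util.concurrent, not the HTGS API; the names and the end-of-stream convention are assumptions.

      import java.util.concurrent.*;

      // Three tasks linked by queues: read -> transform -> write, all running
      // concurrently so I/O overlaps computation.
      public class TinyTaskGraph {
          public static void main(String[] args) throws Exception {
              BlockingQueue<int[]> readToFft = new LinkedBlockingQueue<>();
              BlockingQueue<int[]> fftToWrite = new LinkedBlockingQueue<>();
              ExecutorService pool = Executors.newFixedThreadPool(3);

              pool.submit(() -> {                      // producer task: "disk read"
                  for (int i = 0; i < 8; i++) readToFft.put(new int[] { i });
                  readToFft.put(new int[0]);           // empty array = end-of-stream
                  return null;
              });
              pool.submit(() -> {                      // compute task
                  int[] tile;
                  while ((tile = readToFft.take()).length > 0)
                      fftToWrite.put(tile);            // ... transform tile here ...
                  fftToWrite.put(new int[0]);
                  return null;
              });
              pool.submit(() -> {                      // consumer task: "stitch/write"
                  while (fftToWrite.take().length > 0) { /* write result */ }
                  return null;
              });
              pool.shutdown();
              pool.awaitTermination(10, TimeUnit.SECONDS);
          }
      }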

  5. Backend Control Processor for a Multi-Processor Relational Database Computer System.

    DTIC Science & Technology

    1984-12-01

    Fragmentary excerpts from the thesis front matter: ... THESIS Presented to the Faculty of the School of Engineering of the Air Force Institute of Technology, Air University, in Partial Fulfillment of the ... development of a Backend Multi-Processor Relational Database Computer System. This thesis addresses a single component of this system, the Backend Control ...

  6. Fast multi-core based multimodal registration of 2D cross-sections and 3D datasets.

    PubMed

    Scharfe, Michael; Pielot, Rainer; Schreiber, Falk

    2010-01-11

    Solving bioinformatics tasks often requires extensive computational power. Recent trends in processor architecture combine multiple cores into a single chip to improve overall performance. The Cell Broadband Engine (CBE), a heterogeneous multi-core processor, provides power-efficient and cost-effective high-performance computing. One application area is image analysis and visualisation, in particular registration of 2D cross-sections into 3D image datasets. Such techniques can be used to put different image modalities into spatial correspondence, for example, 2D images of histological cuts into morphological 3D frameworks. We evaluate the CBE-driven PlayStation 3 as a high-performance, cost-effective computing platform by adapting a multimodal alignment procedure to several characteristic hardware properties. The optimisations are based on partitioning, vectorisation, branch reducing and loop unrolling techniques with special attention to 32-bit multiplies and limited local storage on the computing units. We show how a typical image analysis and visualisation problem, the multimodal registration of 2D cross-sections and 3D datasets, benefits from the multi-core based implementation of the alignment algorithm. We discuss several CBE-based optimisation methods and compare our results to standard solutions. More information and the source code are available from http://cbe.ipk-gatersleben.de. The results demonstrate that the CBE processor in a PlayStation 3 accelerates computationally intensive multimodal registration, which is of great importance in biological/medical image processing. The PlayStation 3 as a low-cost CBE-based platform offers an efficient alternative to conventional hardware for solving computational problems in image processing and bioinformatics.

  7. Some Issues in Programming Multi-Mini-Processors

    DTIC Science & Technology

    1975-01-01

    Hardware and software are to be combined optimally to perform that specialized task. This in essence is the strategy followed by the BBN group in...large memory is directly addressable. MIXED SOLUTIONS The most promising approach appears to involve mixing several of the previous solutions...mini- or micro-computers. Possibly the problem will be solved by avoiding it. Some new minis are appearing on the market now with large physical

  8. Benchmarking NWP Kernels on Multi- and Many-core Processors

    NASA Astrophysics Data System (ADS)

    Michalakes, J.; Vachharajani, M.

    2008-12-01

    Increased computing power for weather, climate, and atmospheric science has provided direct benefits for defense, agriculture, the economy, the environment, and public welfare and convenience. Today, very large clusters with many thousands of processors are allowing scientists to move forward with simulations of unprecedented size. But time-critical applications such as real-time forecasting or climate prediction need strong scaling: faster nodes and processors, not more of them. Moreover, the need for good cost-performance has never been greater, both in terms of performance per watt and per dollar. For these reasons, the new generations of multi- and many-core processors being mass produced for commercial IT and "graphical computing" (video games) are being scrutinized for their ability to exploit the abundant fine-grain parallelism in atmospheric models. We present results of our work to date identifying key computational kernels within the dynamics and physics of a large community NWP model, the Weather Research and Forecast (WRF) model. We benchmark and optimize these kernels on several different multi- and many-core processors. The goals are to (1) characterize and model performance of the kernels in terms of computational intensity, data parallelism, memory bandwidth pressure, memory footprint, etc., (2) enumerate and classify effective strategies for coding and optimizing for these new processors, (3) assess difficulties and opportunities for tool or higher-level language support, and (4) establish a continuing set of kernel benchmarks that can be used to measure and compare effectiveness of current and future designs of multi- and many-core processors for weather and climate applications.
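
    Goal (1) amounts to a roofline-style model: a kernel's attainable performance is bounded by the lesser of the peak flop rate and memory bandwidth times arithmetic intensity (flops per byte moved). A sketch with hypothetical numbers, not measurements from this work.

      # Roofline bound for a kernel of given arithmetic intensity.
      def attainable_gflops(intensity_flop_per_byte, peak_gflops, bandwidth_gbs):
          return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)

      # A memory-bound stencil (~0.2 flop/byte) on a hypothetical processor
      # with 100 GFLOP/s peak and 25 GB/s of memory bandwidth:
      print(attainable_gflops(0.2, peak_gflops=100.0, bandwidth_gbs=25.0))  # 5.0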

  9. Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

    NASA Astrophysics Data System (ADS)

    Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

    2015-09-01

    The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
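
    The hot loop each backend parallelizes is a per-particle neighbour sum. A serial NumPy sketch of an SPH-style density summation in one dimension; the smoothing kernel below is a toy choice for illustration, not the paper's code or kernel.

      import numpy as np

      def density(positions, masses, h):
          """rho_i = sum_j m_j * W(|x_i - x_j|, h) with a toy compact kernel."""
          r = np.abs(positions[:, None] - positions[None, :]) / h
          W = np.where(r < 1.0, (1.0 - r) ** 3, 0.0) / h   # not a standard kernel
          return (W * masses[None, :]).sum(axis=1)

      x = np.linspace(0.0, 1.0, 200)                 # 200 particles on a line
      rho = density(x, np.full(200, 1.0 / 200), h=0.05)
      print(rho[:5])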

  10. Benchmarking GNU Radio Kernels and Multi-Processor Scheduling

    DTIC Science & Technology

    2013-01-14

    AMD E350 APU, comparable to Atom • ARM Cortex A8 running on a Gumstix Overo on an Ettus USRP E110. The general testing procedure consists of • Build...Intel Atom, and the AMD E350 APU. 3.2 Multi-Processor Scheduling. Figure 1: GFLOPs per second through an FFT array on an Intel i7. Example output from

  11. Image matrix processor for fast multi-dimensional computations

    DOEpatents

    Roberson, George P.; Skeate, Michael F.

    1996-01-01

    An apparatus for multi-dimensional computation is disclosed which comprises a computation engine, including a plurality of processing modules. The processing modules are configured in parallel and compute respective contributions to a computed multi-dimensional image of respective two dimensional data sets. A high-speed, parallel access storage system is provided which stores the multi-dimensional data sets, and a switching circuit routes the data among the processing modules in the computation engine and the storage system. A data acquisition port receives the two dimensional data sets representing projections through an image, for reconstruction algorithms such as encountered in computerized tomography. The processing modules include a programmable local host, by which they may be configured to execute a plurality of different types of multi-dimensional algorithms. The processing modules thus include an image manipulation processor, which includes a source cache, a target cache, a coefficient table, and control software for executing image transformation routines using data in the source cache and the coefficient table and loading resulting data in the target cache. The local host processor operates to load the source cache with a two dimensional data set, loads the coefficient table, and transfers resulting data out of the target cache to the storage system, or to another destination.
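
    The data path described, a source cache combined with a coefficient table to fill a target cache, is essentially a coefficient-weighted gather over the source image. A Python sketch of that pattern with a hypothetical 3-tap kernel; it illustrates the data flow, not the patented circuit.

      import numpy as np

      def transform(source_cache, coeff_table):
          """Each output pixel is a coefficient-weighted sum of shifted
          source pixels; results accumulate in the target cache."""
          target_cache = np.zeros(source_cache.shape, dtype=float)
          for (dy, dx), w in coeff_table.items():
              target_cache += w * np.roll(source_cache, (dy, dx), axis=(0, 1))
          return target_cache

      src = np.random.rand(64, 64)                        # one 2D data set
      coeffs = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.25}  # hypothetical kernel
      print(transform(src, coeffs).shape)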

  12. A new approach to telemetry data processing. Ph.D. Thesis - Maryland Univ.

    NASA Technical Reports Server (NTRS)

    Broglio, C. J.

    1973-01-01

    An approach for a preprocessing system for telemetry data processing was developed. The philosophy of the approach is the development of a preprocessing system to interface with the main processor and relieve it of the burden of stripping information from a telemetry data stream. To accomplish this task, a telemetry preprocessing language was developed. Also, a hardware device for implementing the operation of this language was designed using a cellular logic module concept. In the development of the hardware device and the cellular logic module, a distributed form of control was implemented. This is accomplished by a technique of one-to-one intermodule communications and a set of privileged communication operations. By transferring this control state from module to module, the control function is dispersed through the system. A compiler for translating the preprocessing language statements into an operations table for the hardware device was also developed. Finally, to complete the system design and verify it, a simulator for the cellular logic module was written using the APL/360 system.

  13. Who multi-tasks and why? Multi-tasking ability, perceived multi-tasking ability, impulsivity, and sensation seeking.

    PubMed

    Sanbonmatsu, David M; Strayer, David L; Medeiros-Ward, Nathan; Watson, Jason M

    2013-01-01

    The present study examined the relationship between personality and individual differences in multi-tasking ability. Participants enrolled at the University of Utah completed measures of multi-tasking activity, perceived multi-tasking ability, impulsivity, and sensation seeking. In addition, they performed the Operation Span task in order to assess their executive control and actual multi-tasking ability. The findings indicate that the persons who are most capable of multi-tasking effectively are not the persons who are most likely to engage in multiple tasks simultaneously. To the contrary, multi-tasking activity as measured by the Media Multitasking Inventory and self-reported cell phone usage while driving were negatively correlated with actual multi-tasking ability. Multi-tasking activity was positively correlated with participants' perceived multi-tasking ability, which was found to be significantly inflated. Participants with a strong approach orientation and a weak avoidance orientation--high levels of impulsivity and sensation seeking--reported greater multi-tasking behavior. Finally, the findings suggest that people often engage in multi-tasking because they are less able to block out distractions and focus on a singular task. Participants with less executive control--low scorers on the Operation Span task and persons high in impulsivity--tended to report higher levels of multi-tasking activity.

  14. Fast multi-core based multimodal registration of 2D cross-sections and 3D datasets

    PubMed Central

    2010-01-01

    Background Solving bioinformatics tasks often requires extensive computational power. Recent trends in processor architecture combine multiple cores into a single chip to improve overall performance. The Cell Broadband Engine (CBE), a heterogeneous multi-core processor, provides power-efficient and cost-effective high-performance computing. One application area is image analysis and visualisation, in particular registration of 2D cross-sections into 3D image datasets. Such techniques can be used to put different image modalities into spatial correspondence, for example, 2D images of histological cuts into morphological 3D frameworks. Results We evaluate the CBE-driven PlayStation 3 as a high performance, cost-effective computing platform by adapting a multimodal alignment procedure to several characteristic hardware properties. The optimisations are based on partitioning, vectorisation, branch reducing and loop unrolling techniques with special attention to 32-bit multiplies and limited local storage on the computing units. We show how a typical image analysis and visualisation problem, the multimodal registration of 2D cross-sections and 3D datasets, benefits from the multi-core based implementation of the alignment algorithm. We discuss several CBE-based optimisation methods and compare our results to standard solutions. More information and the source code are available from http://cbe.ipk-gatersleben.de. Conclusions The results demonstrate that the CBE processor in a PlayStation 3 accelerates computationally intensive multimodal registration, which is of great importance in biological/medical image processing. The PlayStation 3 as a low cost CBE-based platform offers an efficient alternative to conventional hardware to solve computational problems in image processing and bioinformatics. PMID:20064262

  15. Progress Towards a Rad-Hydro Code for Modern Computing Architectures LA-UR-10-02825

    NASA Astrophysics Data System (ADS)

    Wohlbier, J. G.; Lowrie, R. B.; Bergen, B.; Calef, M.

    2010-11-01

    We are entering an era of high performance computing where data movement is the overwhelming bottleneck to scalable performance, as opposed to the speed of floating-point operations per processor. All multi-core hardware paradigms, whether heterogeneous or homogeneous, be it the Cell processor, GPGPU, or multi-core x86, share this common trait. In multi-physics applications such as inertial confinement fusion or astrophysics, one may be solving multi-material hydrodynamics with tabular equation of state data lookups, radiation transport, nuclear reactions, and charged particle transport in a single time cycle. The algorithms are intensely data dependent (e.g., EOS, opacity, and nuclear data lookups), and multi-core hardware memory restrictions are forcing code developers to rethink code and algorithm design. For the past two years LANL has been funding a small effort referred to as Multi-Physics on Multi-Core to explore ideas for code design as pertaining to inertial confinement fusion and astrophysics applications. The near term goals of this project are to have a multi-material radiation hydrodynamics capability, with tabular equation of state lookups, on Cartesian and curvilinear block structured meshes. In the longer term we plan to add fully implicit multi-group radiation diffusion and material heat conduction, and block structured AMR. We will report on our progress to date.

  16. Performance analysis of the GR712RC dual-core LEON3FT SPARC V8 processor in an asymmetric multi-processing environment

    NASA Astrophysics Data System (ADS)

    Giusi, Giovanni; Liu, Scige J.; Galli, Emanuele; Di Giorgio, Anna M.; Farina, Maria; Vertolli, Nello; Di Lellis, Andrea M.

    2016-07-01

    In this paper we present the results of a series of performance tests carried out on a prototype board mounting the Cobham Gaisler GR712RC Dual Core LEON3FT processor. The aim was to characterize the performance of the dual-core processor when used to execute a highly demanding lossless compression task, acting on data segments continuously copied from the static memory to the processor RAM. The compression activity was selected for evaluating performance because it allowed a comparison with previously executed tests on the Cobham/Aeroflex Gaisler UT699 LEON3FT SPARC™ V8. The results of the test activity have shown an improvement by a factor of 1.6 with respect to the previous tests, which can easily be increased by adopting a faster onboard clock, and provided indications on the best size of the data chunks to be used in the compression activity.

  17. Image matrix processor for fast multi-dimensional computations

    DOEpatents

    Roberson, G.P.; Skeate, M.F.

    1996-10-15

    An apparatus for multi-dimensional computation is disclosed which comprises a computation engine, including a plurality of processing modules. The processing modules are configured in parallel and compute respective contributions to a computed multi-dimensional image of respective two dimensional data sets. A high-speed, parallel access storage system is provided which stores the multi-dimensional data sets, and a switching circuit routes the data among the processing modules in the computation engine and the storage system. A data acquisition port receives the two dimensional data sets representing projections through an image, for reconstruction algorithms such as encountered in computerized tomography. The processing modules include a programmable local host, by which they may be configured to execute a plurality of different types of multi-dimensional algorithms. The processing modules thus include an image manipulation processor, which includes a source cache, a target cache, a coefficient table, and control software for executing image transformation routines using data in the source cache and the coefficient table and loading resulting data in the target cache. The local host processor operates to load the source cache with a two dimensional data set, loads the coefficient table, and transfers resulting data out of the target cache to the storage system, or to another destination. 10 figs.

  18. Scheduling algorithms for automatic control systems for technological processes

    NASA Astrophysics Data System (ADS)

    Chernigovskiy, A. S.; Tsarev, R. Yu; Kapulin, D. V.

    2017-01-01

    Wide use of automatic process control systems, and of high-performance systems containing a number of computers (processors), gives opportunities for creating high-quality and fast production that increases the competitiveness of an enterprise. Exact and fast calculations, control computation, and processing of big data arrays all require a high level of productivity and, at the same time, minimum time for data handling and receiving results. In order to reach the best time, it is necessary not only to use computing resources optimally, but also to design and develop the software so that the time gain is maximal. For this purpose, task (job or operation) scheduling techniques for multi-machine/multiprocessor systems are applied. Some basic task scheduling methods for multi-machine process control systems are considered in this paper, their advantages and disadvantages are brought to light, and some considerations for their use in developing software for automatic process control systems are made.
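
    One classic member of the family of methods such surveys cover is longest-processing-time-first (LPT) list scheduling: sort tasks by decreasing duration and greedily assign each to the currently least-loaded processor. A Python sketch of that heuristic; the paper weighs several methods, and this stands in for none of them specifically.

      import heapq

      def lpt_schedule(durations, n_procs):
          """Assign tasks to processors, approximately minimizing makespan."""
          heap = [(0.0, p, []) for p in range(n_procs)]   # (load, id, task list)
          heapq.heapify(heap)
          for task_id, dur in sorted(enumerate(durations), key=lambda t: -t[1]):
              load, p, tasks = heapq.heappop(heap)        # least-loaded processor
              tasks.append(task_id)
              heapq.heappush(heap, (load + dur, p, tasks))
          return sorted(heap)

      for load, proc, tasks in lpt_schedule([7, 5, 4, 3, 3, 2], n_procs=2):
          print(f"processor {proc}: tasks {tasks}, busy for {load}")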

  19. Early Student Support for Application of Advanced Multi-Core Processor Technologies to Oceanographic Research

    DTIC Science & Technology

    2016-05-07

    Report documentation page (OMB No. 0704-0188). Early Student Support for Application of Advanced Multi-Core Processor Technologies to Oceanographic Research; grant N00014-12-1-0298. ...communications protocols (i.e. UART, I2C, and SPI), through the handing off of the data to the server APIs. By providing a common set of tools

  20. Optimization of the Multi-Spectral Euclidean Distance Calculation for FPGA-based Spaceborne Systems

    NASA Technical Reports Server (NTRS)

    Cristo, Alejandro; Fisher, Kevin; Perez, Rosa M.; Martinez, Pablo; Gualtieri, Anthony J.

    2012-01-01

    Due to the high quantity of operations that spaceborne processing systems must carry out in space, new methodologies and techniques are being presented as good alternatives in order to free the main processor from work and improve the overall performance. These include the development of ancillary dedicated hardware circuits that carry out the more redundant and computationally expensive operations in a faster way, leaving the main processor free to carry out other tasks while waiting for the result. One of these devices is SpaceCube, an FPGA-based system designed by NASA. The opportunity to use FPGA reconfigurable architectures in space allows not only the optimization of the mission operations with hardware-level solutions, but also the ability to create new and improved versions of the circuits, including error corrections, once the satellite is already in orbit. In this work, we propose the optimization of a common operation in remote sensing: the Multi-Spectral Euclidean Distance calculation. For that, two different hardware architectures have been designed and implemented in a Xilinx Virtex-5 FPGA, the same model of FPGA used by SpaceCube. Previous results have shown that the communications between the embedded processor and the circuit create a bottleneck that affects the overall performance in a negative way. In order to avoid this, advanced methods including memory sharing, Native Port Interface (NPI) connections and Data Burst Transfers have been used.
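
    The operation being moved into hardware is compact, which is what makes it a good FPGA candidate: the Euclidean distance between each pixel's spectral vector and a reference signature. A NumPy sketch of the reference software computation the circuits replace; the data shapes are hypothetical.

      import numpy as np

      def ms_euclidean_distance(cube, signature):
          """cube: (rows, cols, bands) image; signature: (bands,) reference.
          Returns the per-pixel distance map."""
          diff = cube - signature              # broadcasts over all pixels
          return np.sqrt((diff * diff).sum(axis=-1))

      cube = np.random.rand(512, 512, 8)       # hypothetical 8-band image
      sig = np.random.rand(8)
      dist = ms_euclidean_distance(cube, sig)
      print(dist.shape, float(dist.min()), float(dist.max()))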

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rodenbeck, Christopher T.; Young, Derek; Chou, Tina

    A combined radar and telemetry system is described. The combined radar and telemetry system includes a processing unit that executes instructions, where the instructions define a radar waveform and a telemetry waveform. The processor outputs a digital baseband signal based upon the instructions, where the digital baseband signal is based upon the radar waveform and the telemetry waveform. A radar and telemetry circuit transmits, simultaneously, a radar signal and a telemetry signal based upon the digital baseband signal.

  2. Investigation into the Use of Texturing for Real-Time Computer Animation.

    DTIC Science & Technology

    1987-12-01

    produce a rough polygon surface [7]. Research in the area of real-time texturing has also been conducted. Using a specially designed multi-processor system...Oka, Tsutsui, Ohba, Kurauchi and Tago have introduced real-time manipulation of texture mapped surfaces [8]. Using multi-processors, systems will...a call to the system function defpattern(n, size, mask) (short n, size; short *mask), which takes as input an index to a system table of patterns, a

  3. A highly efficient multi-core algorithm for clustering extremely large datasets

    PubMed Central

    2010-01-01

    Background In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared memory parallel algorithms are shown to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java-based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network-based parallelization. Conclusions Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
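
    The parallelization opportunity is the data-parallel assignment step of k-means: disjoint chunks of rows can be labeled on separate cores and the results concatenated. A process-based Python sketch of that pattern; the paper's Java implementation builds on transactional-memory design principles rather than this scheme, and the data here are synthetic.

      import numpy as np
      from multiprocessing import Pool

      def assign_chunk(args):
          """Label each row of a chunk with its nearest cluster center."""
          chunk, centers = args
          d = ((chunk[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
          return d.argmin(axis=1)

      def parallel_assign(data, centers, n_procs=4):
          chunks = np.array_split(data, n_procs)
          with Pool(n_procs) as pool:
              labels = pool.map(assign_chunk, [(c, centers) for c in chunks])
          return np.concatenate(labels)

      if __name__ == "__main__":
          data = np.random.rand(100_000, 20)   # hypothetical expression matrix
          centers = data[np.random.choice(len(data), 5, replace=False)]
          print(np.bincount(parallel_assign(data, centers)))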

  4. AltiVec performance increases for autonomous robotics for the MARSSCAPE architecture program

    NASA Astrophysics Data System (ADS)

    Gothard, Benny M.

    2002-02-01

    One of the main tall poles that must be overcome to develop a fully autonomous vehicle is the inability of the computer to understand its surrounding environment to the level required for the intended task. The military mission scenario requires a robot to interact in a complex, unstructured, dynamic environment (see "A High Fidelity Multi-Sensor Scene Understanding System for Autonomous Navigation"). The Mobile Autonomous Robot Software Self Composing Adaptive Programming Environment (MarsScape) perception research addresses three aspects of the problem: sensor system design, processing architectures, and algorithm enhancements. A prototype perception system has been demonstrated on robotic High Mobility Multi-purpose Wheeled Vehicle and All Terrain Vehicle testbeds. This paper addresses the tall pole of processing requirements and the performance improvements based on the selected MarsScape processing architecture. The processor chosen is the Motorola AltiVec G4 PowerPC (PPC), a highly parallelized commercial Single Instruction Multiple Data processor. Both derived perception benchmarks and actual perception subsystem code are benchmarked and compared against previous Demo II Semi-autonomous Surrogate Vehicle processing architectures, along with desktop personal computers (PCs). Performance gains are highlighted with progress to date, and lessons learned and future directions are described.

  5. Animal movement data: GPS telemetry, autocorrelation and the need for path-level analysis [chapter 7

    Treesearch

    Samuel A. Cushman

    2010-01-01

    In the previous chapter we presented the idea of a multi-layer, multi-scale, spatially referenced data-cube as the foundation for monitoring and for implementing flexible modeling of ecological pattern-process relationships in particulate, in context and to integrate these across large spatial extents at the grain of the strongest linkage between response and driving...

  6. Autonomous Agents and Intelligent Assistants for Exploration Operations

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.

    2000-01-01

    Human exploration of space will involve remote autonomous crew and systems in long missions. Data to earth will be delayed and limited. Earth control centers will not receive continuous real-time telemetry data, and there will be communication round trips of up to one hour. There will be reduced human monitoring on the planet and earth. When crews are present on the planet, they will be occupied with other activities, and system management will be a low priority task. Earth control centers will use multi-tasking "night shift" and on-call specialists. A new project at Johnson Space Center is developing software to support teamwork between distributed human and software agents in future interplanetary work environments. The Engineering and Mission Operations Directorates at Johnson Space Center (JSC) are combining laboratories and expertise to carry out this project, by establishing a testbed for human centered design, development and evaluation of intelligent autonomous and assistant systems. Intelligent autonomous systems for managing systems on planetary bases will communicate their knowledge to support distributed multi-agent mixed-initiative operations. Intelligent assistant agents will respond to events by developing briefings and responses according to instructions from human agents on earth and in space.

  7. Parallel Multi-Step/Multi-Rate Integration of Two-Time Scale Dynamic Systems

    NASA Technical Reports Server (NTRS)

    Chang, Johnny T.; Ploen, Scott R.; Sohl, Garett A.; Martin, Bryan J.

    2004-01-01

    Increasing fidelity demands for real-time and high-fidelity simulations are stressing the capacity of modern processors. New integration techniques are required that provide maximum efficiency for systems that are parallelizable. However, many current techniques make assumptions that are at odds with non-cascadable systems. A new serial multi-step/multi-rate integration algorithm for dual-timescale continuous state systems is presented which applies to these systems, and is extended to a parallel multi-step/multi-rate algorithm. The superior performance of both algorithms is demonstrated through a representative example.
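
    The multi-rate idea is to advance the fast state with a small step h and the slow state only once per m fast steps, so work tracks each subsystem's timescale. A toy forward-Euler sketch under that assumption; it is not the authors' serial or parallel algorithm, and the coupled system below is invented for illustration.

      def multirate_step(x_fast, x_slow, h, m, f_fast, f_slow):
          """One macro step: m small steps of the fast state, then one
          large step (m*h) of the slow state."""
          for _ in range(m):
              x_fast += h * f_fast(x_fast, x_slow)
          x_slow += m * h * f_slow(x_fast, x_slow)
          return x_fast, x_slow

      # Hypothetical coupled system: fast decay driven by a slow drift.
      xf, xs = 1.0, 0.0
      for _ in range(100):
          xf, xs = multirate_step(xf, xs, h=0.001, m=10,
                                  f_fast=lambda f, s: -50.0 * f + s,
                                  f_slow=lambda f, s: 0.1 * f)
      print(xf, xs)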

  8. Highway Traffic Simulations on Multi-Processor Computers

    DOT National Transportation Integrated Search

    1997-01-01

    A computer model has been developed to simulate highway traffic for various degrees of automation with a high degree of fidelity in regard to driver control and vehicle characteristics. The model simulates vehicle maneuvering in a multi-lane highway ...

  9. Interdependent Multi-Layer Networks: Modeling and Survivability Analysis with Applications to Space-Based Networks

    PubMed Central

    Castet, Jean-Francois; Saleh, Joseph H.

    2013-01-01

    This article develops a novel approach and algorithmic tools for the modeling and survivability analysis of networks with heterogeneous nodes, and examines their application to space-based networks. Space-based networks (SBNs) allow the sharing of spacecraft on-orbit resources, such as data storage, processing, and downlink. Each spacecraft in the network can have different subsystem composition and functionality, thus resulting in node heterogeneity. Most traditional survivability analyses of networks assume node homogeneity and as a result, are not suited for the analysis of SBNs. This work proposes that heterogeneous networks can be modeled as interdependent multi-layer networks, which enables their survivability analysis. The multi-layer aspect captures the breakdown of the network according to common functionalities across the different nodes, and it allows the emergence of homogeneous sub-networks, while the interdependency aspect constrains the network to capture the physical characteristics of each node. Definitions of primitives of failure propagation are devised. Formal characterization of interdependent multi-layer networks, as well as algorithmic tools for the analysis of failure propagation across the network are developed and illustrated with space applications. The SBN applications considered consist of several networked spacecraft that can tap into each other's Command and Data Handling subsystem in case of failure of their own, including the Telemetry, Tracking and Command, the Control Processor, and the Data Handling sub-subsystems. Various design insights are derived and discussed, and the capability to perform trade-space analysis with the proposed approach for various network characteristics is indicated. The select results shown here quantify the incremental survivability gains (with respect to a particular class of threats) of the SBN over the traditional monolith spacecraft. Failure of the connectivity between nodes is also examined, and the results highlight the importance of the reliability of the wireless links between spacecraft (nodes) to enable any survivability improvements for space-based networks. PMID:23599835

  10. Interdependent multi-layer networks: modeling and survivability analysis with applications to space-based networks.

    PubMed

    Castet, Jean-Francois; Saleh, Joseph H

    2013-01-01

    This article develops a novel approach and algorithmic tools for the modeling and survivability analysis of networks with heterogeneous nodes, and examines their application to space-based networks. Space-based networks (SBNs) allow the sharing of spacecraft on-orbit resources, such as data storage, processing, and downlink. Each spacecraft in the network can have different subsystem composition and functionality, thus resulting in node heterogeneity. Most traditional survivability analyses of networks assume node homogeneity and as a result, are not suited for the analysis of SBNs. This work proposes that heterogeneous networks can be modeled as interdependent multi-layer networks, which enables their survivability analysis. The multi-layer aspect captures the breakdown of the network according to common functionalities across the different nodes, and it allows the emergence of homogeneous sub-networks, while the interdependency aspect constrains the network to capture the physical characteristics of each node. Definitions of primitives of failure propagation are devised. Formal characterization of interdependent multi-layer networks, as well as algorithmic tools for the analysis of failure propagation across the network are developed and illustrated with space applications. The SBN applications considered consist of several networked spacecraft that can tap into each other's Command and Data Handling subsystem in case of failure of their own, including the Telemetry, Tracking and Command, the Control Processor, and the Data Handling sub-subsystems. Various design insights are derived and discussed, and the capability to perform trade-space analysis with the proposed approach for various network characteristics is indicated. The select results shown here quantify the incremental survivability gains (with respect to a particular class of threats) of the SBN over the traditional monolith spacecraft. Failure of the connectivity between nodes is also examined, and the results highlight the importance of the reliability of the wireless links between spacecraft (nodes) to enable any survivability improvements for space-based networks.

  11. DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors.

    PubMed

    Schmollinger, Martin; Nieselt, Kay; Kaufmann, Michael; Morgenstern, Burkhard

    2004-09-09

    Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic that splits up sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. By distributing sub-routines to multiple processors, the running time of DIALIGN can be greatly improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
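
    Strategy (a) is embarrassingly parallel: each sequence pair is aligned independently, so the pairs can simply be farmed out to a pool of workers. A Python sketch of that distribution; the match-count score below is a placeholder, not DIALIGN's alignment routine.

      from itertools import combinations
      from multiprocessing import Pool

      def align_pair(pair):
          """Score one sequence pair (placeholder similarity measure)."""
          (i, a), (j, b) = pair
          return i, j, sum(x == y for x, y in zip(a, b))

      if __name__ == "__main__":
          seqs = ["ACGTACGT", "ACGGACGA", "TCGTACCT", "ACGTTCGT"]
          pairs = list(combinations(enumerate(seqs), 2))
          with Pool() as pool:                  # one worker per core by default
              for i, j, s in pool.map(align_pair, pairs):
                  print(f"pair ({i},{j}) score {s}")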

  12. RTEMS SMP and MTAPI for Efficient Multi-Core Space Applications on LEON3/LEON4 Processors

    NASA Astrophysics Data System (ADS)

    Cederman, Daniel; Hellstrom, Daniel; Sherrill, Joel; Bloom, Gedare; Patte, Mathieu; Zulianello, Marco

    2015-09-01

    This paper presents the final result of a European Space Agency (ESA) activity aimed at improving the software support for LEON processors used in SMP configurations. One of the benefits of using a multicore system in an SMP configuration is that in many instances it is possible to better utilize the available processing resources by load balancing between cores. This however comes with the cost of having to synchronize operations between cores, leading to increased complexity. While in an AMP system one can use multiple instances of operating systems that are only uni-processor capable, an SMP system requires the operating system to be written to support multicore systems. In this activity we have improved and extended the SMP support of the RTEMS real-time operating system and ensured that it fully supports the multicore capable LEON processors. The targeted hardware in the activity has been the GR712RC, a dual-core LEON3FT processor, and the functional prototype of ESA's Next Generation Multiprocessor (NGMP), a quad-core LEON4 processor. The final version of the NGMP is now available as a product under the name GR740. An implementation of the Multicore Task Management API (MTAPI) has been developed as part of this activity to aid in the parallelization of applications for RTEMS SMP. It allows for simplified development of parallel applications using the task-based programming model. An existing space application, the Gaia Video Processing Unit, has been ported to RTEMS SMP using the MTAPI implementation to demonstrate the feasibility and usefulness of multicore processors for space payload software. The activity is funded by ESA under contract 4000108560/13/NL/JK. Gedare Bloom is supported in part by NSF CNS-0934725.
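
    The task-based model MTAPI provides resembles the familiar futures pattern: submit independent units of work, then collect results, leaving placement and load balancing to the runtime. A Python sketch of the programming style only; MTAPI itself is a C API, and the names below are invented.

      from concurrent.futures import ThreadPoolExecutor

      def process_frame(frame_id):
          """Stand-in for an independent unit of payload processing."""
          return frame_id * frame_id

      with ThreadPoolExecutor(max_workers=4) as pool:
          futures = [pool.submit(process_frame, f) for f in range(8)]
          print([f.result() for f in futures])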

  13. A multi-satellite orbit determination problem in a parallel processing environment

    NASA Technical Reports Server (NTRS)

    Deakyne, M. S.; Anderle, R. J.

    1988-01-01

    The Engineering Orbit Analysis Unit at GE Valley Forge used an Intel Hypercube parallel processor to investigate the performance of, and gain experience with, parallel processors on a multi-satellite orbit determination problem. A general study was selected in which major blocks of computation for the multi-satellite orbit computations were used as units to be assigned to the various processors on the Hypercube, so that problems encountered or successes achieved in addressing the orbit determination problem would be more likely to transfer to other parallel processors. The prime objective was to study the algorithm to allow processing of observations later in time than those employed in the state update. Expertise in ephemeris determination was exploited in addressing these problems, and the facility was used to bring a realism to the study that would highlight problems which might not otherwise be anticipated. Secondary objectives were to gain experience with a non-trivial problem in a parallel processor environment, to explore the necessary interplay of serial and parallel sections of the algorithm in terms of timing studies, and to explore granularity (coarse vs. fine grain): above a certain granularity limit there is a risk of starvation, with the majority of nodes idle, while below it the overhead associated with splitting the problem may require more work and communication time than it saves.

  14. The Multi-energy High precision Data Processor Based on AD7606

    NASA Astrophysics Data System (ADS)

    Zhao, Chen; Zhang, Yanchi; Xie, Da

    2017-11-01

    This paper designs an information collector based on the AD7606 to realize high-precision simultaneous acquisition of multi-source information from multi-energy systems, forming the information platform of the energy Internet at Laogang, where electricity is the major energy source. Combined with information fusion technologies, this paper analyzes the data to improve the overall energy system scheduling capability and reliability.

  15. Development of an extensible dual-core wireless sensing node for cyber-physical systems

    NASA Astrophysics Data System (ADS)

    Kane, Michael; Zhu, Dapeng; Hirose, Mitsuhito; Dong, Xinjun; Winter, Benjamin; Häckell, Mortiz; Lynch, Jerome P.; Wang, Yang; Swartz, A.

    2014-04-01

    The introduction of wireless telemetry into the design of monitoring and control systems has been shown to reduce system costs while simplifying installations. To date, wireless nodes proposed for sensing and actuation in cyber-physical systems have been designed using microcontrollers with one computational pipeline (i.e., single-core microcontrollers). While concurrent code execution can be implemented on single-core microcontrollers, concurrency is emulated by splitting the pipeline's resources to support multiple threads of code execution. For many applications, this approach to multi-threading is acceptable in terms of speed and function. However, some applications such as feedback controls demand deterministic timing of code execution and maximum computational throughput. For these applications, the adoption of multi-core processor architectures represents one effective solution. Multi-core microcontrollers have multiple computational pipelines that can execute embedded code in parallel and can be interrupted independently of one another. In this study, a new wireless platform named Martlet is introduced with a dual-core microcontroller adopted in its design. The dual-core microcontroller design allows Martlet to dedicate one core to standard wireless sensor operations while the other core is reserved for embedded data processing and real-time feedback control law execution. Another distinct feature of Martlet is a standardized hardware interface that allows specialized daughter boards (termed wing boards) to be interfaced to the Martlet baseboard. This extensibility opens the opportunity to encapsulate specialized sensing and actuation functions in a wing board without altering the design of Martlet. In addition to describing the design of Martlet, a few example wings are detailed, along with experiments showing the Martlet's ability to monitor and control physical systems such as wind turbines and buildings.

  16. Compiling for Application Specific Computational Acceleration in Reconfigurable Architectures Final Report CRADA No. TSB-2033-01

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De Supinski, B.; Caliga, D.

    2017-09-28

    The primary objective of this project was to develop memory optimization technology to efficiently deliver data to, and distribute data within, the SRC-6's Field Programmable Gate Array ("FPGA") based Multi-Adaptive Processors (MAPs). The hardware/software approach was to explore efficient MAP configurations and generate the compiler technology to exploit those configurations. This memory accessing technology represents an important step towards making reconfigurable symmetric multi-processor (SMP) architectures a cost-effective solution for large-scale scientific computing.

  17. Exact diagonalization of quantum lattice models on coprocessors

    NASA Astrophysics Data System (ADS)

    Siro, T.; Harju, A.

    2016-10-01

    We implement the Lanczos algorithm on an Intel Xeon Phi coprocessor and compare its performance to a multi-core Intel Xeon CPU and an NVIDIA graphics processor. The Xeon and the Xeon Phi are parallelized with OpenMP and the graphics processor is programmed with CUDA. The performance is evaluated by measuring the execution time of a single step in the Lanczos algorithm. We study two quantum lattice models with different particle numbers, and conclude that for small systems, the multi-core CPU is the fastest platform, while for large systems, the graphics processor is the clear winner, reaching speedups of up to 7.6 compared to the CPU. The Xeon Phi outperforms the CPU with sufficiently large particle number, reaching a speedup of 2.5.
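
    The benchmarked unit is a single step of the three-term Lanczos recurrence, whose cost is dominated by one matrix-vector product. A dense NumPy sketch of that step; the paper's lattice Hamiltonians are sparse, so this is illustrative only.

      import numpy as np

      def lanczos_step(H, v_prev, v_curr, beta):
          """One Lanczos iteration: returns alpha, the next beta, and the
          next orthonormal basis vector."""
          w = H @ v_curr - beta * v_prev       # dominant cost: the mat-vec
          alpha = v_curr @ w
          w -= alpha * v_curr
          beta_next = np.linalg.norm(w)
          return alpha, beta_next, w / beta_next

      n = 1000
      H = np.random.rand(n, n)
      H = (H + H.T) / 2                        # hypothetical symmetric matrix
      v0 = np.zeros(n)
      v1 = np.random.rand(n)
      v1 /= np.linalg.norm(v1)
      alpha, beta, v2 = lanczos_step(H, v0, v1, 0.0)
      print(alpha, beta)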

  18. Virtualized Multi-Mission Operations Center (vMMOC) and its Cloud Services

    NASA Technical Reports Server (NTRS)

    Ido, Haisam Kassim

    2017-01-01

    This presentation will cover the current and future, technical and organizational opportunities and challenges of virtualizing a multi-mission operations center. The full deployment of Goddard Space Flight Center's (GSFC) Virtualized Multi-Mission Operations Center (vMMOC) is nearly complete. The Space Science Mission Operations (SSMO) organization's spacecraft ACE, Fermi, LRO, MMS (4), OSIRIS-REx, SDO, SOHO, Swift, and Wind are in the process of being fully migrated to the vMMOC. The benefits of the vMMOC will be the normalization and standardization of IT services, mission operations, maintenance, and development, as well as ancillary services and policies such as collaboration tools, change management systems, and IT security. The vMMOC will also provide operational efficiencies regarding hardware, IT domain expertise, training, maintenance, and support. The presentation will also cover SSMO's secure Situational Awareness Dashboard, delivered in an integrated, fleet-centric, cloud-based web services fashion. Additionally, the SSMO Telemetry as a Service (TaaS) will be covered, which allows authorized users and processes to access telemetry for the entire SSMO fleet, and for the entirety of each spacecraft's history. Both services leverage cloud services in a secure FISMA High and FedRAMP environment, and also leverage distributed object stores to house and provide the telemetry. The services are also in the process of leveraging the elasticity and horizontal scalability of cloud computing services. In the design phase is Navigation as a Service (NaaS), which will provide a standardized, efficient, and normalized service for the fleet's space flight dynamics operations. Additional future services that may be considered are Ground Segment as a Service (GSaaS), Telemetry and Command as a Service (TCaaS), Flight Software Simulation as a Service, etc.

  19. Improving our understanding of multi-tasking in healthcare: Drawing together the cognitive psychology and healthcare literature.

    PubMed

    Douglas, Heather E; Raban, Magdalena Z; Walter, Scott R; Westbrook, Johanna I

    2017-03-01

    Multi-tasking is an important skill for clinical work that has received limited research attention, and its impacts on clinical work are poorly understood. In contrast, there is substantial multi-tasking research in cognitive psychology, driver distraction, and human-computer interaction. This review synthesises evidence of the extent and impacts of multi-tasking on efficiency and task performance from health and non-healthcare literature, to compare and contrast approaches, identify implications for clinical work, and to develop an evidence-informed framework for guiding the measurement of multi-tasking in future healthcare studies. The results showed healthcare studies using direct observation have focused on descriptive studies to quantify concurrent multi-tasking and its frequency in different contexts, with limited study of impact. In comparison, non-healthcare studies have applied predominantly experimental and simulation designs, focusing on interleaved and concurrent multi-tasking, and testing theories of the mechanisms by which multi-tasking impacts task efficiency and performance. We propose a framework to guide the measurement of multi-tasking in clinical settings that draws together lessons from these siloed research efforts. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  20. Making tomorrow's mistakes today: Evolutionary prototyping for risk reduction and shorter development time

    NASA Astrophysics Data System (ADS)

    Friedman, Gary; Schwuttke, Ursula M.; Burliegh, Scott; Chow, Sanguan; Parlier, Randy; Lee, Lorrine; Castro, Henry; Gersbach, Jim

    1993-03-01

    In the early days of JPL's solar system exploration, each spacecraft mission required its own dedicated data system with all software applications written in the mainframe's native assembly language. Although these early telemetry processing systems were a triumph of engineering in their day, since that time the computer industry has advanced to the point where it is now advantageous to replace these systems with more modern technology. The Space Flight Operations Center (SFOC) Prototype group was established in 1985 as a workstation and software laboratory. The charter of the lab was to determine if it was possible to construct a multimission telemetry processing system using commercial, off-the-shelf computers that communicated via networks. The staff of the lab mirrored that of a typical skunk works operation -- a small, multi-disciplinary team with a great deal of autonomy that could get complex tasks done quickly. In an effort to determine which approaches would be useful, the prototype group experimented with all types of operating systems, inter-process communication mechanisms, network protocols, and packet size parameters. Out of that pioneering work came the confidence that a multi-mission telemetry processing system could be built using high-level languages running in a heterogeneous, networked workstation environment. Experience revealed that the operating systems on all nodes should be similar (i.e., all VMS or all PC-DOS or all UNIX), and that a unique Data Transport Subsystem tool needed to be built to address the incompatibilities of network standards, byte ordering, and socket buffering. The advantages of building a telemetry processing system based on emerging industry standards were numerous: by employing these standards, we would no longer be locked into a single vendor. When new technology came to market which offered ten times the performance at one eighth the cost, it would be possible to attach the new machine to the network, re-compile the application code, and run. In addition, we would no longer be plagued with lack of manufacturer support when we encountered obscure bugs. And maybe, hopefully, the eternally elusive goal of software portability across different vendors' platforms would finally be available. Some highlights of our prototyping efforts are described.

  1. Making tomorrow's mistakes today: Evolutionary prototyping for risk reduction and shorter development time

    NASA Technical Reports Server (NTRS)

    Friedman, Gary; Schwuttke, Ursula M.; Burliegh, Scott; Chow, Sanguan; Parlier, Randy; Lee, Lorrine; Castro, Henry; Gersbach, Jim

    1993-01-01

    In the early days of JPL's solar system exploration, each spacecraft mission required its own dedicated data system with all software applications written in the mainframe's native assembly language. Although these early telemetry processing systems were a triumph of engineering in their day, since that time the computer industry has advanced to the point where it is now advantageous to replace these systems with more modern technology. The Space Flight Operations Center (SFOC) Prototype group was established in 1985 as a workstation and software laboratory. The charter of the lab was to determine if it was possible to construct a multimission telemetry processing system using commercial, off-the-shelf computers that communicated via networks. The staff of the lab mirrored that of a typical skunk works operation -- a small, multi-disciplinary team with a great deal of autonomy that could get complex tasks done quickly. In an effort to determine which approaches would be useful, the prototype group experimented with all types of operating systems, inter-process communication mechanisms, network protocols, and packet size parameters. Out of that pioneering work came the confidence that a multi-mission telemetry processing system could be built using high-level languages running in a heterogeneous, networked workstation environment. Experience revealed that the operating systems on all nodes should be similar (i.e., all VMS or all PC-DOS or all UNIX), and that a unique Data Transport Subsystem tool needed to be built to address the incompatibilities of network standards, byte ordering, and socket buffering. The advantages of building a telemetry processing system based on emerging industry standards were numerous: by employing these standards, we would no longer be locked into a single vendor. When new technology came to market which offered ten times the performance at one eighth the cost, it would be possible to attach the new machine to the network, re-compile the application code, and run. In addition, we would no longer be plagued with lack of manufacturer support when we encountered obscure bugs. And maybe, hopefully, the eternally elusive goal of software portability across different vendors' platforms would finally be available. Some highlights of our prototyping efforts are described.

  2. System architecture for asynchronous multi-processor robotic control system

    NASA Technical Reports Server (NTRS)

    Steele, Robert D.; Long, Mark; Backes, Paul

    1993-01-01

    The architecture for the Modular Telerobot Task Execution System (MOTES) as implemented in the Supervisory Telerobotics (STELER) Laboratory is described. MOTES is the software component of the remote site of a local-remote telerobotic system which is being developed for NASA for space applications, in particular Space Station Freedom applications. The system is being developed to provide control and supervised autonomous control to support both space-based operation and ground-remote control with time delay. The local-remote architecture places task planning responsibilities at the local site and task execution responsibilities at the remote site. This separation allows the remote site to be designed to optimize task execution capability within a limited computational environment such as is expected in flight systems. The local site task planning system could be placed on the ground, where few computational limitations are expected. MOTES is written in the Ada programming language for a multiprocessor environment.

  3. Spitzer Telemetry Processing System

    NASA Technical Reports Server (NTRS)

    Stanboli, Alice; Martinez, Elmain M.; McAuley, James M.

    2013-01-01

    The Spitzer Telemetry Processing System (SirtfTlmProc) was designed to address objectives of JPL's Multi-mission Image Processing Lab (MIPL) in processing spacecraft telemetry and distributing the resulting data to the science community. To minimize costs and maximize operability, the software design focused on automated error recovery, performance, and information management. The system processes telemetry from the Spitzer spacecraft and delivers Level 0 products to the Spitzer Science Center. SirtfTlmProc is a unique system with automated error notification and recovery, with a real-time continuous service that can go quiescent after periods of inactivity. The software can process 2 GB of telemetry and deliver Level 0 science products to the end user in four hours. It provides analysis tools so the operator can manage the system and troubleshoot problems. It automates telemetry processing in order to reduce staffing costs.

  4. An embedded multi-core parallel model for real-time stereo imaging

    NASA Astrophysics Data System (ADS)

    He, Wenjing; Hu, Jian; Niu, Jingyu; Li, Chuanrong; Liu, Guangyu

    2018-04-01

    Real-time processing based on embedded systems will enhance the application capability of stereo imaging for LiDAR and hyperspectral sensors. Task partitioning and scheduling strategies for embedded multiprocessor systems started relatively late, compared with those for PC computers. In this paper, aimed at an embedded multi-core processing platform, a parallel model for stereo imaging is studied and verified. After analyzing the computing load, throughput capacity, and buffering requirements, a two-stage pipeline parallel model based on message transmission is established. This model can be applied to fast stereo imaging for airborne sensors with various characteristics. To demonstrate the feasibility and effectiveness of the parallel model, parallel software was designed using test flight data, based on the 8-core DSP processor TMS320C6678. The results indicate that the design performed well in workload distribution and achieved a speed-up ratio of up to 6.4.

  5. CAPRI (Computational Analysis PRogramming Interface): A Solid Modeling Based Infra-Structure for Engineering Analysis and Design Simulations

    NASA Technical Reports Server (NTRS)

    Haimes, Robert; Follen, Gregory J.

    1998-01-01

    CAPRI is a CAD-vendor neutral application programming interface designed for the construction of analysis and design systems. By allowing access to the geometry from within all modules (grid generators, solvers, and post-processors), such tasks as meshing on the actual surfaces, node enrichment by solvers, and defining which mesh faces are boundaries (for the solver and visualization system) become simpler. The overall reliance on file 'standards' is minimized. This 'Geometry Centric' approach makes multi-physics (multi-disciplinary) analysis codes much easier to build. By using the shared (coupled) surface as the foundation, CAPRI provides a single call to interpolate grid-node based data from the surface discretization in one volume to another. Finally, design systems are possible where the results can be brought back into the CAD system (and therefore manufactured) because all geometry construction and modification are performed using the CAD system's geometry kernel.

  6. Enhancing Image Processing Performance for PCID in a Heterogeneous Network of Multi-core Processors

    DTIC Science & Technology

    2009-09-01

    TFLOPS of PlayStation 3 (PS3) nodes with IBM Cell Broadband Engine multi-cores and 15 dual-quad Xeon head nodes. The interconnect fabric includes... [table-of-contents fragments: 3. Information Management for Parallelization and Streaming; 4. Results]

  7. GSFC magnetic field experiment Explorer 43. [describing magnetometer, data processor, and telemetry

    NASA Technical Reports Server (NTRS)

    Seek, J. B.; Scheifele, J. L.; Ness, N. F.

    1974-01-01

    The magnetic field experiment flown on Explorer 43 is described. The detecting instrument is a triaxial fluxgate magnetometer which is mounted on a boom with a flipping mechanism for reorienting the sensor in flight. An on-board data processor takes successive magnetometer samples and transmits differences to the telemetry system. By examining these differences in conjunction with an untruncated sample transmitted periodically, the original data may be uniquely reconstructed on the ground.
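
    The downlink scheme is difference (delta) encoding: transmit successive sample differences, plus a periodic untruncated sample so the ground can always resynchronize and reconstruct the series exactly. A sketch of the encode/decode round trip; the framing and the resync interval below are assumptions for illustration.

      def encode(samples, full_every=8):
          """Emit ('full', s) periodically, ('diff', delta) otherwise."""
          out, prev = [], None
          for i, s in enumerate(samples):
              if prev is None or i % full_every == 0:
                  out.append(("full", s))       # untruncated reference sample
              else:
                  out.append(("diff", s - prev))
              prev = s
          return out

      def decode(stream):
          recon, prev = [], None
          for kind, val in stream:
              prev = val if kind == "full" else prev + val
              recon.append(prev)
          return recon

      samples = [100, 102, 101, 105, 110, 108, 107, 111, 115, 114]
      assert decode(encode(samples)) == samples  # uniquely reconstructed
      print(decode(encode(samples)))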

  8. Multiprocessing on supercomputers for computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; Mehta, Unmeel B.

    1990-01-01

    Very little use is made of multiple processors available on current supercomputers (computers with a theoretical peak performance capability equal to 100 MFLOPs or more) in computational aerodynamics to significantly improve turnaround time. The productivity of a computer user is directly related to this turnaround time. In a time-sharing environment, the improvement in this speed is achieved when multiple processors are used efficiently to execute an algorithm. The concept of multiple instructions and multiple data (MIMD) through multi-tasking is applied via a strategy which requires relatively minor modifications to an existing code for a single processor. Essentially, this approach maps the available memory to multiple processors, exploiting the C-FORTRAN-Unix interface. The existing single processor code is mapped without the need for developing a new algorithm. The procedure for building a code utilizing this approach is automated with the Unix stream editor. As a demonstration of this approach, a Multiple Processor Multiple Grid (MPMG) code is developed. It is capable of using nine processors, and can be easily extended to a larger number of processors. This code solves the three-dimensional, Reynolds averaged, thin-layer and slender-layer Navier-Stokes equations with an implicit, approximately factored and diagonalized method. The solver is applied to a generic oblique-wing aircraft problem on a four-processor Cray-2 computer. A tricubic interpolation scheme is developed to increase the accuracy of coupling of overlapped grids. For the oblique-wing aircraft problem, a speedup of two in elapsed (turnaround) time is observed in a saturated time-sharing environment.

  9. A FPGA-Based, Granularity-Variable Neuromorphic Processor and Its Application in a MIMO Real-Time Control System.

    PubMed

    Zhang, Zhen; Ma, Cheng; Zhu, Rong

    2017-08-23

    Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become state-of-the-art methods in machine learning and have achieved remarkable success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementations of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have attracted considerable interest from scientists. In this paper, we propose an FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output (MIMO) real-time temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas.

  10. A FPGA-Based, Granularity-Variable Neuromorphic Processor and Its Application in a MIMO Real-Time Control System

    PubMed Central

    Zhang, Zhen; Zhu, Rong

    2017-01-01

    Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become state-of-the-art methods in machine learning and have achieved remarkable success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementations of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have attracted considerable interest from scientists. In this paper, we propose an FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output (MIMO) real-time temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas. PMID:28832522

  11. Binaural unmasking of multi-channel stimuli in bilateral cochlear implant users.

    PubMed

    Van Deun, Lieselot; van Wieringen, Astrid; Francart, Tom; Büchner, Andreas; Lenarz, Thomas; Wouters, Jan

    2011-10-01

    Previous work suggests that bilateral cochlear implant users are sensitive to interaural cues if experimental speech processors are used to preserve accurate interaural information in the electrical stimulation pattern. Binaural unmasking occurs in adults and children when an interaural delay is applied to the envelope of a high-rate pulse train. Nevertheless, for speech perception, binaural unmasking benefits have not been demonstrated consistently, even with coordinated stimulation at both ears. The present study aimed at bridging the gap between basic psychophysical performance on binaural signal detection tasks on the one hand and binaural perception of speech in noise on the other. Therefore, binaural signal detection was expanded to multi-channel stimulation and biologically relevant interaural delays. A harmonic complex, consisting of three sinusoids (125, 250, and 375 Hz), was added to three 125-Hz-wide noise bands centered on the sinusoids. When an interaural delay of 700 μs was introduced, an average binaural masking level difference (BMLD) of 3 dB was established. The outcomes are promising in view of real-life benefits. Future research should investigate the generalization of the observed benefits for signal detection to speech perception in everyday listening situations and determine the importance of coordination of bilateral speech processors and accentuation of envelope cues.

  12. Multi-threaded parallel simulation of non-local non-linear problems in ultrashort laser pulse propagation in the presence of plasma

    NASA Astrophysics Data System (ADS)

    Baregheh, Mandana; Mezentsev, Vladimir; Schmitz, Holger

    2011-06-01

    We describe a parallel multi-threaded approach for high-performance modelling of a wide class of phenomena in ultrafast nonlinear optics. The specific implementation exploits the highly parallel capabilities of a programmable graphics processor.

  13. Parallel Lattice Basis Reduction Using a Multi-threaded Schnorr-Euchner LLL Algorithm

    NASA Astrophysics Data System (ADS)

    Backes, Werner; Wetzel, Susanne

    In this paper, we introduce a new parallel variant of the LLL lattice basis reduction algorithm. Our new, multi-threaded algorithm is the first to provide an efficient, parallel implementation of the Schnorr-Euchner algorithm for today’s multi-processor, multi-core computer architectures. Experiments with sparse and dense lattice bases show a speed-up factor of about 1.8 for the 2-thread and about 3.2 for the 4-thread version of our new parallel lattice basis reduction algorithm in comparison to the traditional non-parallel algorithm.

  14. Supercomputing with TOUGH2 family codes for coupled multi-physics simulations of geologic carbon sequestration

    NASA Astrophysics Data System (ADS)

    Yamamoto, H.; Nakajima, K.; Zhang, K.; Nanai, S.

    2015-12-01

    Powerful numerical codes that are capable of modeling complex coupled processes of physics and chemistry have been developed for predicting the fate of CO2 in reservoirs as well as its potential impacts on groundwater and subsurface environments. However, they are often computationally demanding for solving highly non-linear models in sufficient spatial and temporal resolutions. Geological heterogeneity and uncertainties further increase the challenges in modeling work. Two-phase flow simulations in heterogeneous media usually require much longer computational time than in homogeneous media. Uncertainties in reservoir properties may necessitate stochastic simulations with multiple realizations. Recently, massively parallel supercomputers with thousands of processors have become available in scientific and engineering communities. Such supercomputers may attract attention from geoscientists and reservoir engineers for solving large, non-linear models in higher resolutions within a reasonable time. However, to make them a useful tool, several practical obstacles must be tackled so that large numbers of processors can be used effectively by general-purpose reservoir simulators. We have implemented massively parallel versions of two TOUGH2 family codes (the multi-phase flow simulator TOUGH2 and the chemically reactive transport simulator TOUGHREACT) on two different types (vector- and scalar-type) of supercomputers with a thousand to tens of thousands of processors. After completing implementation and extensive tune-up on the supercomputers, the computational performance was measured for three simulations with multi-million-cell grid models, including a simulation of the dissolution-diffusion-convection process that requires high spatial and temporal resolutions to simulate the growth of small convective fingers of CO2-dissolved water to larger ones at reservoir scale. The performance measurement confirmed that both simulators exhibit excellent scalability, showing almost linear speedup with the number of processors up to more than ten thousand cores. Generally this allows us to perform coupled multi-physics (THC) simulations on high-resolution geologic models with multi-million-cell grids in a practical time (e.g., less than a second per time step).
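
    The scalability claim can be quantified in the usual way from strong-scaling timings; a small sketch with illustrative numbers (not the measured TOUGH2 results):

    ```python
    # Speedup and parallel efficiency from strong-scaling timings.
    # The timing values below are illustrative, not measured TOUGH2 results.
    timings = {1: 3600.0, 1024: 4.1, 4096: 1.2, 16384: 0.45}  # cores -> seconds

    base = timings[1]
    for cores in sorted(timings):
        speedup = base / timings[cores]
        efficiency = speedup / cores
        print(f"{cores:6d} cores: speedup {speedup:8.1f}, efficiency {efficiency:.2f}")
    ```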

  15. Web-based multi-channel analyzer

    DOEpatents

    Gritzo, Russ E.

    2003-12-23

    The present invention provides an improved multi-channel analyzer designed to conveniently gather, process, and distribute spectrographic pulse data. The multi-channel analyzer may operate on a computer system having memory, a processor, and the capability to connect to a network and to receive digitized spectrographic pulses. The multi-channel analyzer may have a software module integrated with a general-purpose operating system that may receive digitized spectrographic pulses at rates of at least 10,000 pulses per second. The multi-channel analyzer may further have a user-level software module that may receive user-specified controls dictating the operation of the multi-channel analyzer, making the multi-channel analyzer customizable by the end user. The user-level software may further categorize and conveniently distribute spectrographic pulse data employing non-proprietary, standard communication protocols and formats.
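
    At its core, a multi-channel analyzer bins digitized pulse heights into a channel histogram; a minimal sketch, with a hypothetical channel count and ADC range rather than the patented implementation:

    ```python
    # Minimal multi-channel analyzer core: bin digitized pulse heights
    # into channels. Channel count and pulse stream are illustrative.

    N_CHANNELS = 1024
    ADC_MAX = 4096  # full-scale value of the digitizer

    def accumulate(pulses, spectrum=None):
        """Add a batch of digitized pulse heights to the spectrum."""
        spectrum = spectrum or [0] * N_CHANNELS
        for p in pulses:
            channel = min(p * N_CHANNELS // ADC_MAX, N_CHANNELS - 1)
            spectrum[channel] += 1
        return spectrum

    spectrum = accumulate([512, 513, 2048, 4095, 100])
    print(max(range(N_CHANNELS), key=spectrum.__getitem__))  # peak channel
    ```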

  16. An autonomous receiver/digital signal processor applied to ground-based and rocket-borne wave experiments

    NASA Astrophysics Data System (ADS)

    Dombrowski, M. P.; LaBelle, J.; McGaw, D. G.; Broughton, M. C.

    2016-07-01

    The programmable combined receiver/digital signal processor platform presented in this article is designed for digital downsampling and processing of general waveform inputs with a 66 MHz initial sampling rate and multi-input synchronized sampling. Systems based on this platform are capable of fully autonomous low-power operation, can be programmed to preprocess and filter the data for preselection and reduction, and may output to a diverse array of transmission or telemetry media. We describe three versions of this system, one for deployment on sounding rockets and two for ground-based applications. The rocket system was flown on the Correlation of High-Frequency and Auroral Roar Measurements (CHARM)-II mission launched from Poker Flat Research Range, Alaska, in 2010. It measured auroral "roar" signals at 2.60 MHz. The ground-based systems have been deployed at Sondrestrom, Greenland, and South Pole Station, Antarctica. The Greenland system synchronously samples signals from three spaced antennas providing direction finding of 0-5 MHz waves. It has successfully measured auroral signals and man-made broadcast signals. The South Pole system synchronously samples signals from two crossed antennas, providing polarization information. It has successfully measured the polarization of auroral kilometric radiation-like signals as well as auroral hiss. Further systems are in development for future rocket missions and for installation in Antarctic Automatic Geophysical Observatories.

  17. Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array

    NASA Astrophysics Data System (ADS)

    Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul

    2008-04-01

    This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.
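
    The work-farm pattern maps naturally onto queues of work items; a rough thread-based sketch (standard queues stand in for the MPPA's self-synchronizing channels, so this only approximates the hardware model):

    ```python
    # Work-farm sketch: a parallel set of identical workers with one input
    # stream and one output stream. Queues stand in for the MPPA's
    # self-synchronizing channels.
    import threading, queue

    def worker(inq, outq):
        while True:
            item = inq.get()
            if item is None:          # poison pill: shut the worker down
                break
            outq.put(item * item)     # stand-in for the real per-item work

    inq, outq = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(inq, outq)) for _ in range(4)]
    for w in workers:
        w.start()
    for item in range(10):
        inq.put(item)
    for _ in workers:
        inq.put(None)
    for w in workers:
        w.join()
    print(sorted(outq.get() for _ in range(10)))
    ```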

  18. Design of the ANTARES LCM-DAQ board test bench using a FPGA-based system-on-chip approach

    NASA Astrophysics Data System (ADS)

    Anvar, S.; Kestener, P.; Le Provost, H.

    2006-11-01

    The System-on-Chip (SoC) approach consists in using state-of-the-art FPGA devices with embedded RISC processor cores, high-speed differential LVDS links, and ready-to-use multi-gigabit transceivers, allowing development of compact systems with a substantial number of I/O channels. The required performance is obtained through a subtle separation of tasks between closely cooperating programmable hardware logic and a user-friendly software environment. We report on our experience in using the SoC approach to design the production test bench of the off-shore readout system for the ANTARES neutrino experiment.

  19. Second International Workshop on Software Engineering and Code Design in Parallel Meteorological and Oceanographic Applications

    NASA Technical Reports Server (NTRS)

    OKeefe, Matthew (Editor); Kerr, Christopher L. (Editor)

    1998-01-01

    This report contains the abstracts and technical papers from the Second International Workshop on Software Engineering and Code Design in Parallel Meteorological and Oceanographic Applications, held June 15-18, 1998, in Scottsdale, Arizona. The purpose of the workshop was to bring together software developers in meteorology and oceanography to discuss software engineering and code design issues for parallel architectures, including Massively Parallel Processors (MPPs), Parallel Vector Processors (PVPs), Symmetric Multi-Processors (SMPs), Distributed Shared Memory (DSM) multi-processors, and clusters. Issues discussed include: (1) code architectures for current parallel models, including basic data structures, storage allocation, variable naming conventions, coding rules and styles, I/O and pre/post-processing of data; (2) designing modular code; (3) load balancing and domain decomposition; (4) techniques that exploit parallelism efficiently yet hide the machine-related details from the programmer; (5) tools for making the programmer more productive; and (6) the proliferation of programming models (F--, OpenMP, MPI, and HPF).

  20. Multi-petascale highly efficient parallel supercomputer

    DOEpatents

    Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen -Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

    2015-07-14

    A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power, and footprint, that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, each having full access to all system resources, enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis and, preferably, adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that maximizes the throughput of packet communications between nodes and minimizes latency.

  1. A multi-channel low-power system-on-chip for single-unit recording and narrowband wireless transmission of neural signal.

    PubMed

    Bonfanti, A; Ceravolo, M; Zambra, G; Gusmeroli, R; Spinelli, A S; Lacaita, A L; Angotzi, G N; Baranauskas, G; Fadiga, L

    2010-01-01

    This paper reports a multi-channel neural recording system-on-chip (SoC) with digital data compression and wireless telemetry. The circuit consists of 16 amplifiers, an analog time-division multiplexer, an 8-bit SAR AD converter, a digital signal processor (DSP), and a wireless narrowband 400-MHz binary FSK transmitter. Even though only 16 amplifiers are present in the current die version, the whole system is designed to work with 64 channels, demonstrating the feasibility of digital processing and narrowband wireless transmission of 64 neural recording channels. A digital data compression, based on the detection of action potentials and storage of the corresponding waveforms, allows the use of a 1.25-Mbit/s binary FSK wireless transmission. This moderate bit rate and a low-frequency-deviation, Manchester-coded modulation are crucial for exploiting a narrowband wireless link and an efficient embeddable antenna. The chip is realized in a 0.35-μm CMOS process with a power consumption of 105 μW per channel (269 μW per channel with an extended transmission range of 4 m) and an area of 3.1 × 2.7 mm². The transmitted signal is captured by a digital TV tuner and demodulated by a wideband phase-locked loop (PLL), and then sent to a PC via an FPGA module. The system has been tested for electrical specifications and its functionality verified in in-vivo neural recording experiments.
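
    A rough sketch of the compression idea, detecting threshold crossings and keeping only short waveform windows around them; the threshold and window length here are illustrative, not the chip's DSP parameters:

    ```python
    # Spike-detection data compression sketch: transmit only short waveform
    # windows around detected action potentials instead of the raw stream.
    THRESHOLD = 60      # detection threshold (illustrative units)
    WINDOW = 5          # samples to keep after each crossing

    def compress(signal):
        events = []
        i = 0
        while i < len(signal) - WINDOW:
            if abs(signal[i]) > THRESHOLD:
                events.append((i, signal[i:i + WINDOW]))  # timestamp + waveform
                i += WINDOW                               # skip past this spike
            else:
                i += 1
        return events

    trace = [2, -1, 3, 80, 75, 40, 10, 2, 0, -3, 1, 95, 60, 20, 5, 1, 0]
    for t, wave in compress(trace):
        print(f"spike at sample {t}: {wave}")
    ```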

  2. A multi-rate DPSK modem for free-space laser communications

    NASA Astrophysics Data System (ADS)

    Spellmeyer, N. W.; Browne, C. A.; Caplan, D. O.; Carney, J. J.; Chavez, M. L.; Fletcher, A. S.; Fitzgerald, J. J.; Kaminsky, R. D.; Lund, G.; Hamilton, S. A.; Magliocco, R. J.; Mikulina, O. V.; Murphy, R. J.; Rao, H. G.; Scheinbart, M. S.; Seaver, M. M.; Wang, J. P.

    2014-03-01

    The multi-rate DPSK format, which enables efficient free-space laser communications over a wide range of data rates, is finding applications in NASA's Laser Communications Relay Demonstration. We discuss the design and testing of an efficient and robust multi-rate DPSK modem, including aspects of the electrical, mechanical, thermal, and optical design. The modem includes an optically preamplified receiver, a 0.5-W average-power transmitter, a LEON3 rad-hard microcontroller that provides the command and telemetry interface and supervisory control, and a Xilinx Virtex-5 rad-hard reprogrammable FPGA that both supports the high-speed data flow to and from the modem and controls the modem's analog and digital subsystems. For additional flexibility, the transmitter and receiver can be configured to support operation with multi-rate PPM waveforms.

  3. APPLICATION OF THE MODELS-3 COMMUNITY MULTI-SCALE AIR QUALITY (CMAQ) MODEL SYSTEM TO SOS/NASHVILLE 1999

    EPA Science Inventory

    The Models-3 Community Multi-scale Air Quality (CMAQ) model, first released by the USEPA in 1999 (Byun and Ching, 1999), continues to be developed and evaluated. The principal components of the CMAQ system include a comprehensive emission processor known as the Sparse Matrix O...

  4. The research and application of multi-biometric acquisition embedded system

    NASA Astrophysics Data System (ADS)

    Deng, Shichao; Liu, Tiegen; Guo, Jingjing; Li, Xiuyan

    2009-11-01

    Identification technology based on multiple biometrics can greatly improve applicability, reliability, and resistance to falsification. This paper presents a multi-biometric acquisition system based on an embedded system. Three capture daughter boards obtain different biometrics: one each for fingerprint, iris, and vein of the back of the hand. An FPGA (Field Programmable Gate Array) serves as coprocessor, configuring the three daughter boards on request and providing the data path between the DSP (digital signal processor) and the daughter boards. The DSP is the master processor; its functions include controlling biometric information acquisition, extracting features as required, and comparing the results with the local database or a data server through network communication. The advantages of this system are that it can acquire three different biometrics in real time and extract complex features flexibly from the raw data of the different biometrics according to different purposes and algorithms, while the network interface on the core board addresses large data scales. Because this embedded system has high stability, reliability, and flexibility and fits different data scales, it can satisfy the demands of multi-biometric recognition.

  5. Scalability of a Low-Cost Multi-Teraflop Linux Cluster for High-End Classical Atomistic and Quantum Mechanical Simulations

    NASA Technical Reports Server (NTRS)

    Kikuchi, Hideaki; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya; Shimojo, Fuyuki; Saini, Subhash

    2003-01-01

    Scalability of a low-cost, Intel Xeon-based, multi-Teraflop Linux cluster is tested for two high-end scientific applications: classical atomistic simulation based on the molecular dynamics method, and quantum mechanical calculation based on density functional theory. These scalable parallel applications use space-time multiresolution algorithms and feature computational-space decomposition, wavelet-based adaptive load balancing, and space-filling-curve-based data compression for scalable I/O. Comparative performance tests are performed on a 1,024-processor Linux cluster and a conventional higher-end parallel supercomputer, the 1,184-processor IBM SP4. The results show that the performance of the Linux cluster is comparable to that of the SP4. We also study various effects, such as the sharing of memory and L2 cache among processors, on the performance.

  6. Demonstration of multi-wavelength tunable fiber lasers based on a digital micromirror device processor.

    PubMed

    Ai, Qi; Chen, Xiao; Tian, Miao; Yan, Bin-bin; Zhang, Ying; Song, Fei-jun; Chen, Gen-xiang; Sang, Xin-zhu; Wang, Yi-quan; Xiao, Feng; Alameh, Kamal

    2015-02-01

    Based on a digital micromirror device (DMD) processor serving as a multi-wavelength narrow-band tunable filter, we experimentally demonstrate a multi-port tunable fiber laser. The key property of this laser is that any lasing wavelength channel of any output port can be switched independently over the whole C-band, driven flexibly by a single DMD chip. All outputs display excellent tuning capacity and high consistency over the whole C-band, with a 0.02 nm linewidth, 0.055 nm wavelength tuning step, and side-mode suppression ratio greater than 60 dB. Due to the automatic power control and polarization design, the power uniformity of the output lasers is less than 0.008 dB and the wavelength fluctuation is below 0.02 nm within 2 h at room temperature.

  7. Electronic Structure Calculations and Adaptation Scheme in Multi-core Computing Environments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seshagiri, Lakshminarasimhan; Sosonkina, Masha; Zhang, Zhao

    2009-05-20

    Multi-core processing environments have become the norm in generic computing and are being considered for adding an extra dimension to the execution of any application. The T2 Niagara processor is a unique environment: it consists of eight cores, each capable of running eight threads simultaneously. Applications like the General Atomic and Molecular Electronic Structure System (GAMESS), used for ab-initio molecular quantum chemistry calculations, can be good indicators of the performance of such machines and provide a guideline for both hardware designers and application programmers. In this paper we benchmark GAMESS performance on a T2 Niagara processor for a couple of molecules. We also show the suitability of using a middleware-based adaptation algorithm with GAMESS in such a multi-core environment.

  8. Smart-Pixel Array Processors Based on Optimal Cellular Neural Networks for Space Sensor Applications

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Sheu, Bing J.; Venus, Holger; Sandau, Rainer

    1997-01-01

    A smart-pixel cellular neural network (CNN) with hardware annealing capability, digitally programmable synaptic weights, and a multisensor parallel interface has been under development for advanced space sensor applications. The smart-pixel CNN architecture is a programmable multi-dimensional array of optoelectronic neurons, each locally connected with its neighboring neurons and associated active-pixel sensors. Integration of the neuroprocessor in each processor node of a scalable multiprocessor system offers orders-of-magnitude computing performance enhancements for on-board real-time intelligent multisensor processing and control tasks of advanced small satellites. The smart-pixel CNN operation theory, architecture, design and implementation, and system applications are investigated in detail. The VLSI (Very Large Scale Integration) implementation feasibility was illustrated by a prototype smart-pixel 5x5 neuroprocessor array chip of active dimensions 1380 micron x 746 micron in a 2-micron CMOS technology.

  9. Knowledge-based vision for space station object motion detection, recognition, and tracking

    NASA Technical Reports Server (NTRS)

    Symosek, P.; Panda, D.; Yalamanchili, S.; Wehner, W., III

    1987-01-01

    Computer vision, especially color image analysis and understanding, has much to offer in the area of the automation of Space Station tasks such as construction, satellite servicing, rendezvous and proximity operations, inspection, experiment monitoring, data management and training. Knowledge-based techniques improve the performance of vision algorithms for unstructured environments because of their ability to deal with imprecise a priori information or inaccurately estimated feature data and still produce useful results. Conventional techniques using statistical and purely model-based approaches lack flexibility in dealing with the variabilities anticipated in the unstructured viewing environment of space. Algorithms developed under NASA sponsorship for Space Station applications to demonstrate the value of a hypothesized architecture for a Video Image Processor (VIP) are presented. Approaches to the enhancement of the performance of these algorithms with knowledge-based techniques and the potential for deployment of highly-parallel multi-processor systems for these algorithms are discussed.

  10. Multi-frequency communication system and method

    DOEpatents

    Carrender, Curtis Lee; Gilbert, Ronald W.

    2004-06-01

    A multi-frequency RFID remote communication system is provided that includes a plurality of RFID tags configured to receive a first signal and to return a second signal, the second signal having a first frequency component and a second frequency component, the second frequency component including data unique to each remote RFID tag. The system further includes a reader configured to transmit an interrogation signal and to receive remote signals from the tags. A first signal processor, preferably a mixer, removes an intermediate frequency (IF) component from the received signal, and a second processor, preferably a second mixer, analyzes the IF component to output the data unique to each remote tag.

  11. Dynamic configuration management of a multi-standard and multi-mode reconfigurable multi-ASIP architecture for turbo decoding

    NASA Astrophysics Data System (ADS)

    Lapotre, Vianney; Gogniat, Guy; Baghdadi, Amer; Diguet, Jean-Philippe

    2017-12-01

    The multiplication of connected devices goes along with a large variety of applications and traffic types with diverse requirements. Accompanying this evolution in connectivity, recent years have seen considerable evolution of wireless communication standards in the domains of mobile telephone networks, local/wide wireless area networks, and Digital Video Broadcasting (DVB). In this context, intensive research has been conducted to provide flexible turbo decoders targeting high throughput, multi-mode and multi-standard operation, and power efficiency. However, flexible turbo decoder implementations have rarely considered the dynamic reconfiguration issues raised in this context, which requires high-speed configuration switching. Starting from this assessment, this paper proposes the first solution that allows frame-by-frame run-time configuration management of a multi-processor turbo decoder without compromising decoding performance.

  12. Electro-Optic Computing Architectures. Volume I

    DTIC Science & Technology

    1998-02-01

    The objective of the Electro-Optic Computing Architecture (EOCA) program was to develop multi-function electro-optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro-optic computing architectures...Specifically, three multi-function interface modules were targeted for development: an Electro-Optic Interface (EOI), an Optical Interconnection Unit (OIU)

  13. Satellite antenna management system and method

    NASA Technical Reports Server (NTRS)

    Leath, Timothy T (Inventor); Azzolini, John D (Inventor)

    1999-01-01

    The antenna management system and method allow a satellite to communicate with a ground station either directly or through the intermediary of a second satellite, thus permitting communication even when the satellite is not within range of the ground station. The system and method employ five major software components: the control and initialization module, the command and telemetry handler module, the contact schedule processor module, the contact state machine module, and the telemetry state machine module. The control and initialization module initializes the system and operates the main control cycle, in which the other modules are called. The command and telemetry handler module handles communication to and from the ground station. The contact schedule processor module handles the contact entry schedules to allow scheduling of contacts with the second satellite. The contact and telemetry state machine modules handle the various states of the satellite in beginning, maintaining, and ending contact with the second satellite and in beginning, maintaining, and ending communication with the satellite.

  14. Design of a Multi-Week Sound and Motion Recording and Telemetry (SMRT) Tag for Behavioral Studies on Whales

    DTIC Science & Technology

    2015-09-30

    ...in situ processing algorithms for sound and motion data. In a parallel project, Dr. Andrews at the Alaska SeaLife Center teamed with Wildlife ...from Wildlife Computers to produce a highly integrated Sound and Motion Recording and Telemetry (SMRT) tag. The complete tag development is expected

  15. Tunable multi-wavelength fiber lasers based on an Opto-VLSI processor and optical amplifiers.

    PubMed

    Xiao, Feng; Alameh, Kamal; Lee, Yong Tak

    2009-12-07

    A multi-wavelength tunable fiber laser based on the use of an Opto-VLSI processor in conjunction with different optical amplifiers is proposed and experimentally demonstrated. The Opto-VLSI processor can simultaneously select any part of the gain spectrum of each optical amplifier into its associated fiber ring, leading to a multi-port tunable fiber laser source. We experimentally demonstrate a 3-port tunable fiber laser source, where each output wavelength of each port can independently be tuned within the C-band with a wavelength step of about 0.05 nm. Experimental results demonstrate a laser linewidth as narrow as 0.05 nm and an optical side-mode suppression ratio (SMSR) of about 35 dB. The three demonstrated fiber lasers have excellent stability at room temperature and output power uniformity of less than 0.5 dB over the whole C-band.

  16. The parallel algorithm for the 2D discrete wavelet transform

    NASA Astrophysics Data System (ADS)

    Barina, David; Najman, Pavel; Kleparnik, Petr; Kula, Michal; Zemcik, Pavel

    2018-04-01

    The discrete wavelet transform can be found at the heart of many image-processing algorithms. Until now, the transform on general-purpose processors (CPUs) was mostly computed using a separable lifting scheme. As the lifting scheme consists of a small number of operations, it is preferred for processing on single-core CPUs. However, for parallel processing on multi-core processors, this scheme is inappropriate due to its large number of steps. On such architectures, the number of steps corresponds to the number of synchronization points at which data are exchanged, and these points often form a performance bottleneck. Our approach appropriately rearranges the calculations inside the transform and thereby reduces the number of steps. In other words, we propose a new scheme that is friendly to parallel environments. When evaluated on multi-core CPUs, our scheme consistently outperforms the original lifting scheme. The evaluation was performed on 61-core Intel Xeon Phi and 8-core Intel Xeon processors.
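
    For reference, one level of the simplest lifting-based wavelet (Haar) looks as follows; this is the generic sequential formulation, not the rearranged parallel scheme the paper proposes:

    ```python
    # One level of the Haar wavelet via lifting: a predict step followed by
    # an update step. Generic single-core version, for reference only.

    def haar_lifting(signal):
        even, odd = signal[0::2], signal[1::2]
        detail = [o - e for o, e in zip(odd, even)]          # predict step
        approx = [e + d / 2 for e, d in zip(even, detail)]   # update step
        return approx, detail

    def haar_inverse(approx, detail):
        even = [a - d / 2 for a, d in zip(approx, detail)]
        odd = [e + d for e, d in zip(even, detail)]
        return [x for pair in zip(even, odd) for x in pair]

    a, d = haar_lifting([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
    assert haar_inverse(a, d) == [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
    ```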

  17. Dynamically programmable cache

    NASA Astrophysics Data System (ADS)

    Nakkar, Mouna; Harding, John A.; Schwartz, David A.; Franzon, Paul D.; Conte, Thomas

    1998-10-01

    Reconfigurable machines have recently been used as co-processors to accelerate the execution of certain algorithms or program subroutines. The problems with this approach include high reconfiguration time and limited partial reconfiguration. By far the most critical problems are: (1) the small on-chip memory, which results in slower execution time, and (2) small FPGA areas that cannot implement large subroutines. Dynamically Programmable Cache (DPC) is a novel architecture for embedded processors which offers solutions to the above problems. To solve the memory access problem, DPC processors merge reconfigurable arrays with the data cache at various cache levels to create multi-level reconfigurable machines. As a result, DPC machines have both higher data accessibility and higher FPGA memory bandwidth. To solve the limited-FPGA-resource problem, DPC processors implement a multi-context switching (virtualization) concept. Virtualization allows implementation of large subroutines with fewer FPGA cells. Additionally, DPC processors can parallelize the execution of several operations, resulting in faster execution time. In this paper, the speedup for DPC machines is shown to be 5X over an Altera FLEX10K FPGA chip and 2X over a Sun Ultra 1 SPARCstation for two different algorithms (convolution and motion estimation).

  18. An accuracy aware low power wireless EEG unit with information content based adaptive data compression.

    PubMed

    Tolbert, Jeremy R; Kabali, Pratik; Brar, Simeranjit; Mukhopadhyay, Saibal

    2009-01-01

    We present a digital system for adaptive data compression for low-power wireless transmission of electroencephalography (EEG) data. The proposed system acts as a base-band processor between the EEG analog-to-digital front end and the RF transceiver. It performs a real-time accuracy-energy trade-off for multi-channel EEG signal transmission by controlling the volume of transmitted data. We propose a multi-core digital signal processor for on-chip processing of EEG signals, which detects the information content of each channel and performs real-time adaptive compression. Our analysis shows that the proposed approach can provide significant savings in transmitter power with minimal impact on the overall signal accuracy.
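
    A toy sketch of information-content-driven adaptation, assuming entropy as the information measure and a per-channel bit allocation (not the paper's actual DSP algorithm):

    ```python
    # Toy information-content-driven compression: estimate per-channel
    # entropy and allocate quantization bits accordingly. Illustrative only.
    import math
    from collections import Counter

    def entropy(samples):
        counts = Counter(samples)
        n = len(samples)
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    def bits_for_channel(samples, lo=2, hi=8):
        """More bits for high-entropy (informative) channels, fewer otherwise."""
        h = entropy(samples)
        return max(lo, min(hi, round(h)))

    channels = {
        'flat':  [0, 0, 0, 1, 0, 0, 1, 0],      # low information content
        'noisy': [3, 7, 1, 6, 2, 5, 0, 4],      # high information content
    }
    for name, data in channels.items():
        print(name, '->', bits_for_channel(data), 'bits/sample')
    ```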

  19. MAT - MULTI-ATTRIBUTE TASK BATTERY FOR HUMAN OPERATOR WORKLOAD AND STRATEGIC BEHAVIOR RESEARCH

    NASA Technical Reports Server (NTRS)

    Comstock, J. R.

    1994-01-01

    MAT, a Multi-Attribute Task battery, gives the researcher the capability of performing multi-task workload and performance experiments. The battery provides a benchmark set of tasks for use in a wide range of laboratory studies of operator performance and workload. MAT incorporates tasks analogous to activities that aircraft crew members perform in flight, while providing a high degree of experiment control, performance data on each subtask, and freedom to use non-pilot test subjects. The MAT battery primary display is composed of four separate task windows which are as follows: a monitoring task window which includes gauges and warning lights, a tracking task window for the demands of manual control, a communication task window to simulate air traffic control communications, and a resource management task window which permits maintaining target levels on a fuel management task. In addition, a scheduling task window gives the researcher information about future task demands. The battery also provides the option of manual or automated control of tasks. The task generates performance data for each subtask. The task battery may be paused and onscreen workload rating scales presented to the subject. The MAT battery was designed to use a serially linked second computer to generate the voice messages for the Communications task. The MATREMX program and support files, which are included in the MAT package, were designed to work with the Heath Voice Card (Model HV-2000, available through the Heath Company, Benton Harbor, Michigan 49022); however, the MATREMX program and support files may easily be modified to work with other voice synthesizer or digitizer cards. The MAT battery task computer may also be used independent of the voice computer if no computer synthesized voice messages are desired or if some other method of presenting auditory messages is devised. MAT is written in QuickBasic and assembly language for IBM PC series and compatible computers running MS-DOS. The code in MAT is written for Microsoft QuickBasic 4.5 and Microsoft Macro Assembler 5.1. This package requires a joystick and EGA or VGA color graphics. An 80286, 386, or 486 processor machine is highly recommended. The standard distribution medium for MAT is a 5.25 inch 360K MS-DOS format diskette. The files are compressed using the PKZIP file compression utility. PKUNZIP is included on the distribution diskette. MAT was developed in 1992. IBM PC is a registered trademark of International Business Machines. MS-DOS, Microsoft QuickBasic, and Microsoft Macro Assembler are registered trademarks of Microsoft Corporation. PKZIP and PKUNZIP are registered trademarks of PKWare, Inc.

  20. Multi-Scale Characterization of Orthotropic Microstructures

    DTIC Science & Technology

    2008-04-01

    D. Valiveti, S. J. Harris, J. Boileau, "A domain partitioning based pre-processor for multi-scale modelling of cast aluminium alloys," Modelling and... Journal article submitted to Modeling and Simulation in Materials Science and Engineering (PAO Case Number: WPAFB 08-3362). ...element for characterization or simulation to avoid misleading predictions of macroscopic deformation, fracture, or transport behavior. Likewise

  1. Runtime Performance Monitoring Tool for RTEMS System Software

    NASA Astrophysics Data System (ADS)

    Cho, B.; Kim, S.; Park, H.; Kim, H.; Choi, J.; Chae, D.; Lee, J.

    2007-08-01

    RTEMS is a commercial-grade real-time operating system that supports multi-processor computers. However, there are not many development tools for RTEMS. In this paper, we report a new RTEMS-based runtime performance monitoring tool. We have implemented a lightweight runtime monitoring task with an extension to the RTEMS APIs. Using our tool, software developers can verify various performance-related parameters during runtime. Our tool can be used during the software development phase as well as during in-orbit operation. The implemented target agent is lightweight and has small overhead, using a SpaceWire interface. Efforts to reduce overhead and to add other monitoring parameters are currently under way.

  2. The RSZ BASIC programming language manual

    NASA Technical Reports Server (NTRS)

    Stattel, R. J.; Niswander, J. K.; Kochhar, A. K.

    1980-01-01

    The RSZ BASIC interactive language is described. The RSZ BASIC interpreter is resident in the Telemetry Data Processor, a system dedicated to the processing and displaying of PCM telemetry data. A series of working examples teaches the fundamentals of RSZ BASIC and shows how to construct, edit, and manage storage of programs.

  3. Parallel Agent-Based Simulations on Clusters of GPUs and Multi-Core Processors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aaby, Brandon G; Perumalla, Kalyan S; Seal, Sudip K

    2010-01-01

    An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms and present a novel analytical model of the trade-off. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphical processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, delivering as much as over 100-fold improvement in runtime for certain benchmark ABMS application scenarios with several million agents. This speed improvement is obtained on our system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular simulator in Java. Thus, the overall execution of our current work is over four orders of magnitude faster when executed on multiple GPUs.

  4. Energy Efficient Image/Video Data Transmission on Commercial Multi-Core Processors

    PubMed Central

    Lee, Sungju; Kim, Heegon; Chung, Yongwha; Park, Daihee

    2012-01-01

    In transmitting image/video data over Video Sensor Networks (VSNs), energy consumption must be minimized while maintaining high image/video quality. Although image/video compression is well known for its efficiency and usefulness in VSNs, the excessive costs associated with encoding computation and complexity still hinder its adoption for practical use. However, it is anticipated that high-performance handheld multi-core devices will be used as VSN processing nodes in the near future. In this paper, we propose a way to improve the energy efficiency of image and video compression with multi-core processors while maintaining the image/video quality. We improve the compression efficiency at the algorithmic level or derive the optimal parameters for the combination of a machine and compression based on the tradeoff between the energy consumption and the image/video quality. Based on experimental results, we confirm that the proposed approach can improve the energy efficiency of the straightforward approach by a factor of 2∼5 without compromising image/video quality. PMID:23202181

  5. Platform-Independence and Scheduling In a Multi-Threaded Real-Time Simulation

    NASA Technical Reports Server (NTRS)

    Sugden, Paul P.; Rau, Melissa A.; Kenney, P. Sean

    2001-01-01

    Aviation research often relies on real-time, pilot-in-the-loop flight simulation as a means to develop new flight software, flight hardware, or pilot procedures. Often these simulations become so complex that a single processor is incapable of performing the necessary computations within a fixed time-step. Threads are an elegant means to distribute the computational workload when running on a symmetric multi-processor machine. However, programming with threads often requires operating-system-specific calls that reduce code portability and maintainability. While a multi-threaded simulation allows a significant increase in simulation complexity, it also increases the workload of a simulation operator by requiring that the operator determine which models run on which thread. To address these concerns, an object-oriented design was implemented in the NASA Langley Standard Real-Time Simulation in C++ (LaSRS++) application framework. The design provides a portable and maintainable means to use threads and also provides a mechanism to automatically load balance the simulation models.
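
    One common way to automate such load balancing is a greedy longest-processing-time assignment of models to threads; a minimal sketch with hypothetical per-model costs:

    ```python
    # Greedy load balancing sketch: assign simulation models to threads so
    # that per-thread cost is roughly equal. Model costs are hypothetical.
    import heapq

    def balance(model_costs, n_threads):
        """Longest-processing-time-first assignment of models to threads."""
        heap = [(0.0, t, []) for t in range(n_threads)]  # (load, id, models)
        heapq.heapify(heap)
        for model, cost in sorted(model_costs.items(), key=lambda kv: -kv[1]):
            load, t, models = heapq.heappop(heap)        # least-loaded thread
            models.append(model)
            heapq.heappush(heap, (load + cost, t, models))
        return sorted(heap, key=lambda e: e[1])

    costs = {'aero': 4.0, 'engines': 3.0, 'gear': 1.0, 'avionics': 2.0, 'fcs': 2.5}
    for load, t, models in balance(costs, 2):
        print(f"thread {t}: {models} (load {load})")
    ```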

  6. A multi-channel instrumentation system for biosignal recording.

    PubMed

    Yu, Hong; Li, Pengfei; Xiao, Zhiming; Peng, Chung-Ching; Bashirullah, Rizwan

    2008-01-01

    This paper reports a highly integrated, battery-operated, multi-channel instrumentation system intended for physiological signal recording. The mixed-signal IC has been fabricated in a standard 0.5-μm 5-V 3M-2P CMOS process and features 32 instrumentation amplifiers, four 8-bit SAR ADCs, a wireless power interface with Li-ion battery charger, low-power bidirectional telemetry, and an FSM controller with power-gating control for improved energy efficiency. The chip measures 3.2 mm by 4.8 mm and dissipates approximately 2.1 mW when fully operational.

  7. An Energy-Aware Runtime Management of Multi-Core Sensory Swarms.

    PubMed

    Kim, Sungchan; Yang, Hoeseok

    2017-08-24

    In sensory swarms, minimizing energy consumption under a performance constraint is one of the key objectives. One possible approach to this problem is to monitor the application workload, which is subject to change at runtime, and to adjust the system configuration adaptively to satisfy the performance goal. As today's sensory swarms are usually implemented using multi-core processors with adjustable clock frequency, we propose to monitor the CPU workload periodically and adjust the task-to-core allocation or clock frequency in an energy-efficient way in response to workload variations. In doing so, we present an online heuristic that determines the most energy-efficient adjustment that satisfies the performance requirement. The proposed method is based on a simple yet effective energy model built upon performance prediction using IPC (instructions per cycle) measured online and a power equation derived empirically. The use of IPC accounts for the memory intensity of a given workload, enabling accurate prediction of execution time. Hence, the model allows us to rapidly and accurately estimate the effect of the two control knobs, clock frequency adjustment and core allocation. The experiments show that the proposed technique delivers considerable energy savings of up to 45% compared to the state-of-the-art multi-core energy management technique.
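
    A simplified sketch of this style of model, with all constants illustrative rather than taken from the paper: predict execution time from measured IPC, then choose the lowest-energy frequency that still meets the deadline:

    ```python
    # Sketch of IPC-based frequency selection: predict execution time from
    # measured IPC, then choose the lowest-energy feasible frequency.
    # All constants are illustrative, not the paper's model.

    def exec_time(instructions, ipc, freq_hz):
        """Predicted time: cycles needed / clock rate."""
        return instructions / ipc / freq_hz

    def power(freq_hz):
        """Toy dynamic-power model: P ~ f * V^2, with V scaling with f."""
        v = freq_hz / 2.0e9          # normalized supply voltage (illustrative)
        return 1e-9 * freq_hz * v * v

    def pick_frequency(instructions, ipc, deadline_s, freqs_hz):
        best = None
        for f in sorted(freqs_hz):
            t = exec_time(instructions, ipc, f)
            if t <= deadline_s:      # frequency meets the performance goal
                energy = power(f) * t
                if best is None or energy < best[1]:
                    best = (f, energy)
        return best

    print(pick_frequency(2e9, ipc=1.6, deadline_s=1.0,
                         freqs_hz=[0.8e9, 1.2e9, 1.6e9, 2.0e9]))
    ```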

  8. An Energy-Aware Runtime Management of Multi-Core Sensory Swarms

    PubMed Central

    Kim, Sungchan

    2017-01-01

    In sensory swarms, minimizing energy consumption under a performance constraint is one of the key objectives. One possible approach to this problem is to monitor the application workload, which is subject to change at runtime, and to adjust the system configuration adaptively to satisfy the performance goal. As today’s sensory swarms are usually implemented using multi-core processors with adjustable clock frequency, we propose to monitor the CPU workload periodically and adjust the task-to-core allocation or clock frequency in an energy-efficient way in response to workload variations. In doing so, we present an online heuristic that determines the most energy-efficient adjustment that satisfies the performance requirement. The proposed method is based on a simple yet effective energy model built upon performance prediction using IPC (instructions per cycle) measured online and a power equation derived empirically. The use of IPC accounts for the memory intensity of a given workload, enabling accurate prediction of execution time. Hence, the model allows us to rapidly and accurately estimate the effect of the two control knobs, clock frequency adjustment and core allocation. The experiments show that the proposed technique delivers considerable energy savings of up to 45% compared to the state-of-the-art multi-core energy management technique. PMID:28837094

  9. Research on monitoring system of water resources in Shiyang River Basin based on Multi-agent

    NASA Astrophysics Data System (ADS)

    Zhao, T. H.; Yin, Z.; Song, Y. Z.

    2012-11-01

    The Shiyang River Basin is the most populous of the Hexi inland river basins in Gansu Province, with a relatively developed economy, the highest degree of water resources development and utilization, the most prominent water conflicts, and some of the worst ecological and environmental problems; the contradiction between people and water in the basin is constantly worsening. This paper combines multi-agent technology with water resources monitoring, establishing a monitoring system for the Shiyang River Basin composed of a management center, a telemetry agent federation, and the communication network connecting them. The system takes advantage of multi-agent intelligence and communication coordination to improve the timeliness of water resources monitoring in the basin.

  10. iTAG: Integrating a Cloud Based, Collaborative Animal Tracking Network into the GCOOS data portal in the Gulf of Mexico

    NASA Astrophysics Data System (ADS)

    Kirkpatrick, B. A.; Currier, R. D.; Simoniello, C.

    2016-02-01

    The tagging and tracking of aquatic animals using acoustic telemetry hardware has traditionally been the purview of individual researchers who specialize in single species. Concerns over data privacy and unauthorized use of receiver arrays have prevented the construction of large-scale, multi-species, multi-institution, multi-researcher collaborative acoustic arrays. We have developed a toolset to build the new portal using the Flask microframework, the Python language, and Twitter Bootstrap. Initial feedback has been overwhelmingly positive. The privacy policy has been praised for its granularity: principal investigators can choose among three levels of privacy for all data and hardware: completely private (viewable only by the PI), visible to iTAG members, or visible to the general public. At the time of this writing iTAG is still in the beta stage, but the feedback received to date indicates that with the proper design and security features, and an iterative cycle of feedback from potential members, constructing a collaborative acoustic tracking network system is possible. Initial usage will be limited to the entry of and searching for 'orphan/mystery' tags, with the integration of historical array deployments and data following shortly thereafter. We have also been working with staff from the Ocean Tracking Network to allow for integration of the two systems. The database schema of iTAG is based on the marine metadata convention for acoustic telemetry, which should permit machine-to-machine data exchange between iTAG and OTN. The integration of animal telemetry data into the GCOOS portal will allow researchers to easily access physicochemical oceanography data, allowing for a more in-depth understanding of animal response and usage patterns.

  11. Clustered Multi-Task Learning for Automatic Radar Target Recognition

    PubMed Central

    Li, Cong; Bao, Weimin; Xu, Luping; Zhang, Hua

    2017-01-01

    Model training is a key technique for radar target recognition. Traditional model training algorithms in the framework of single-task learning ignore the relationships among multiple tasks, which degrades recognition performance. In this paper, we propose a clustered multi-task learning, which can reveal and share the multi-task relationships for radar target recognition. To further make full use of these relationships, the latent multi-task relationships in the projection space are taken into consideration. Specifically, a constraint term in the projection space is proposed, the main idea of which is that multiple tasks within a close cluster should be close to each other in the projection space. In the proposed method, the cluster structures and multi-task relationships can be autonomously learned and utilized in both the original and projected spaces. In view of the nonlinear characteristics of radar targets, the proposed method is extended to a non-linear kernel version and the corresponding non-linear multi-task solving method is proposed. Comprehensive experimental studies on a simulated high-resolution range profile dataset and the MSTAR SAR public database verify the superiority of the proposed method to some related algorithms. PMID:28953267

  12. Electro-Optic Computing Architectures: Volume II. Components and System Design and Analysis

    DTIC Science & Technology

    1998-02-01

    The objective of the Electro-Optic Computing Architecture (EOCA) program was to develop multi-function electro-optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro-optic computing architectures...Specifically, three multi-function interface modules were targeted for development: an Electro-Optic Interface (EOI), an Optical Interconnection Unit

  13. Multi-Core Processors: An Enabling Technology for Embedded Distributed Model-Based Control (Postprint)

    DTIC Science & Technology

    2008-07-01

    generation of process partitioning, thread pipelining becomes possible. In this paper we briefly summarize the requirements and trends for FADEC-based... FADEC environment, presenting a hypothetical realization of an example application. Finally we discuss the application of Time-Triggered...based control applications of the future. Subject terms: gas turbine, FADEC, multi-core processing technology, distributed model-based control.

  14. FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.

    PubMed

    Zierke, Stephanie; Bakos, Jason D

    2010-04-12

    Maximum likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processing Units (GPUs).
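
    The PLF kernel itself is compact: per site and per state, the conditional likelihood of a parent node is the product over its children of a matrix-vector-style sum. A reference sketch with a toy 4-state DNA model and illustrative transition matrices:

    ```python
    # Phylogenetic likelihood function kernel sketch: the conditional
    # likelihood of a parent node is, per state i, the product over children
    # of sum_j P[i][j] * L_child[j]. Matrices here are illustrative.

    def node_likelihood(P_left, L_left, P_right, L_right):
        n = len(L_left)
        return [
            sum(P_left[i][j] * L_left[j] for j in range(n)) *
            sum(P_right[i][j] * L_right[j] for j in range(n))
            for i in range(n)   # no dependencies between iterations
        ]

    # Toy transition matrix (rows sum to 1) and tip likelihoods for A,C,G,T.
    P = [[0.85, 0.05, 0.05, 0.05],
         [0.05, 0.85, 0.05, 0.05],
         [0.05, 0.05, 0.85, 0.05],
         [0.05, 0.05, 0.05, 0.85]]
    L_a = [1.0, 0.0, 0.0, 0.0]   # left tip observed 'A'
    L_c = [0.0, 1.0, 0.0, 0.0]   # right tip observed 'C'
    print(node_likelihood(P, L_a, P, L_c))
    ```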

  15. Multi-fuel reformers for fuel cells used in transportation. Phase 1: Multi-fuel reformers

    NASA Astrophysics Data System (ADS)

    1994-05-01

    DOE has established the goal, through the Fuel Cells in Transportation Program, of fostering the rapid development and commercialization of fuel cells as economic competitors for the internal combustion engine. Central to this goal is a safe feasible means of supplying hydrogen of the required purity to the vehicular fuel cell system. Two basic strategies are being considered: (1) on-board fuel processing whereby alternative fuels such as methanol, ethanol or natural gas stored on the vehicle undergo reformation and subsequent processing to produce hydrogen, and (2) on-board storage of pure hydrogen provided by stationary fuel processing plants. This report analyzes fuel processor technologies, types of fuel and fuel cell options for on-board reformation. As the Phase 1 of a multi-phased program to develop a prototype multi-fuel reformer system for a fuel cell powered vehicle, the objective of this program was to evaluate the feasibility of a multi-fuel reformer concept and to select a reforming technology for further development in the Phase 2 program, with the ultimate goal of integration with a DOE-designated fuel cell and vehicle configuration. The basic reformer processes examined in this study included catalytic steam reforming (SR), non-catalytic partial oxidation (POX) and catalytic partial oxidation (also known as Autothermal Reforming, or ATR). Fuels under consideration in this study included methanol, ethanol, and natural gas. A systematic evaluation of reforming technologies, fuels, and transportation fuel cell applications was conducted for the purpose of selecting a suitable multi-fuel processor for further development and demonstration in a transportation application.

  16. Fault-Tolerant, Real-Time, Multi-Core Computer System

    NASA Technical Reports Server (NTRS)

    Gostelow, Kim P.

    2012-01-01

    A document discusses a fault-tolerant, self-aware, low-power, multi-core computer for space missions with thousands of simple cores, achieving speed through concurrency. The proposed machine decides how to achieve concurrency in real time, rather than depending on programmers. The driving features of the system are simple hardware that is modular in the extreme, with no shared memory, and software with significant runtime reorganizing capability. The document describes a mechanism for moving ongoing computations and data that is based on a functional model of execution. Because there is no shared memory, the processor connects to its neighbors through a high-speed data link. Messages are sent to a neighbor switch, which in turn forwards that message on to its neighbor until reaching the intended destination. Except for the neighbor connections, processors are isolated and independent of each other. The processors on the periphery also connect chip-to-chip, thus building up a large processor net. There is no particular topology to the larger net, as a function at each processor allows it to forward a message in the correct direction. Some chip-to-chip connections are not necessarily nearest neighbors, providing short cuts for some of the longer physical distances. The peripheral processors also provide the connections to sensors, actuators, radios, science instruments, and other devices with which the computer system interacts.
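
    A toy sketch of the neighbor-forwarding idea is shown below; the 2-D grid coordinates and the greedy routing rule are assumptions, since the document states the larger net has no particular topology:

        def next_hop(current, destination):
            """Greedy neighbor routing on an assumed 2-D grid of cores:
            step one unit toward the destination along an axis that still
            differs. Placeholder for the per-processor forwarding function."""
            cx, cy = current
            dx, dy = destination
            if cx != dx:
                return (cx + (1 if dx > cx else -1), cy)
            if cy != dy:
                return (cx, cy + (1 if dy > cy else -1))
            return current  # already at the destination

        hop, dest = (0, 0), (3, 2)
        while hop != dest:
            hop = next_hop(hop, dest)
            print("forward to", hop)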

  17. The Minerva Multi-Microprocessor.

    DTIC Science & Technology

    A multiprocessor system is described which is an experiment in low cost, extensible, multiprocessor architectures. Global issues such as inclusion of a central bus, design of the bus arbiter, and methods of interrupt handling are considered. The system initially includes two processor types, based on microprocessors, and these are discussed. Methods for reducing processor demand for the central bus are described.

  18. Evolutionary Telemetry and Command Processor (TCP) architecture

    NASA Technical Reports Server (NTRS)

    Schneider, John R.

    1992-01-01

    A low cost, modular, high performance, and compact Telemetry and Command Processor (TCP) is being built as the foundation of command and data handling subsystems for the next generation of satellites. The TCP product line will support command and telemetry requirements for small to large spacecraft and from low to high rate data transmission. It is compatible with the latest TDRSS, STDN and SGLS transponders and provides CCSDS protocol communications in addition to standard TDM formats. Its high performance computer provides computing resources for hosted flight software. Layered and modular software provides common services using standardized interfaces to applications, thereby enhancing software re-use, transportability, and interoperability. The TCP architecture is based on existing standards, distributed networking, distributed and open system computing, and packet technology. The first TCP application is planned for the '94 SDIO SPAS 3 mission. The architecture supports rapid tailoring of functions, thereby reducing the development costs and schedules for individual spacecraft missions.

  19. Multiple core computer processor with globally-accessible local memories

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shalf, John; Donofrio, David; Oliker, Leonid

    A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores.

  20. Biphasic responses in multi-site phosphorylation systems.

    PubMed

    Suwanmajo, Thapanar; Krishnan, J

    2013-12-06

    Multi-site phosphorylation systems are repeatedly encountered in cellular biology and multi-site modification is a basic building block of post-translational modification. In this paper, we demonstrate how distributive multi-site modification mechanisms by a single kinase/phosphatase pair can lead to biphasic/partial biphasic dose-response characteristics for the maximally phosphorylated substrate at steady state. We use simulations and analysis to uncover a hidden competing effect which is responsible for this and analyse how it may be accentuated. We build on this to analyse different variants of multi-site phosphorylation mechanisms showing that some mechanisms are intrinsically not capable of displaying this behaviour. This provides both a consolidated understanding of how and under what conditions biphasic responses are obtained in multi-site phosphorylation and a basis for discriminating between different mechanisms based on this. We also demonstrate how this behaviour may be combined with other behaviour such as threshold and bistable responses, demonstrating the capacity of multi-site phosphorylation systems to act as complex molecular signal processors.
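
    A heavily simplified mass-action sketch of a distributive two-site cycle is shown below; the first-order (non-saturating) kinetics and unit rate constants are assumptions, and this reduced form deliberately omits the enzyme sequestration that the paper identifies as the hidden competing effect behind biphasic responses:

        import numpy as np
        from scipy.integrate import solve_ivp

        def two_site(t, y, kin, phos, k1=1.0, k2=1.0, p1=1.0, p2=1.0):
            """S0 <-> S1 <-> S2 driven by fixed kinase/phosphatase levels."""
            s0, s1, s2 = y
            ds0 = -k1 * kin * s0 + p1 * phos * s1
            ds2 = k2 * kin * s1 - p2 * phos * s2
            ds1 = -(ds0 + ds2)  # total substrate is conserved
            return [ds0, ds1, ds2]

        # Dose response: steady-state doubly phosphorylated fraction S2
        # versus kinase level (monotone in this reduced model; the full
        # enzymatic mechanism is needed for the biphasic shape analysed
        # in the paper).
        for kin in [0.1, 1.0, 10.0]:
            sol = solve_ivp(two_site, (0, 500), [1.0, 0.0, 0.0],
                            args=(kin, 1.0), rtol=1e-8)
            print(kin, sol.y[2, -1])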

  1. Examining the Impact of Off-Task Multi-Tasking with Technology on Real-Time Classroom Learning

    ERIC Educational Resources Information Center

    Wood, Eileen; Zivcakova, Lucia; Gentile, Petrice; Archer, Karin; De Pasquale, Domenica; Nosko, Amanda

    2012-01-01

    The purpose of the present study was to examine the impact of multi-tasking with digital technologies while attempting to learn from real-time classroom lectures in a university setting. Four digitally-based multi-tasking activities (texting using a cell-phone, emailing, MSN messaging and Facebook[TM]) were compared to 3 control groups…

  2. Real-time SHVC software decoding with multi-threaded parallel processing

    NASA Astrophysics Data System (ADS)

    Gudumasu, Srinivas; He, Yuwen; Ye, Yan; He, Yong; Ryu, Eun-Seok; Dong, Jie; Xiu, Xiaoyu

    2014-09-01

    This paper proposes a parallel decoding framework for scalable HEVC (SHVC). Various optimization technologies are implemented on the basis of SHVC reference software SHM-2.0 to achieve real-time decoding speed for the two layer spatial scalability configuration. SHVC decoder complexity is analyzed with profiling information. The decoding process at each layer and the up-sampling process are designed in parallel and scheduled by a high level application task manager. Within each layer, multi-threaded decoding is applied to accelerate the layer decoding speed. Entropy decoding, reconstruction, and in-loop processing are pipeline designed with multiple threads based on groups of coding tree units (CTU). A group of CTUs is treated as a processing unit in each pipeline stage to achieve a better trade-off between parallelism and synchronization. Motion compensation, inverse quantization, and inverse transform modules are further optimized with SSE4 SIMD instructions. Simulations on a desktop with an Intel i7-2600 processor running at 3.4 GHz show that the parallel SHVC software decoder is able to decode 1080p spatial 2x at up to 60 fps (frames per second) and 1080p spatial 1.5x at up to 50 fps for those bitstreams generated with SHVC common test conditions in the JCT-VC standardization group. The decoding performance at various bitrates with different optimization technologies and different numbers of threads is compared in terms of decoding speed and resource usage, including processor and memory.
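
    A minimal Python sketch of the CTU-group pipeline idea (three stages connected by queues, one thread per stage) follows; the stage functions are placeholders, not SHM-2.0 code:

        import queue
        import threading

        def stage(fn, q_in, q_out):
            """Pull CTU groups from q_in, process them, pass them downstream."""
            while True:
                ctu_group = q_in.get()
                if ctu_group is None:      # poison pill: shut down, propagate
                    if q_out is not None:
                        q_out.put(None)
                    break
                result = fn(ctu_group)
                if q_out is not None:
                    q_out.put(result)

        q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
        threads = [
            threading.Thread(target=stage, args=(lambda g: g, q1, q2)),  # entropy decoding
            threading.Thread(target=stage, args=(lambda g: g, q2, q3)),  # reconstruction
            threading.Thread(target=stage,
                             args=(lambda g: print("filtered CTU group", g), q3, None)),
        ]
        for t in threads:
            t.start()
        for group_id in range(8):          # one toy frame split into 8 CTU groups
            q1.put(group_id)
        q1.put(None)
        for t in threads:
            t.join()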

  3. NASA Tech Briefs, February 2011

    NASA Technical Reports Server (NTRS)

    2011-01-01

    Topics covered include: Multi-Segment Radius Measurement Using an Absolute Distance Meter Through a Null Assembly; Fiber-Optic Magnetic-Field-Strength Measurement System for Lightning Detection; Photocatalytic Active Radiation Measurements and Use; Computer Generated Hologram System for Wavefront Measurement System Calibration; Non-Contact Thermal Properties Measurement with Low-Power Laser and IR Camera System; SpaceCube 2.0: An Advanced Hybrid Onboard Data Processor; CMOS Imager Has Better Cross-Talk and Full-Well Performance; High-Performance Wireless Telemetry; Telemetry-Based Ranging; JWST Wavefront Control Toolbox; Java Image I/O for VICAR, PDS, and ISIS; X-Band Acquisition Aid Software; Antimicrobial-Coated Granules for Disinfecting Water; Range 7 Scanner Integration with PaR Robot Scanning System; Methods of Antimicrobial Coating of Diverse Materials; High-Operating-Temperature Barrier Infrared Detector with Tailorable Cutoff Wavelength; A Model of Reduced Kinetics for Alkane Oxidation Using Constituents and Species for N-Heptane; Thermally Conductive Tape Based on Carbon Nanotube Arrays; Two Catalysts for Selective Oxidation of Contaminant Gases; Nanoscale Metal Oxide Semiconductors for Gas Sensing; Lightweight, Ultra-High-Temperature, CMC-Lined Carbon/Carbon Structures; Sample Acquisition and Handling System from a Remote Platform; Improved Rare-Earth Emitter Hollow Cathode; High-Temperature Smart Structures for Engine Noise Reduction and Performance Enhancement; Cryogenic Scan Mechanism for Fourier Transform Spectrometer; Piezoelectric Rotary Tube Motor; Thermoelectric Energy Conversion Technology for High-Altitude Airships; Combustor Computations for CO2-Neutral Aviation; Use of Dynamic Distortion to Predict and Alleviate Loss of Control; Cycle Time Reduction in Trapped Mercury Ion Atomic Frequency Standards; and A (201)Hg+ Comagnetometer for (199)Hg+ Trapped Ion Space Atomic Clocks.

  4. Neural simulations on multi-core architectures.

    PubMed

    Eichner, Hubert; Klug, Tobias; Borst, Alexander

    2009-01-01

    Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i.e. user-transparent load balancing.

  5. Neural Simulations on Multi-Core Architectures

    PubMed Central

    Eichner, Hubert; Klug, Tobias; Borst, Alexander

    2009-01-01

    Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i.e. user-transparent load balancing. PMID:19636393

  6. USC orthogonal multiprocessor for image processing with neural networks

    NASA Astrophysics Data System (ADS)

    Hwang, Kai; Panda, Dhabaleswar K.; Haddadi, Navid

    1990-07-01

    This paper presents the architectural features and imaging applications of the Orthogonal MultiProcessor (OMP) system, which is under construction at the University of Southern California with research funding from NSF and assistance from several industrial partners. The prototype OMP is being built with 16 Intel i860 RISC microprocessors and 256 parallel memory modules using custom-designed spanning buses, which are 2-D interleaved and orthogonally accessed without conflicts. The 16-processor OMP prototype is targeted to achieve 430 MIPS and 600 Mflops, which have been verified by simulation experiments based on the design parameters used. The prototype OMP machine will be initially applied for image processing, computer vision, and neural network simulation applications. We summarize important vision and imaging algorithms that can be restructured with neural network models. These algorithms can efficiently run on the OMP hardware with linear speedup. The ultimate goal is to develop a high-performance Visual Computer (Viscom) for integrated low- and high-level image processing and vision tasks.

  7. Programming methodology for a general purpose automation controller

    NASA Technical Reports Server (NTRS)

    Sturzenbecker, M. C.; Korein, J. U.; Taylor, R. H.

    1987-01-01

    The General Purpose Automation Controller is a multi-processor architecture for automation programming. A methodology has been developed whose aim is to simplify the task of programming distributed real-time systems for users in research or manufacturing. Programs are built by configuring function blocks (low-level computations) into processes using data flow principles. These processes are activated through the verb mechanism. Verbs are divided into two classes: those which support devices, such as robot joint servos, and those which perform actions on devices, such as motion control. This programming methodology was developed in order to achieve the following goals: (1) specifications for real-time programs which are to a high degree independent of hardware considerations such as processor, bus, and interconnect technology; (2) a component approach to software, so that software required to support new devices and technologies can be integrated by reconfiguring existing building blocks; (3) resistance to error and ease of debugging; and (4) a powerful command language interface.
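
    A toy sketch of the function-block data-flow idea follows; the class and wiring are invented for illustration and do not reproduce the controller's verb mechanism or real-time scheduling:

        class FunctionBlock:
            """A low-level computation whose output feeds downstream blocks."""
            def __init__(self, fn):
                self.fn = fn
                self.downstream = []

            def connect(self, other):
                self.downstream.append(other)
                return other  # allow chained wiring

            def fire(self, value):
                out = self.fn(value)
                for blk in self.downstream:
                    blk.fire(out)

        sensor = FunctionBlock(lambda v: v * 0.5)       # e.g. scale a joint reading
        smooth = FunctionBlock(lambda v: round(v, 2))   # e.g. quantize/filter
        servo = FunctionBlock(lambda v: print("command", v))
        sensor.connect(smooth).connect(servo)
        sensor.fire(1.234)                              # prints: command 0.62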

  8. Multi-Core Programming Design Patterns: Stream Processing Algorithms for Dynamic Scene Perceptions

    DTIC Science & Technology

    2014-05-01

    processor developed by IBM and other companies, incorporates the POWER5 processor as the Power Processor Element (PPE), one of the early general...deliver a power-efficient single-precision peak performance of more than 256 GFlops. Substantially more raw power became available later, when nVIDIA ...algorithms, including IBM's Cell/B.E., GPUs from NVidia and AMD, and many-core CPUs from Intel. The vast growth of digital video content has been a

  9. Project Report: Automatic Sequence Processor Software Analysis

    NASA Technical Reports Server (NTRS)

    Benjamin, Brandon

    2011-01-01

    The Mission Planning and Sequencing (MPS) element of Multi-Mission Ground System and Services (MGSS) provides space missions with multi-purpose software to plan spacecraft activities, sequence spacecraft commands, and then integrate these products and execute them on spacecraft. The Jet Propulsion Laboratory (JPL) is currently flying many missions. The processes for building, integrating, and testing the multi-mission uplink software need to be improved to meet the needs of the missions and the operations teams that command the spacecraft. The Multi-Mission Sequencing Team is responsible for collecting and processing the observations, experiments and engineering activities that are to be performed on a selected spacecraft. The collection of these activities is called a sequence, and ultimately a sequence becomes a sequence of spacecraft commands. The operations teams check the sequence to make sure that no constraints are violated. The workflow process involves sending a program start command, which activates the Automatic Sequence Processor (ASP). The ASP is currently a file-based system comprising scripts written in Perl, C shell and awk. Once this start process is complete, the system checks for errors and aborts if there are any; otherwise the system converts the commands to binary, and then sends the resultant information to be radiated to the spacecraft.

  10. A Comparison of Two Methods Used for Ranking Task Exposure Levels Using Simulated Multi-Task Data

    DTIC Science & Technology

    1999-12-17

    [Front-matter residue: University of Oklahoma Health Sciences Center Graduate College thesis by Costantino (Oklahoma City, Oklahoma, 1999), "A Comparison of Two Methods Used for Ranking Task Exposure Levels Using Simulated Multi-Task Data"; the scanned table of contents lists Methods and Materials, Results, Discussion and Conclusion, List of References, and Appendices A and B.]

  11. Multi-purpose ECG telemetry system.

    PubMed

    Marouf, Mohamed; Vukomanovic, Goran; Saranovac, Lazar; Bozic, Miroslav

    2017-06-19

    The electrocardiogram (ECG) is one of the most important non-invasive tools for diagnosing cardiac diseases. Taking advantage of the developed telecommunication infrastructure, several approaches to telemetry cardiac devices have been introduced recently. Telemetry ECG devices allow easy and fast ECG monitoring of patients with suspected cardiac issues. Choosing the right device with the desired working mode, signal quality, and device cost are still the main obstacles to massive adoption of these devices. In this paper, we introduce the design, implementation, and validation of a multi-purpose telemetry system for recording, transmission, and interpretation of ECG signals in different recording modes. The system consists of an ECG device, a cloud-based analysis pipeline, and accompanying mobile applications for physicians and patients. The proposed ECG device's mechanical design allows laypersons to easily record post-event short-term ECG signals, using dry electrodes without any preparation. Moreover, patients can use the device to record long-term signals in loop and Holter modes, using wet electrodes. In order to overcome the problem of signal-quality fluctuation due to different electrode types and different placements on the subject's chest, a customized ECG signal processing and interpretation pipeline is presented for each working mode. We present the evaluation of the novel short-term recorder design. ECG signals were recorded from 391 patients using a gold-standard 12-lead ECG and the proposed patient-activated short-term post-event recorder. In the validation phase, a sample of validation signals underwent peer review in which two experts annotated the signals in terms of acceptability for diagnosis. We found that 96% of the signals allow detection of arrhythmia and other abnormal signal changes. Additionally, we compared the correlation coefficients and the automatic QRS delineation results of the short-term post-event recorder and the gold-standard 12-lead ECG recorder. The proposed multi-purpose ECG device allows physicians to choose the working mode of the same device according to patient status, and it was designed so that patients can manage the technical requirements of both working modes. Post-event short-term ECG recording using the proposed design provides physicians with three reliable ECG leads with direct symptom-rhythm correlation.

  12. A Modular Pipelined Processor for High Resolution Gamma-Ray Spectroscopy

    NASA Astrophysics Data System (ADS)

    Veiga, Alejandro; Grunfeld, Christian

    2016-02-01

    The design of a digital signal processor for gamma-ray applications is presented in which a single ADC input can simultaneously provide temporal and energy characterization of gamma radiation for a wide range of applications. Applying pipelining techniques, the processor is able to manage and synchronize very large volumes of streamed real-time data. Its modular user interface provides a flexible environment for experimental design. The processor can fit in a medium-sized FPGA device operating at ADC sampling frequency, providing an efficient solution for multi-channel applications. Two experiments are presented in order to characterize its temporal and energy resolution.

  13. Multi-task feature selection in microarray data by binary integer programming.

    PubMed

    Lan, Liang; Vucetic, Slobodan

    2013-12-20

    A major challenge in microarray classification is that the number of features is typically orders of magnitude larger than the number of examples. In this paper, we propose a novel feature filter algorithm to select the feature subset with maximal discriminative power and minimal redundancy by solving a quadratic objective function with binary integer constraints. To improve the computational efficiency, the binary integer constraints are relaxed and a low-rank approximation to the quadratic term is applied. The proposed feature selection algorithm was extended to solve multi-task microarray classification problems. We compared the single-task version of the proposed feature selection algorithm with 9 existing feature selection methods on 4 benchmark microarray data sets. The empirical results show that the proposed method achieved the most accurate predictions overall. We also evaluated the multi-task version of the proposed algorithm on 8 multi-task microarray datasets. The multi-task feature selection algorithm resulted in significantly higher accuracy than when using the single-task feature selection methods.
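
    A small NumPy sketch of the relaxation idea follows; the projected-gradient solver, the box relaxation, and the top-k rounding are generic stand-ins, not the paper's exact formulation:

        import numpy as np

        def relaxed_feature_select(relevance, redundancy, k, steps=500, lr=0.01):
            """Maximize per-feature relevance minus pairwise redundancy with
            binary indicators relaxed to [0, 1]; round back to k features."""
            x = np.full(relevance.size, 0.5)
            for _ in range(steps):
                grad = relevance - redundancy @ x     # ascent direction
                x = np.clip(x + lr * grad, 0.0, 1.0)  # project onto the box
            return np.argsort(-x)[:k]                 # keep the k best scores

        rng = np.random.default_rng(2)
        rel = rng.random(20)
        red = rng.random((20, 20))
        red = (red + red.T) / 2                       # symmetric redundancy
        print(relaxed_feature_select(rel, red, k=5))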

  14. GPU accelerated dynamic functional connectivity analysis for functional MRI data.

    PubMed

    Akgün, Devrim; Sakoğlu, Ünal; Esquivel, Johnny; Adinoff, Bryon; Mete, Mutlu

    2015-07-01

    Recent advances in multi-core processors and graphics card based computational technologies have paved the way for an improved and dynamic utilization of parallel computing techniques. Numerous applications have been implemented for the acceleration of computationally-intensive problems in various computational science fields including bioinformatics, in which big data problems are prevalent. In neuroimaging, dynamic functional connectivity (DFC) analysis is a computationally demanding method used to investigate dynamic functional interactions among different brain regions or networks identified with functional magnetic resonance imaging (fMRI) data. In this study, we implemented and analyzed a parallel DFC algorithm based on thread-based and block-based approaches. The thread-based approach was designed to parallelize DFC computations and was implemented in both Open Multi-Processing (OpenMP) and Compute Unified Device Architecture (CUDA) programming platforms. Another approach developed in this study to better utilize CUDA architecture is the block-based approach, where parallelization involves smaller parts of fMRI time-courses obtained by sliding-windows. Experimental results showed that the proposed parallel design solutions enabled by the GPUs significantly reduce the computation time for DFC analysis. Multicore implementation using OpenMP on 8-core processor provides up to 7.7× speed-up. GPU implementation using CUDA yielded substantial accelerations ranging from 18.5× to 157× speed-up once thread-based and block-based approaches were combined in the analysis. Proposed parallel programming solutions showed that multi-core processor and CUDA-supported GPU implementations accelerated the DFC analyses significantly. Developed algorithms make the DFC analyses more practical for multi-subject studies with more dynamic analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.
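
    A compact NumPy sketch of the sliding-window DFC computation follows; the window length and step are analysis choices rather than values from the paper, and the OpenMP/CUDA parallelization itself is not shown:

        import numpy as np

        def sliding_window_dfc(ts, window, step=1):
            """Dynamic functional connectivity by sliding-window correlation.
            ts: (n_timepoints, n_regions) fMRI time courses. Returns one
            region-by-region correlation matrix per window position."""
            n_t, _ = ts.shape
            starts = range(0, n_t - window + 1, step)
            return np.stack([np.corrcoef(ts[s:s + window].T) for s in starts])

        rng = np.random.default_rng(3)
        data = rng.standard_normal((200, 10))  # toy 10-region scan
        dfc = sliding_window_dfc(data, window=30, step=5)
        print(dfc.shape)                       # (35, 10, 10)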

  15. Coding for parallel execution of hardware-in-the-loop millimeter-wave scene generation models on multicore SIMD processor architectures

    NASA Astrophysics Data System (ADS)

    Olson, Richard F.

    2013-05-01

    Rendering of point scatterer based radar scenes for millimeter wave (mmW) seeker tests in real-time hardware-in-the-loop (HWIL) scene generation requires efficient algorithms and vector-friendly computer architectures for complex signal synthesis. New processor technology from Intel implements an extended 256-bit vector SIMD instruction set (AVX, AVX2) in a multi-core CPU design providing peak execution rates of hundreds of GigaFLOPS (GFLOPS) on one chip. Real world mmW scene generation code can approach peak SIMD execution rates only after careful algorithm and source code design. An effective software design will maintain high computing intensity emphasizing register-to-register SIMD arithmetic operations over data movement between CPU caches or off-chip memories. Engineers at the U.S. Army Aviation and Missile Research, Development and Engineering Center (AMRDEC) applied two basic parallel coding methods to assess new 256-bit SIMD multi-core architectures for mmW scene generation in HWIL. These include use of POSIX threads built on vector library functions and more portable, highlevel parallel code based on compiler technology (e.g. OpenMP pragmas and SIMD autovectorization). Since CPU technology is rapidly advancing toward high processor core counts and TeraFLOPS peak SIMD execution rates, it is imperative that coding methods be identified which produce efficient and maintainable parallel code. This paper describes the algorithms used in point scatterer target model rendering, the parallelization of those algorithms, and the execution performance achieved on an AVX multi-core machine using the two basic parallel coding methods. The paper concludes with estimates for scale-up performance on upcoming multi-core technology.

  16. Processor Units Reduce Satellite Construction Costs

    NASA Technical Reports Server (NTRS)

    2014-01-01

    As part of the effort to build the Fast Affordable Science and Technology Satellite (FASTSAT), Marshall Space Flight Center developed a low-cost telemetry unit which is used to facilitate communication between a satellite and its receiving station. Huntsville, Alabama-based Orbital Telemetry Inc. has licensed the NASA technology and is offering to install the cost-cutting units on commercial satellites.

  17. Ultra-Compact Transputer-Based Controller for High-Level, Multi-Axis Coordination

    NASA Technical Reports Server (NTRS)

    Zenowich, Brian; Crowell, Adam; Townsend, William T.

    2013-01-01

    The design of machines that rely on arrays of servomotors such as robotic arms, orbital platforms, and combinations of both, imposes a heavy computational burden to coordinate their actions to perform coherent tasks. For example, the robotic equivalent of a person tracing a straight line in space requires enormously complex kinematics calculations, and complexity increases with the number of servo nodes. A new high-level architecture for coordinated servo-machine control enables a practical, distributed transputer alternative to conventional central processor electronics. The solution is inherently scalable, dramatically reduces bulkiness and number of conductor runs throughout the machine, requires only a fraction of the power, and is designed for cooling in a vacuum.

  18. Vectorization for Molecular Dynamics on Intel Xeon Phi Coprocessors

    NASA Astrophysics Data System (ADS)

    Yi, Hongsuk

    2014-03-01

    Many modern processors are capable of exploiting data-level parallelism through the use of single instruction multiple data (SIMD) execution. The new Intel Xeon Phi coprocessor supports 512-bit vector registers for high-performance computing. In this paper, we have developed a hierarchical parallelization scheme for accelerated molecular dynamics simulations with the Tersoff potential for covalently bonded solid crystals on Intel Xeon Phi coprocessor systems. The scheme exploits multiple levels of parallelism, combining tightly coupled thread-level and task-level parallelism with the 512-bit vector registers. The simulation results show that the parallel performance of the SIMD implementation on the Xeon Phi is clearly superior to that of the x86 CPU architecture.
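
    As a rough analogy to that SIMD style, expressed as NumPy array operations rather than 512-bit intrinsics, a vectorized pair-energy evaluation might look like the following; a simple Lennard-Jones form stands in for the three-body Tersoff potential:

        import numpy as np

        def pair_energy_vectorized(pos, cutoff=2.5):
            """All pair distances computed as one array operation -- the
            data-level parallelism a wide SIMD unit exploits."""
            d = pos[:, None, :] - pos[None, :, :]
            r2 = (d ** 2).sum(-1)
            iu = np.triu_indices(len(pos), k=1)   # each pair counted once
            r2 = r2[iu]
            r2 = r2[r2 < cutoff ** 2]             # apply the cutoff
            inv6 = 1.0 / r2 ** 3
            return float(np.sum(4.0 * (inv6 ** 2 - inv6)))

        pos = np.random.default_rng(7).random((128, 3)) * 10.0
        print(pair_energy_vectorized(pos))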

  19. Binaural sensitivity in children who use bilateral cochlear implants.

    PubMed

    Ehlers, Erica; Goupell, Matthew J; Zheng, Yi; Godar, Shelly P; Litovsky, Ruth Y

    2017-06-01

    Children who are deaf and receive bilateral cochlear implants (BiCIs) perform better on spatial hearing tasks using bilateral rather than unilateral inputs; however, they underperform relative to normal-hearing (NH) peers. This gap in performance is multi-factorial, including the inability of speech processors to reliably deliver binaural cues. Although much is known regarding the binaural sensitivity of adults with BiCIs, less is known about how binaural sensitivity develops in children with BiCIs compared to NH children. Sixteen children (ages 9-17 years) were tested using synchronized research processors. Interaural time differences and interaural level differences (ITDs and ILDs, respectively) were presented to pairs of pitch-matched electrodes. Stimuli were 300-ms, 100-pulses-per-second, constant-amplitude pulse trains. In the first and second experiments, discrimination of interaural cues (either ITDs or ILDs) was measured using a two-interval left/right task. In the third experiment, subjects reported the perceived intracranial position of ITDs and ILDs in a lateralization task. All children demonstrated sensitivity to ILDs, possibly due to monaural level cues. Children who were born deaf had weak or absent sensitivity to ITDs; in contrast, ITD sensitivity was noted in children with previous exposure to acoustic hearing. Therefore, factors such as auditory deprivation, in particular the lack of early exposure to consistent timing differences between the ears, may delay the maturation of binaural circuits and cause insensitivity to binaural differences.

  20. Multi-Task Convolutional Neural Network for Pose-Invariant Face Recognition

    NASA Astrophysics Data System (ADS)

    Yin, Xi; Liu, Xiaoming

    2018-02-01

    This paper explores multi-task learning (MTL) for face recognition. We answer the questions of how and why MTL can improve face recognition performance. First, we propose a multi-task Convolutional Neural Network (CNN) for face recognition, where identity classification is the main task and pose, illumination, and expression estimations are the side tasks. Second, we develop a dynamic-weighting scheme to automatically assign the loss weight to each side task, which is a crucial problem in MTL. Third, we propose a pose-directed multi-task CNN that groups different poses to learn pose-specific identity features, simultaneously across all poses. Last but not least, we propose an energy-based weight analysis method to explore how CNN-based MTL works. We observe that the side tasks serve as regularizations to disentangle the variations from the learnt identity features. Extensive experiments on the entire Multi-PIE dataset demonstrate the effectiveness of the proposed approach. To the best of our knowledge, this is the first work using all the data in Multi-PIE for face recognition. Our approach is also applicable to in-the-wild datasets for pose-invariant face recognition and achieves comparable or better performance than the state of the art on the LFW, CFP, and IJB-A datasets.
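
    A tiny sketch of the dynamic-weighting idea follows; the softmax weighting rule is an assumption for illustration, not the paper's exact scheme:

        import numpy as np

        def dynamic_weighted_loss(losses, temperature=1.0):
            """Combine a main-task loss with side-task losses whose weights
            are renormalized each step from the current loss magnitudes."""
            main, side = losses[0], np.asarray(losses[1:])
            w = np.exp(-side / temperature)
            w = w / w.sum()                 # side-task weights sum to one
            return main + float(w @ side)

        # Toy step: identity loss plus pose/illumination/expression losses.
        print(dynamic_weighted_loss([2.0, 0.5, 1.2, 0.8]))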

  1. Energy-aware Thread and Data Management in Heterogeneous Multi-core, Multi-memory Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Su, Chun-Yi

    By 2004, microprocessor design focused on multicore scaling—increasing the number of cores per die in each generation—as the primary strategy for improving performance. These multicore processors typically equip multiple memory subsystems to improve data throughput. In addition, these systems employ heterogeneous processors such as GPUs and heterogeneous memories like non-volatile memory to improve performance, capacity, and energy efficiency. With the increasing volume of hardware resources and system complexity caused by heterogeneity, future systems will require intelligent ways to manage hardware resources. Early research to improve performance and energy efficiency on heterogeneous, multi-core, multi-memory systems focused on tuning a single primitive or at best a few primitives in the systems. The key limitation of past efforts is their lack of a holistic approach to resource management that balances the tradeoff between performance and energy consumption. In addition, the shift from simple, homogeneous systems to these heterogeneous, multicore, multi-memory systems requires in-depth understanding of efficient resource management for scalable execution, including new models that capture the interchange between performance and energy, smarter resource management strategies, and novel low-level performance/energy tuning primitives and runtime systems. Tuning an application to control available resources efficiently has become a daunting challenge; managing resources in automation is still a dark art since the tradeoffs among programming, energy, and performance remain insufficiently understood. In this dissertation, I have developed theories, models, and resource management techniques to enable energy-efficient execution of parallel applications through thread and data management in these heterogeneous multi-core, multi-memory systems. I study the effect of dynamic concurrent throttling on the performance and energy of multi-core, non-uniform memory access (NUMA) systems. I use critical path analysis to quantify memory contention in the NUMA memory system and determine thread mappings. In addition, I implement a runtime system that combines concurrent throttling and a novel thread mapping algorithm to manage thread resources and improve energy efficient execution in multi-core, NUMA systems.

  2. A hybrid optic-fiber sensor network with the function of self-diagnosis and self-healing

    NASA Astrophysics Data System (ADS)

    Xu, Shibo; Liu, Tiegen; Ge, Chunfeng; Chen, Cheng; Zhang, Hongxia

    2014-11-01

    We develop a hybrid wavelength-division-multiplexing optical fiber network with distributed fiber-optic sensors and quasi-distributed FBG sensor arrays which detect vibrations, temperatures, and strains at the same time. The network has the ability to locate failure sites automatically (self-diagnosis) and to make protective switching to reestablish sensing service (self-healing) through the cooperative work of software and hardware. These processes are accomplished by master-slave processors with the help of optical and wireless telemetry signals. All the sensing and optical telemetry signals transmit in the same fiber, either the working fiber or the backup fiber. We take the 1450 nm wavelength as the downstream signal and the 1350 nm wavelength as the upstream signal to control the network in normal circumstances; both signals are sent by a light-emitting node of the corresponding processor. There is also a continuous 1310 nm laser sent by each node and received by the next node on both the working and backup fibers to monitor their health, but it does not carry any message as the telemetry signals do. When the fibers of two sensor units are completely damaged, the master processor will lose communication with the node between the damaged ones; however, we install an RF module in each node to solve this problem. Finally, the whole network state is transmitted to the host computer by the master processor. Operators can monitor and control the network through a human-machine interface if needed.

  3. Evaluation and Analysis of a Multi-Band Transceiver for Next Generation Telemetry Applications

    DTIC Science & Technology

    2014-06-01

    [Figure residue: block diagrams of a Kintex FPGA digital radio receiver and transmitter showing an ADC (Fs < 225 MSPS), a 400 MHz sampling clock, 36 MHz RF bandwidth, frequency translation via a mixed-mode clock manager (MMCM), digital down-conversion (DDC), band-selection and band-selective FIR filtering, and fine frequency translation.]

  4. Multi-Source Multi-Target Dictionary Learning for Prediction of Cognitive Decline.

    PubMed

    Zhang, Jie; Li, Qingyang; Caselli, Richard J; Thompson, Paul M; Ye, Jieping; Wang, Yalin

    2017-06-01

    Alzheimer's Disease (AD) is the most common type of dementia. Identifying correct biomarkers may determine pre-symptomatic AD subjects and enable early intervention. Recently, multi-task sparse feature learning has been successfully applied to many computer vision and biomedical informatics research problems. It aims to improve the generalization performance by exploiting the shared features among different tasks. However, most of the existing algorithms are formulated as a supervised learning scheme, whose drawback is either an insufficient number of features or missing label information. To address these challenges, we formulate an unsupervised framework for multi-task sparse feature learning based on a novel dictionary learning algorithm. To solve the unsupervised learning problem, we propose a two-stage Multi-Source Multi-Target Dictionary Learning (MMDL) algorithm. In stage 1, we propose a multi-source dictionary learning method to utilize the common and individual sparse features in different time slots. In stage 2, supported by a rigorous theoretical analysis, we develop a multi-task learning method to solve the missing label problem. Empirical studies on an N = 3970 longitudinal brain image data set, which involves 2 sources and 5 targets, demonstrate the improved prediction accuracy and speed efficiency of MMDL in comparison with other state-of-the-art algorithms.
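
    As background, a plain alternating dictionary-learning loop is sketched below as a generic stand-in for stage 1; MMDL's multi-source formulation and stage-2 label solver are not reproduced, and the ridge code update replaces a proper sparse solver for brevity:

        import numpy as np

        def dictionary_learning(X, n_atoms, iters=50, lam=0.1):
            """Alternate ridge sparse codes with a least-squares dictionary
            update, keeping atoms unit-norm. X: (n_features, n_samples)."""
            rng = np.random.default_rng(0)
            D = rng.standard_normal((X.shape[0], n_atoms))
            for _ in range(iters):
                A = np.linalg.solve(D.T @ D + lam * np.eye(n_atoms), D.T @ X)
                D = X @ A.T @ np.linalg.pinv(A @ A.T + 1e-8 * np.eye(n_atoms))
                D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
            return D, A

        X = np.random.default_rng(4).standard_normal((64, 500))
        D, A = dictionary_learning(X, n_atoms=16)
        print(D.shape, A.shape)   # (64, 16) (16, 500)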

  5. Final Report for Project DE-FC02-06ER25755 [Pmodels2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Panda, Dhabaleswar; Sadayappan, P.

    2014-03-12

    In this report, we describe the research accomplished by the OSU team under the Pmodels2 project. The team has worked on various angles: designing high performance MPI implementations on modern networking technologies (Mellanox InfiniBand (including the new ConnectX2 architecture and Quad Data Rate), QLogic InfiniPath, the emerging 10GigE/iWARP and RDMA over Converged Enhanced Ethernet (RoCE) and Obsidian IB-WAN), studying MPI scalability issues for multi-thousand node clusters using XRC transport, scalable job start-up, dynamic process management support, efficient one-sided communication, protocol offloading and designing scalable collective communication libraries for emerging multi-core architectures. New designs conforming to Argonne's Nemesis interface have also been carried out. All of these solutions have been integrated into the open-source MVAPICH/MVAPICH2 software. This software is currently being used by more than 2,100 organizations worldwide (in 71 countries). As of January '14, more than 200,000 downloads have taken place from the OSU Web site. In addition, many InfiniBand vendors, server vendors, system integrators and Linux distributors have been incorporating MVAPICH/MVAPICH2 into their software stacks and distributing it. Several InfiniBand systems using MVAPICH/MVAPICH2 have obtained positions in the TOP500 ranking of supercomputers in the world. The latest November '13 ranking includes the following systems: the 7th ranked Stampede system at TACC with 462,462 cores; the 11th ranked Tsubame 2.5 system at Tokyo Institute of Technology with 74,358 cores; and the 16th ranked Pleiades system at NASA with 81,920 cores. Work on PGAS models has proceeded in multiple directions. The Scioto framework, which supports task parallelism in one-sided and global-view parallel programming, has been extended to allow multi-processor tasks that are executed by processor groups. A quantum Monte Carlo application is being ported onto the extended Scioto framework. A public release of Global Trees (GT) has been made, along with the Global Chunks (GC) framework on which GT is built. The Global Chunks (GC) layer is also being used as the basis for the development of a higher-level Global Graphs (GG) layer. The Global Graphs (GG) system will provide a global address space view of distributed graph data structures on distributed memory systems.

  6. Parallel and fault-tolerant algorithms for hypercube multiprocessors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aykanat, C.

    1988-01-01

    Several techniques for increasing the performance of parallel algorithms on distributed-memory message-passing multi-processor systems are investigated. These techniques are effectively implemented for the parallelization of the Scaled Conjugate Gradient (SCG) algorithm on a hypercube-connected message-passing multi-processor. Significant performance improvement is achieved by using these techniques. The SCG algorithm is used for the solution phase of an FE modeling system. Almost linear speed-up is achieved, and it is shown that the hypercube topology is scalable for this class of FE problems. The SCG algorithm is also shown to be suitable for vectorization, and near-supercomputer performance is achieved on a vector hypercube multiprocessor by exploiting both parallelization and vectorization. Fault-tolerance issues for the parallel SCG algorithm and for the hypercube topology are also addressed.
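
    The serial kernel being parallelized is conjugate gradient; a textbook NumPy version is sketched below, while the scaling and hypercube message-passing details are beyond this sketch:

        import numpy as np

        def conjugate_gradient(A, b, tol=1e-8, max_iter=200):
            """Standard CG for a symmetric positive definite system."""
            x = np.zeros_like(b)
            r = b - A @ x
            p = r.copy()
            rs = r @ r
            for _ in range(max_iter):
                Ap = A @ p
                alpha = rs / (p @ Ap)
                x += alpha * p
                r -= alpha * Ap
                rs_new = r @ r
                if np.sqrt(rs_new) < tol:
                    break
                p = r + (rs_new / rs) * p
                rs = rs_new
            return x

        rng = np.random.default_rng(5)
        M = rng.standard_normal((50, 50))
        A = M @ M.T + 50 * np.eye(50)   # symmetric positive definite
        b = rng.standard_normal(50)
        print(np.allclose(A @ conjugate_gradient(A, b), b, atol=1e-5))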

  7. Performance Analysis of a Hybrid Overset Multi-Block Application on Multiple Architectures

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Biswas, Rupak

    2003-01-01

    This paper presents a detailed performance analysis of a multi-block overset grid computational fluid dynamics application on multiple state-of-the-art computer architectures. The application is implemented using a hybrid MPI+OpenMP programming paradigm that exploits both coarse and fine-grain parallelism; the former via MPI message passing and the latter via OpenMP directives. The hybrid model also extends the applicability of multi-block programs to large clusters of SMP nodes by overcoming the restriction that the number of processors be less than the number of grid blocks. A key kernel of the application, namely the LU-SGS linear solver, had to be modified to enhance the performance of the hybrid approach on the target machines. Investigations were conducted on cacheless Cray SX6 vector processors, cache-based IBM Power3 and Power4 architectures, and single-system-image SGI Origin3000 platforms. Overall results for complex vortex dynamics simulations demonstrate that the SX6 achieves the highest performance and outperforms the RISC-based architectures; however, the best scaling performance was achieved on the Power3.

  8. Parallel Monte Carlo transport modeling in the context of a time-dependent, three-dimensional multi-physics code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Procassini, R.J.

    1997-12-31

    The fine-scale, multi-space resolution that is envisioned for accurate simulations of complex weapons systems in three spatial dimensions implies flop-rate and memory-storage requirements that will only be obtained in the near future through the use of parallel computational techniques. Since the Monte Carlo transport models in these simulations usually stress both of these computational resources, they are prime candidates for parallelization. The MONACO Monte Carlo transport package, which is currently under development at LLNL, will utilize two types of parallelism within the context of a multi-physics design code: decomposition of the spatial domain across processors (spatial parallelism) and distribution of particles in a given spatial subdomain across additional processors (particle parallelism). This implementation of the package will utilize explicit data communication between domains (message passing). Such a parallel implementation of a Monte Carlo transport model will result in non-deterministic communication patterns. The communication of particles between subdomains during a Monte Carlo time step may require a significant level of effort to achieve a high parallel efficiency.
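
    A toy sketch of the spatial-parallelism pattern follows; particles random-walk in a 1-D subdomain and are buffered for neighboring domains when they cross a boundary, producing the non-deterministic communication the abstract describes. The physics and message passing are placeholders, not MONACO code:

        import random

        def transport_step(particles, domain_bounds):
            """Advance particles one step; bucket escapers for the neighbors."""
            lo, hi = domain_bounds
            stay, send_left, send_right = [], [], []
            for x in particles:
                x += random.gauss(0.0, 0.5)  # toy scattering kick
                if x < lo:
                    send_left.append(x)
                elif x >= hi:
                    send_right.append(x)
                else:
                    stay.append(x)
            return stay, send_left, send_right

        random.seed(0)
        particles = [random.uniform(0.0, 10.0) for _ in range(100)]
        stay, left, right = transport_step(particles, (0.0, 10.0))
        print(len(stay), len(left), len(right))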

  9. Robotic Attention Processing And Its Application To Visual Guidance

    NASA Astrophysics Data System (ADS)

    Barth, Matthew; Inoue, Hirochika

    1988-03-01

    This paper describes a method of real-time visual attention processing for robots performing visual guidance. This robot attention processing is based on a novel vision processor, the multi-window vision system that was developed at the University of Tokyo. The multi-window vision system is unique in that it only processes visual information inside local area windows. These local area windows are quite flexible in their ability to move anywhere on the visual screen, change their size and shape, and alter their pixel sampling rate. By using these windows for specific attention tasks, it is possible to perform high-speed attention processing. The primary attention skills of detecting motion, tracking an object, and interpreting an image are all performed at high speed on the multi-window vision system. A basic robotic attention scheme using these attention skills was developed. The attention skills involved detection and tracking of salient visual features. The tracking and motion information thus obtained was utilized in producing the response to the visual stimulus. The response of the attention scheme was quick enough to be applicable to the real-time vision processing tasks of playing a video 'pong' game, and later using an automobile driving simulator. By detecting the motion of a 'ball' on a video screen and then tracking the movement, the attention scheme was able to control a 'paddle' in order to keep the ball in play. The response was faster than a human's, allowing the attention scheme to play the video game at higher speeds. Further, in the application to the driving simulator, the attention scheme was able to control both direction and velocity of a simulated vehicle following a lead car. These two applications show the potential of local visual processing in its use for robotic attention processing.

  10. The Use Of Videography For Three-Dimensional Motion Analysis

    NASA Astrophysics Data System (ADS)

    Hawkins, D. A.; Hawthorne, D. L.; DeLozier, G. S.; Campbell, K. R.; Grabiner, M. D.

    1988-02-01

    Special video path editing capabilities, with custom hardware and software, have been developed for use in conjunction with existing video acquisition hardware and firmware. This system has simplified the task of quantifying the kinematics of human movement. A set of retro-reflective markers is secured to a subject performing a given task (e.g., walking, throwing, swinging a golf club). Multiple cameras, a video processor, and a computer workstation collect video data while the task is performed. Software has been developed to edit video files, create centroid data, and identify marker paths. Multi-camera path files are combined to form a 3D path file using the DLT method of cinematography. A separate program converts the 3D path file into kinematic data by creating a set of local coordinate axes and performing a series of coordinate transformations from one local system to the next. The kinematic data is then displayed for appropriate review and/or comparison.

  11. Speech understanding in noise with the Roger Pen, Naida CI Q70 processor, and integrated Roger 17 receiver in a multi-talker network.

    PubMed

    De Ceulaer, Geert; Bestel, Julie; Mülder, Hans E; Goldbeck, Felix; de Varebeke, Sebastien Pierre Janssens; Govaerts, Paul J

    2016-05-01

    Roger is a digital adaptive multi-channel remote microphone technology that wirelessly transmits a speaker's voice directly to a hearing instrument or cochlear implant sound processor. Frequency hopping between channels, in combination with repeated broadcast, avoids interference issues that have limited earlier generation FM systems. This study evaluated the benefit of the Roger Pen transmitter microphone in a multiple talker network (MTN) for cochlear implant users in a simulated noisy conversation setting. Twelve post-lingually deafened adult Advanced Bionics CII/HiRes 90K recipients were recruited. Subjects used a Naida CI Q70 processor with integrated Roger 17 receiver. The test environment simulated four people having a meal in a noisy restaurant, one the CI user (listener), and three companions (talkers) talking non-simultaneously in a diffuse field of multi-talker babble. Speech reception thresholds (SRTs) were determined without the Roger Pen, with one Roger Pen, and with three Roger Pens in an MTN. Using three Roger Pens in an MTN improved the SRT by 14.8 dB over using no Roger Pen, and by 13.1 dB over using a single Roger Pen (p < 0.0001). The Roger Pen in an MTN provided statistically and clinically significant improvement in speech perception in noise for Advanced Bionics cochlear implant recipients. The integrated Roger 17 receiver made it easy for users of the Naida CI Q70 processor to take advantage of the Roger system. The listening advantage and ease of use should encourage more clinicians to recommend and fit Roger in adult cochlear implant patients.

  12. Low Power, Low Mass, Modular, Multi-band Software-defined Radios

    NASA Technical Reports Server (NTRS)

    Haskins, Christopher B. (Inventor); Millard, Wesley P. (Inventor)

    2013-01-01

    Methods and systems to implement and operate software-defined radios (SDRs). An SDR may be configured to perform a combination of fractional and integer frequency synthesis and direct digital synthesis under control of a digital signal processor, which may provide a set of relatively agile, flexible, low-noise, and low spurious, timing and frequency conversion signals, and which may be used to maintain a transmit path coherent with a receive path. Frequency synthesis may include dithering to provide additional precision. The SDR may include task-specific software-configurable systems to perform tasks in accordance with software-defined parameters or personalities. The SDR may include a hardware interface system to control hardware components, and a host interface system to provide an interface to the SDR with respect to a host system. The SDR may be configured for one or more of communications, navigation, radio science, and sensors.
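
    A minimal sketch of the direct-digital-synthesis building block mentioned above follows; the 32-bit accumulator width and floating-point sine lookup are generic choices, not values from the patent:

        import numpy as np

        def dds(freq_hz, sample_rate_hz, n_samples, acc_bits=32):
            """Fixed-point phase accumulator DDS producing a sine tone."""
            fcw = int(freq_hz / sample_rate_hz * 2 ** acc_bits)  # frequency control word
            acc = np.cumsum(np.full(n_samples, fcw, dtype=np.uint64)) % (2 ** acc_bits)
            phase = acc.astype(np.float64) / 2 ** acc_bits * 2.0 * np.pi
            return np.sin(phase)

        tone = dds(freq_hz=1_000, sample_rate_hz=48_000, n_samples=480)
        print(tone[:4])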

  13. Designing minimal space telerobotics systems for maximum performance

    NASA Technical Reports Server (NTRS)

    Backes, Paul G.; Long, Mark K.; Steele, Robert D.

    1992-01-01

    The design of the remote site of a local-remote telerobot control system is described which addresses the constraints of limited computational power available at the remote site control system while providing a large range of control capabilities. The Modular Telerobot Task Execution System (MOTES) provides supervised autonomous control, shared control and teleoperation for a redundant manipulator. The system is capable of nominal task execution as well as monitoring and reflex motion. The MOTES system is minimized while providing a large capability by limiting its functionality to only that which is necessary at the remote site and by utilizing a unified multi-sensor based impedance control scheme. A command interpreter similar to one used on robotic spacecraft is used to interpret commands received from the local site. The system is written in Ada and runs in a VME environment on 68020 processors and initially controls a Robotics Research K1207 7 degree of freedom manipulator.

  14. Full-Authority Fault-Tolerant Electronic Engine Control System for Variable Cycle Engines.

    DTIC Science & Technology

    1982-04-01

    single internally self-checked VLSI microprocessor. The selected configuration is an externally checked pair of commercially available... [List-of-abbreviations residue: Electronic Engine Control; FPMH, Failures per Million Hours; FTMP, Fault Tolerant Multi-Processor; FTSC, Fault Tolerant Spaceborne Computer; GRAMP, Generalized...; MTBR, Mean Time Between Repair; MTTF, Mean Time to Failure; NH, High Pressure Rotor Speed; O&S, Operating...]

  15. Multi-task feature learning by using trace norm regularization

    NASA Astrophysics Data System (ADS)

    Jiangmei, Zhang; Binfeng, Yu; Haibo, Ji; Wang, Kunpeng

    2017-11-01

    Multi-task learning can exploit the correlations among multiple related machine learning problems to improve performance. This paper considers applying the multi-task learning method to learn a single task. We propose a new learning approach, which employs the mixture-of-experts model to divide a learning task into several related sub-tasks, and then uses trace norm regularization to extract a common feature representation of these sub-tasks. A nonlinear extension of this approach using kernels is also provided. Experiments conducted on both simulated and real data sets demonstrate the advantage of the proposed approach.
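
    The standard computational building block for trace-norm regularization is singular-value soft-thresholding, the proximal operator of the trace (nuclear) norm; a short NumPy sketch follows, without the paper's full solver or the mixture-of-experts partitioning:

        import numpy as np

        def prox_trace_norm(W, tau):
            """Soft-threshold the singular values of W by tau; this is what
            drives the task weight matrix toward a shared low-rank subspace."""
            U, s, Vt = np.linalg.svd(W, full_matrices=False)
            return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

        W = np.random.default_rng(6).standard_normal((10, 4))  # (features, sub-tasks)
        W_low_rank = prox_trace_norm(W, tau=1.0)
        print(np.linalg.matrix_rank(W_low_rank))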

  16. Multi-element germanium detectors for synchrotron applications

    NASA Astrophysics Data System (ADS)

    Rumaiz, A. K.; Kuczewski, A. J.; Mead, J.; Vernon, E.; Pinelli, D.; Dooryhee, E.; Ghose, S.; Caswell, T.; Siddons, D. P.; Miceli, A.; Baldwin, J.; Almer, J.; Okasinski, J.; Quaranta, O.; Woods, R.; Krings, T.; Stock, S.

    2018-04-01

    We have developed a series of monolithic multi-element germanium detectors, based on sensor arrays produced by Forschungszentrum Jülich and on application-specific integrated circuits (ASICs) developed at Brookhaven. Devices have been made with element counts ranging from 64 to 384. These detectors are being used at NSLS-II and APS for a range of diffraction experiments, both monochromatic and energy-dispersive. Compact and powerful readout systems have been developed, based on the new generation of FPGA system-on-chip devices, which provide closely coupled multi-core processors embedded in large gate arrays. We will discuss the technical details of the systems, and present some of the results from them.

  17. Image Matrix Processor for Volumetric Computations Final Report CRADA No. TSB-1148-95

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roberson, G. Patrick; Browne, Jolyon

    The development of an Image Matrix Processor (IMP) was proposed to provide an economical means of performing rapid ray-tracing on volumetric "Giga Voxel" data sets. This was a multi-phase project. The objective of the first phase of the IMP project was to evaluate the practicality of implementing a workstation-based Image Matrix Processor for use in volumetric reconstruction and rendering using hardware simulation techniques. Additionally, ARACOR and LLNL worked together to identify and pursue further funding sources to complete a second phase of this project.

  18. Multi-Source Multi-Target Dictionary Learning for Prediction of Cognitive Decline

    PubMed Central

    Zhang, Jie; Li, Qingyang; Caselli, Richard J.; Thompson, Paul M.; Ye, Jieping; Wang, Yalin

    2017-01-01

    Alzheimer’s Disease (AD) is the most common type of dementia. Identifying correct biomarkers may determine pre-symptomatic AD subjects and enable early intervention. Recently, multi-task sparse feature learning has been successfully applied to many computer vision and biomedical informatics research problems. It aims to improve the generalization performance by exploiting the shared features among different tasks. However, most of the existing algorithms are formulated as a supervised learning scheme, whose drawback is either insufficient feature numbers or missing label information. To address these challenges, we formulate an unsupervised framework for multi-task sparse feature learning based on a novel dictionary learning algorithm. To solve the unsupervised learning problem, we propose a two-stage Multi-Source Multi-Target Dictionary Learning (MMDL) algorithm. In stage 1, we propose a multi-source dictionary learning method to utilize the common and individual sparse features in different time slots. In stage 2, supported by a rigorous theoretical analysis, we develop a multi-task learning method to solve the missing label problem. Empirical studies on an N = 3970 longitudinal brain image data set, which involves 2 sources and 5 targets, demonstrate the improved prediction accuracy and speed efficiency of MMDL in comparison with other state-of-the-art algorithms. PMID:28943731

  19. SHARP: A multi-mission artificial intelligence system for spacecraft telemetry monitoring and diagnosis

    NASA Technical Reports Server (NTRS)

    Lawson, Denise L.; James, Mark L.

    1989-01-01

    The Spacecraft Health Automated Reasoning Prototype (SHARP) is a system designed to demonstrate automated health and status analysis for multi-mission spacecraft and ground data systems operations. Telecommunications link analysis of the Voyager 2 spacecraft is the initial focus for the SHARP system demonstration which will occur during Voyager's encounter with the planet Neptune in August, 1989, in parallel with real time Voyager operations. The SHARP system combines conventional computer science methodologies with artificial intelligence techniques to produce an effective method for detecting and analyzing potential spacecraft and ground systems problems. The system performs real time analysis of spacecraft and other related telemetry, and is also capable of examining data in historical context. A brief introduction is given to the spacecraft and ground systems monitoring process at the Jet Propulsion Laboratory. The current method of operation for monitoring the Voyager Telecommunications subsystem is described, and the difficulties associated with the existing technology are highlighted. The approach taken in the SHARP system to overcome the current limitations is also described, as well as both the conventional and artificial intelligence solutions developed in SHARP.

  20. SHARP: A multi-mission AI system for spacecraft telemetry monitoring and diagnosis

    NASA Technical Reports Server (NTRS)

    Lawson, Denise L.; James, Mark L.

    1989-01-01

    The Spacecraft Health Automated Reasoning Prototype (SHARP) is a system designed to demonstrate automated health and status analysis for multi-mission spacecraft and ground data systems operations. Telecommunications link analysis of the Voyager II spacecraft is the initial focus for the SHARP system demonstration which will occur during Voyager's encounter with the planet Neptune in August, 1989, in parallel with real-time Voyager operations. The SHARP system combines conventional computer science methodologies with artificial intelligence techniques to produce an effective method for detecting and analyzing potential spacecraft and ground systems problems. The system performs real-time analysis of spacecraft and other related telemetry, and is also capable of examining data in historical context. A brief introduction is given to the spacecraft and ground systems monitoring process at the Jet Propulsion Laboratory. The current method of operation for monitoring the Voyager Telecommunications subsystem is described, and the difficulties associated with the existing technology are highlighted. The approach taken in the SHARP system to overcome the current limitations is also described, as well as both the conventional and artificial intelligence solutions developed in SHARP.

  1. Concurrent Probabilistic Simulation of High Temperature Composite Structural Response

    NASA Technical Reports Server (NTRS)

    Abdi, Frank

    1996-01-01

    A computational structural/material analysis and design tool which would meet industry's future demand for expedience and reduced cost is presented. This unique software, GENOA, is dedicated to parallel and high-speed analysis to perform probabilistic evaluation of the high-temperature composite response of aerospace systems. The development is based on detailed integration and modification of diverse fields of specialized analysis techniques and mathematical models to combine their latest innovative capabilities into a commercially viable software package. The technique is specifically designed to exploit the availability of processors to perform computationally intense probabilistic analysis assessing uncertainties in structural reliability analysis and composite micromechanics. The primary objectives which were achieved in performing the development were: (1) utilization of the power of parallel processing and static/dynamic load-balancing optimization to make the complex simulation of structure, material and processing of high-temperature composites affordable; (2) computational integration and synchronization of probabilistic mathematics, structural/material mechanics and parallel computing; (3) implementation of an innovative multi-level domain decomposition technique to identify the inherent parallelism, increasing convergence rates through high- and low-level processor assignment; (4) creation of the framework for a portable parallel architecture for machine-independent Multiple Instruction Multiple Data (MIMD), Single Instruction Multiple Data (SIMD), hybrid and distributed-workstation types of computers; and (5) market evaluation. The results of the Phase 2 effort provide a good basis for continuation and warrant a Phase 3 government and industry partnership.

  2. Performances of multiprocessor multidisk architectures for continuous media storage

    NASA Astrophysics Data System (ADS)

    Gennart, Benoit A.; Messerli, Vincent; Hersch, Roger D.

    1996-03-01

    Multimedia interfaces increase the need for large image databases, capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture which relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes, through bottleneck performance evaluation and simulation, the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: the compute-bound frame resizing by resampling and the data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400 Mbytes/s) and that an architecture with addressable local memories located close to their respective processors could partially remove this bottleneck. The point-to-point architecture is scalable and able to sustain high throughputs for simultaneous compute-bound and data-bound operations.

  3. A pluggable framework for parallel pairwise sequence search.

    PubMed

    Archuleta, Jeremy; Feng, Wu-chun; Tilevich, Eli

    2007-01-01

    The current and near future of the computing industry is one of multi-core and multi-processor technology. Most existing sequence-search tools have been designed with a focus on single-core, single-processor systems. This discrepancy between software design and hardware architecture substantially hinders sequence-search performance by not allowing full utilization of the hardware. This paper presents a novel framework that will aid the conversion of serial sequence-search tools into a parallel version that can take full advantage of the available hardware. The framework, which is based on a software architecture called mixin layers with refined roles, enables modules to be plugged into the framework with minimal effort. The inherent modular design improves maintenance and extensibility, thus opening up a plethora of opportunities for advanced algorithmic features to be developed and incorporated while routine maintenance of the codebase persists.
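    The "mixin layers" composition can be hinted at with ordinary Python cooperative inheritance; the toy classes below are invented stand-ins, not the paper's framework or its refined roles.

    ```python
    # Toy sketch of pluggable layers composed around a core aligner.
    class CoreSearch:
        def align(self, query, db):
            # naive per-sequence score: count of matching positions
            return [(s, sum(a == b for a, b in zip(query, s))) for s in db]

    class SplitLayer:
        def align(self, query, db):
            # a parallel layer would hand each half to a worker; serial here
            half = len(db) // 2
            return super().align(query, db[:half]) + super().align(query, db[half:])

    class RankLayer:
        def align(self, query, db):
            # plug-in post-processing: order hits by score
            return sorted(super().align(query, db), key=lambda r: -r[1])

    class Search(RankLayer, SplitLayer, CoreSearch):  # layers stacked via MRO
        pass

    print(Search().align("ACGT", ["ACGA", "TTTT", "ACGT"]))
    ```

    Swapping, adding, or removing a layer only changes the base-class list, which is the modularity the framework aims for.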

  4. Performance prediction: A case study using a multi-ring KSR-1 machine

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He; Zhu, Jianping

    1995-01-01

    While computers with tens of thousands of processors have successfully delivered high performance power for solving some of the so-called 'grand-challenge' applications, the notion of scalability is becoming an important metric in the evaluation of parallel machine architectures and algorithms. In this study, the prediction of scalability and its application are carefully investigated. A simple formula is presented to show the relation between scalability, single-processor computing power, and degradation of parallelism. A case study is conducted on a multi-ring KSR-1 shared virtual memory machine. Experimental and theoretical results show that the influence of topology variation of an architecture is predictable. Therefore, the performance of an algorithm on a sophisticated, hierarchical architecture can be predicted and the best algorithm-machine combination can be selected for a given application.

  5. Crack Damage Detection Method via Multiple Visual Features and Efficient Multi-Task Learning Model.

    PubMed

    Wang, Baoxian; Zhao, Weigang; Gao, Po; Zhang, Yufeng; Wang, Zhe

    2018-06-02

    This paper proposes an effective and efficient model for concrete crack detection. The presented work consists of two modules: multi-view image feature extraction and multi-task crack region detection. Specifically, multiple visual features (such as texture, edge, etc.) of image regions are calculated, which can suppress various background noises (such as illumination, pockmark, stripe, blurring, etc.). With the computed multiple visual features, a novel crack region detector is advocated using a multi-task learning framework, which involves restraining the variability for different crack region features and emphasizing the separability between crack region features and complex background ones. Furthermore, the extreme learning machine is utilized to construct this multi-task learning model, thereby leading to high computing efficiency and good generalization. Experimental results of the practical concrete images demonstrate that the developed algorithm can achieve favorable crack detection performance compared with traditional crack detectors.

  6. Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Villa, Oreste; Tumeo, Antonino; Secchi, Simone

    Irregular applications, such as data mining and analysis or graph-based computations, show unpredictable memory/network access patterns and control structures. Highly multithreaded architectures with large processor counts, like the Cray MTA-1, MTA-2 and XMT, appear to address their requirements better than commodity clusters. However, the research on highly multithreaded systems is currently limited by the lack of adequate architectural simulation infrastructures due to issues such as size of the machines, memory footprint, simulation speed, accuracy and customization. At the same time, Shared-memory MultiProcessors (SMPs) with multi-core processors have become an attractive platform to simulate large scale machines. In this paper, we introduce a cycle-level simulator of the highly multithreaded Cray XMT supercomputer. The simulator runs unmodified XMT applications. We discuss how we tackled the challenges posed by its development, detailing the techniques introduced to make the simulation as fast as possible while maintaining a high accuracy. By mapping XMT processors (ThreadStorm with 128 hardware threads) to host computing cores, the simulation speed remains constant as the number of simulated processors increases, up to the number of available host cores. The simulator supports zero-overhead switching among different accuracy levels at run-time and includes a network model that takes into account contention. On a modern 48-core SMP host, our infrastructure simulates a large set of irregular applications 500 to 2000 times slower than real time when compared to a 128-processor XMT, while remaining within 10% of accuracy. Emulation is only 25 to 200 times slower than real time.

  7. Spaceborne Processor Array

    NASA Technical Reports Server (NTRS)

    Chow, Edward T.; Schatzel, Donald V.; Whitaker, William D.; Sterling, Thomas

    2008-01-01

    A Spaceborne Processor Array in Multifunctional Structure (SPAMS) can lower the total mass of the electronic and structural overhead of spacecraft, resulting in reduced launch costs, while increasing the science return through dynamic onboard computing. SPAMS integrates the multifunctional structure (MFS) and the Gilgamesh Memory, Intelligence, and Network Device (MIND) multi-core in-memory computer architecture into a single-system super-architecture. This transforms every inch of a spacecraft into a sharable, interconnected, smart computing element to increase computing performance while simultaneously reducing mass. The MIND in-memory architecture provides a foundation for high-performance, low-power, and fault-tolerant computing. The MIND chip has an internal structure that includes memory, processing, and communication functionality. The Gilgamesh is a scalable system comprising multiple MIND chips interconnected to operate as a single, tightly coupled, parallel computer. The array of MIND components shares a global, virtual name space for program variables and tasks that are allocated at run time to the distributed physical memory and processing resources. Individual processor- memory nodes can be activated or powered down at run time to provide active power management and to configure around faults. A SPAMS system is comprised of a distributed Gilgamesh array built into MFS, interfaces into instrument and communication subsystems, a mass storage interface, and a radiation-hardened flight computer.

  8. Multi-level, multi-scale resource selection functions and resistance surfaces for conservation planning: Pumas as a case study

    PubMed Central

    Vickers, T. Winston; Ernest, Holly B.; Boyce, Walter M.

    2017-01-01

    The importance of examining multiple hierarchical levels when modeling resource use for wildlife has been acknowledged for decades. Multi-level resource selection functions have recently been promoted as a method to synthesize resource use across nested organizational levels into a single predictive surface. Analyzing multiple scales of selection within each hierarchical level further strengthens multi-level resource selection functions. We extend this multi-level, multi-scale framework to modeling resistance for wildlife by combining multi-scale resistance surfaces from two data types, genetic and movement. Resistance estimation has typically been conducted with one of these data types, or compared between the two. However, we contend it is not an either/or issue and that resistance may be better-modeled using a combination of resistance surfaces that represent processes at different hierarchical levels. Resistance surfaces estimated from genetic data characterize temporally broad-scale dispersal and successful breeding over generations, whereas resistance surfaces estimated from movement data represent fine-scale travel and contextualized movement decisions. We used telemetry and genetic data from a long-term study on pumas (Puma concolor) in a highly developed landscape in southern California to develop a multi-level, multi-scale resource selection function and a multi-level, multi-scale resistance surface. We used these multi-level, multi-scale surfaces to identify resource use patches and resistant kernel corridors. Across levels, we found puma avoided urban, agricultural areas, and roads and preferred riparian areas and more rugged terrain. For other landscape features, selection differed among levels, as did the scales of selection for each feature. With these results, we developed a conservation plan for one of the most isolated puma populations in the U.S. Our approach captured a wide spectrum of ecological relationships for a population, resulted in effective conservation planning, and can be readily applied to other wildlife species. PMID:28609466

  9. Multi-level, multi-scale resource selection functions and resistance surfaces for conservation planning: Pumas as a case study.

    PubMed

    Zeller, Katherine A; Vickers, T Winston; Ernest, Holly B; Boyce, Walter M

    2017-01-01

    The importance of examining multiple hierarchical levels when modeling resource use for wildlife has been acknowledged for decades. Multi-level resource selection functions have recently been promoted as a method to synthesize resource use across nested organizational levels into a single predictive surface. Analyzing multiple scales of selection within each hierarchical level further strengthens multi-level resource selection functions. We extend this multi-level, multi-scale framework to modeling resistance for wildlife by combining multi-scale resistance surfaces from two data types, genetic and movement. Resistance estimation has typically been conducted with one of these data types, or compared between the two. However, we contend it is not an either/or issue and that resistance may be better-modeled using a combination of resistance surfaces that represent processes at different hierarchical levels. Resistance surfaces estimated from genetic data characterize temporally broad-scale dispersal and successful breeding over generations, whereas resistance surfaces estimated from movement data represent fine-scale travel and contextualized movement decisions. We used telemetry and genetic data from a long-term study on pumas (Puma concolor) in a highly developed landscape in southern California to develop a multi-level, multi-scale resource selection function and a multi-level, multi-scale resistance surface. We used these multi-level, multi-scale surfaces to identify resource use patches and resistant kernel corridors. Across levels, we found puma avoided urban, agricultural areas, and roads and preferred riparian areas and more rugged terrain. For other landscape features, selection differed among levels, as did the scales of selection for each feature. With these results, we developed a conservation plan for one of the most isolated puma populations in the U.S. Our approach captured a wide spectrum of ecological relationships for a population, resulted in effective conservation planning, and can be readily applied to other wildlife species.

  10. Risk assessments using the Strain Index and the TLV for HAL, Part II: Multi-task jobs and prevalence of CTS.

    PubMed

    Kapellusch, Jay M; Silverstein, Barbara A; Bao, Stephen S; Thiese, Mathew S; Merryweather, Andrew S; Hegmann, Kurt T; Garg, Arun

    2018-02-01

    The Strain Index (SI) and the American Conference of Governmental Industrial Hygienists (ACGIH) threshold limit value for hand activity level (TLV for HAL) have been shown to be associated with prevalence of distal upper-limb musculoskeletal disorders such as carpal tunnel syndrome (CTS). The SI and TLV for HAL disagree on more than half of task exposure classifications. Similarly, time-weighted average (TWA), peak, and typical exposure techniques used to quantify physical exposure from multi-task jobs have shown between-technique agreement ranging from 61% to 93%, depending upon whether the SI or TLV for HAL model was used. This study compared exposure-response relationships between each model-technique combination and prevalence of CTS. Physical exposure data from 1,834 workers (710 with multi-task jobs) were analyzed using the SI and TLV for HAL and the TWA, typical, and peak multi-task job exposure techniques. Additionally, exposure classifications from the SI and TLV for HAL were combined into a single measure and evaluated. Prevalent CTS cases were identified using symptoms and nerve-conduction studies. Mixed effects logistic regression was used to quantify exposure-response relationships between categorized (i.e., low, medium, and high) physical exposure and CTS prevalence for all model-technique combinations, and for multi-task workers, mono-task workers, and all workers combined. Except for TWA TLV for HAL, all model-technique combinations showed monotonic increases in risk of CTS with increased physical exposure. The combined-models approach showed stronger association than the SI or TLV for HAL for multi-task workers. Despite differences in exposure classifications, nearly all model-technique combinations showed exposure-response relationships with prevalence of CTS for the combined sample of mono-task and multi-task workers. Both the TLV for HAL and the SI, with the TWA or typical techniques, appear useful for epidemiological studies and surveillance. However, the utility of TWA, typical, and peak techniques for job design and intervention is dubious.
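    For readers unfamiliar with the three techniques being compared, the toy functions below show how a multi-task job's sub-task ratings collapse to a single exposure value; the ratings and durations are invented, not study data.

    ```python
    # (exposure rating, minutes per shift) for each sub-task of a made-up job
    job = [(3.0, 240), (6.5, 90), (9.0, 30)]

    def twa(tasks):      # time-weighted average exposure
        total = sum(m for _, m in tasks)
        return sum(r * m for r, m in tasks) / total

    def typical(tasks):  # rating of the sub-task occupying the most time
        return max(tasks, key=lambda t: t[1])[0]

    def peak(tasks):     # rating of the most stressful sub-task
        return max(r for r, _ in tasks)

    print(twa(job), typical(job), peak(job))  # 4.375, 3.0, 9.0
    ```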

  11. The Multi-Feature Hypothesis: Connectionist Guidelines for L2 Task Design

    ERIC Educational Resources Information Center

    Moonen, Machteld; de Graaff, Rick; Westhoff, Gerard; Brekelmans, Mieke

    2014-01-01

    This study focuses on the effects of task type on the retention and ease of activation of second language (L2) vocabulary, based on the multi-feature hypothesis (Moonen, De Graaff, & Westhoff, 2006). Two tasks were compared: a writing task and a list-learning task. It was hypothesized that performing the writing task would yield higher…

  12. Conditional load and store in a shared memory

    DOEpatents

    Blumrich, Matthias A; Ohmacht, Martin

    2015-02-03

    A method, system and computer program product for implementing load-reserve and store-conditional instructions in a multi-processor computing system. The computing system includes a multitude of processor units and a shared memory cache, and each of the processor units has access to the memory cache. In one embodiment, the method comprises providing the memory cache with a series of reservation registers, and storing in these registers addresses reserved in the memory cache for the processor units as a result of issuing load-reserve requests. In this embodiment, when one of the processor units makes a request to store data in the memory cache using a store-conditional request, the reservation registers are checked to determine if an address in the memory cache is reserved for that processor unit. If an address in the memory cache is reserved for that processor, the data are stored at this address.
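    A behavioral sketch of the mechanism (a software stand-in, not the patented hardware): one reservation register per processor, with any successful store clearing every matching reservation.

    ```python
    class SharedCache:
        """Toy model of load-reserve / store-conditional semantics."""
        def __init__(self, n_procs):
            self.mem = {}
            self.reservation = [None] * n_procs  # one register per processor

        def load_reserve(self, proc, addr):
            self.reservation[proc] = addr        # record the reservation
            return self.mem.get(addr, 0)

        def store_conditional(self, proc, addr, value):
            if self.reservation[proc] != addr:   # reservation lost or absent
                return False
            self.mem[addr] = value
            for p, r in enumerate(self.reservation):
                if r == addr:                    # a store invalidates all
                    self.reservation[p] = None   # reservations on this address
            return True

    cache = SharedCache(n_procs=2)
    old = cache.load_reserve(0, 0x100)
    assert cache.store_conditional(0, 0x100, old + 1)  # still reserved: succeeds
    assert not cache.store_conditional(1, 0x100, 7)    # proc 1 never reserved
    ```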

  13. Research status of multi-robot systems task allocation and uncertainty treatment

    NASA Astrophysics Data System (ADS)

    Li, Dahui; Fan, Qi; Dai, Xuefeng

    2017-08-01

    The multi-robot coordination algorithm has become a hot research topic in the field of robotics in recent years; it has a wide range of applications and good application prospects. This paper analyzes and summarizes the current domestic and international research status of multi-robot coordination algorithms. Focusing on task allocation and the treatment of uncertainty, the paper discusses multi-robot coordination algorithms and presents the advantages and disadvantages of the methods in common use.

  14. Multi-channel time-reversal receivers for multi and 1-bit implementations

    DOEpatents

    Candy, James V.; Chambers, David H.; Guidry, Brian L.; Poggio, Andrew J.; Robbins, Christopher L.

    2008-12-09

    A communication system for transmitting a signal through a channel medium comprising digitizing the signal, time-reversing the digitized signal, and transmitting the signal through the channel medium. In one embodiment a transmitter is adapted to transmit the signal, a multiplicity of receivers are adapted to receive the signal, a digitizer digitizes the signal, and a time-reversal signal processor is adapted to time-reverse the digitized signal. An embodiment of the present invention includes multi bit implementations. Another embodiment of the present invention includes 1-bit implementations. Another embodiment of the present invention includes a multiplicity of receivers used in the step of transmitting the signal through the channel medium.
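    The core trick can be shown in a few lines (an illustrative sketch with invented channel taps; no claim is made about the patented circuitry): transmitting the time-reversed channel response makes the channel itself act as a matched filter.

    ```python
    import numpy as np

    h = np.array([0.2, 1.0, 0.5, -0.3])   # hypothetical channel impulse response
    probe = np.convolve([1.0], h)         # receiver sounds the channel
    tx = probe[::-1]                      # digitize and time-reverse
    # a 1-bit implementation would transmit np.sign(tx) instead
    rx = np.convolve(tx, h)               # retransmit through the same channel
    print(np.argmax(np.abs(rx)))          # energy refocuses at the center tap
    ```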

  15. IMPETUS - Interactive MultiPhysics Environment for Unified Simulations.

    PubMed

    Ha, Vi Q; Lykotrafitis, George

    2016-12-08

    We introduce IMPETUS - Interactive MultiPhysics Environment for Unified Simulations, an object-oriented, easy-to-use, high-performance C++ program for three-dimensional simulations of complex physical systems that can benefit a large variety of research areas, especially in cell mechanics. The program implements cross-communication between locally interacting particles and continuum models residing in the same physical space while a network facilitates long-range particle interactions. Message Passing Interface is used for inter-processor communication for all simulations. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Acoustical Techniques/Designs Investigated During the Southeast Asia Conflict 1966-1972

    DTIC Science & Technology

    1981-01-21

    the in-buoy processor above, the three sensor channels, two directional and one omni, are multiplexed together and via the RF link transmitted back to... width at the 10 dB down points and maximum side-lobe level of -15 dB) requires at least six elements in the line. At a geometric mean frequency of... presentation provides the data to the operator in the most accessible format. The vehicle descriptors are easily extracted with the aid of multi-point

  17. The Complicate Observations and Multi-Parameter Land Information Constructions on Allied Telemetry Experiment (COMPLICATE)

    PubMed Central

    Tian, Xin; Li, Zengyuan; Chen, Erxue; Liu, Qinhuo; Yan, Guangjian; Wang, Jindi; Niu, Zheng; Zhao, Shaojie; Li, Xin; Pang, Yong; Su, Zhongbo; van der Tol, Christiaan; Liu, Qingwang; Wu, Chaoyang; Xiao, Qing; Yang, Le; Mu, Xihan; Bo, Yanchen; Qu, Yonghua; Zhou, Hongmin; Gao, Shuai; Chai, Linna; Huang, Huaguo; Fan, Wenjie; Li, Shihua; Bai, Junhua; Jiang, Lingmei; Zhou, Ji

    2015-01-01

    The Complicate Observations and Multi-Parameter Land Information Constructions on Allied Telemetry Experiment (COMPLICATE) comprises a network of remote sensing experiments designed to enhance the dynamic analysis and modeling of remotely sensed information for complex land surfaces. Two types of experimental campaigns were established under the framework of COMPLICATE. The first was designed for continuous and elaborate experiments. The experimental strategy helps enhance our understanding of the radiative and scattering mechanisms of soil and vegetation and modeling of remotely sensed information for complex land surfaces. To validate the methodologies and models for dynamic analyses of remote sensing for complex land surfaces, the second campaign consisted of simultaneous satellite-borne, airborne, and ground-based experiments. During field campaigns, several continuous and intensive observations were obtained. Measurements were undertaken to answer key scientific issues, as follows: 1) Determine the characteristics of spatial heterogeneity and the radiative and scattering mechanisms of remote sensing on complex land surfaces. 2) Determine the mechanisms of spatial and temporal scale extensions for remote sensing on complex land surfaces. 3) Determine synergistic inversion mechanisms for soil and vegetation parameters using multi-mode remote sensing on complex land surfaces. Here, we introduce the background, the objectives, the experimental designs, the observations and measurements, and the overall advances of COMPLICATE. As a result of the implementation of COMPLICATE over the next several years, we expect to contribute to quantitative remote sensing science and Earth observation techniques. PMID:26332035

  18. Multi-mission space science data processing systems - Past, present, and future

    NASA Technical Reports Server (NTRS)

    Stallings, William H.

    1990-01-01

    Packetized telemetry that is consistent with the international Consultative Committee for Space Data Systems (CCSDS) has been baselined for future NASA missions such as Space Station Freedom. Some experiences from past and present multimission systems are examined, including current experiences in implementing a CCSDS standard packetized data processing system, relative to the effectiveness of the multimission approach in lowering life cycle cost and the complexity of meeting new mission needs. It is shown that the continued effort toward standardization of telemetry and processing support will permit the development of multimission systems needed to meet the increased requirements of future NASA missions.

  19. Multi-purpose CMOS sensor interface for low-power applications

    NASA Astrophysics Data System (ADS)

    Wouters, P.; de Cooman, M.; Puers, R.

    1994-08-01

    A dedicated low-power CMOS transponder microchip is presented as part of a novel telemetry implant for biomedical applications. This mixed analog-digital circuit contains an identification code and collects information on physiological parameters, i.e., body temperature and physical activity, and on the status of the battery. To minimize the amount of data to be transmitted, a dedicated signal processing algorithm is embedded within its circuitry. All telemetry functions (encoding, modulation, generation of the carrier) are implemented on the integrated circuit. Emphasis is on a high degree of flexibility towards sensor inputs and internal data management, extreme miniaturization, and low-power consumption to allow a long implantation lifetime.

  20. Optimization of the coherence function estimation for multi-core central processing unit

    NASA Astrophysics Data System (ADS)

    Cheremnov, A. G.; Faerman, V. A.; Avramchuk, V. S.

    2017-02-01

    The paper considers the use of parallel processing on a multi-core central processing unit to optimize the evaluation of the coherence function arising in digital signal processing. The coherence function, along with other methods of spectral analysis, is commonly used for vibration diagnosis of rotating machinery and its particular nodes. An algorithm is given for evaluating the function for signals represented by digital samples. The algorithm is analyzed with respect to its software implementation and computational problems. Optimization measures are described, including algorithmic, architectural and compiler optimization, and their results are assessed for multi-core processors from different manufacturers. Speed-up of parallel execution with respect to sequential execution was studied, and results are presented for Intel Core i7-4720HQ and AMD FX-9590 processors. The results show comparatively high efficiency of the optimization measures taken. In particular, acceleration indicators and average CPU utilization were significantly improved, showing a high degree of parallelism in the constructed calculating functions. The developed software underwent state registration and will be used as part of a software and hardware solution for rotating machinery fault diagnosis and pipeline leak location with the acoustic correlation method.
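    A compact reference point for the quantity being optimized, using SciPy's Welch-based estimator and a process pool to spread channel pairs over cores; the signal parameters are illustrative and this is not the registered software.

    ```python
    import numpy as np
    from multiprocessing import Pool
    from scipy.signal import coherence

    fs = 10_000.0
    t = np.arange(0, 1.0, 1.0 / fs)
    x = np.sin(2 * np.pi * 500 * t) + np.random.randn(t.size)
    y = np.sin(2 * np.pi * 500 * t + 0.7) + np.random.randn(t.size)

    def pair_coherence(pair):
        a, b = pair
        return coherence(a, b, fs=fs, nperseg=1024)  # |Pxy|^2 / (Pxx * Pyy)

    if __name__ == "__main__":
        with Pool() as pool:                  # one worker per channel pair
            (f, cxy), = pool.map(pair_coherence, [(x, y)])
        print(f[np.argmax(cxy)])              # peaks near the shared 500 Hz tone
    ```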

  1. Distributed digital signal processors for multi-body flexible structures

    NASA Technical Reports Server (NTRS)

    Lee, Gordon K. F.

    1992-01-01

    Multi-body flexible structures, such as those currently under investigation in spacecraft design, are large scale (high-order) dimensional systems. Controlling and filtering such structures is a computationally complex problem. This is particularly important when many sensors and actuators are located along the structure and need to be processed in real time. This report summarizes research activity focused on solving the signal processing (that is, information processing) issues of multi-body structures. A distributed architecture is developed in which single loop processors are employed for local filtering and control. By implementing such a philosophy with an embedded controller configuration, a supervising controller may be used to process global data and make global decisions as the local devices are processing local information. A hardware testbed, a position controller system for a servo motor, is employed to illustrate the capabilities of the embedded controller structure. Several filtering and control structures which can be modeled as rational functions can be implemented on the system developed in this research effort. Thus the results of the study provide a support tool for many Control/Structure Interaction (CSI) NASA testbeds such as the Evolutionary model and the nine-bay truss structure.

  2. Algorithm Design of CPCI Backboard's Interrupts Management Based on VxWorks' Multi-Tasks

    NASA Astrophysics Data System (ADS)

    Cheng, Jingyuan; An, Qi; Yang, Junfeng

    2006-09-01

    This paper begins with a brief introduction of the embedded real-time operating system VxWorks and the CompactPCI standard. It then presents the VxWorks programming interfaces for Peripheral Controller Interface (PCI) configuration, interrupt handling, and multi-task programming, with emphasis placed on the software framework of CPCI interrupt management based on multiple tasks. This method is sound in design and easy to adapt, and it ensures that all possible interrupts are handled in time, making it suitable for high energy physics data acquisition systems with multiple channels, high data rates, and hard real-time requirements.
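    The paper's central pattern, deferring work from interrupt context to pended tasks, can be suggested with a thread-and-queue analogue (a Python stand-in for VxWorks semGive/msgQSend primitives; the names and channel numbers are invented):

    ```python
    import queue, threading

    irq_events = queue.Queue()          # stands in for a VxWorks message queue

    def isr(channel):
        """Keep interrupt context minimal: record the event and return."""
        irq_events.put(channel)         # analogue of msgQSend()/semGive()

    def handler_task():
        while True:
            channel = irq_events.get() # pend until an interrupt arrives
            if channel is None:
                break                   # shutdown sentinel
            print(f"servicing CPCI interrupt on channel {channel}")

    worker = threading.Thread(target=handler_task)
    worker.start()
    for ch in (0, 3, 1):                # simulate three interrupts
        isr(ch)
    irq_events.put(None)
    worker.join()
    ```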

  3. Space-Based Telemetry and Range Safety (STARS) Study

    NASA Technical Reports Server (NTRS)

    Hogie, Keith; Criscuolo, Ed; Parise, Ron

    2004-01-01

    This presentation will describe the design, development, and testing of a system to collect telemetry, format it into UDP/IP packets, and deliver it to a ground test range using standard IP technologies over a TDRSS link. This presentation will discuss the goal of the STARS IP Formatter along with the overall design. It will also present performance results of the current version of the IP formatter. Finally, it will discuss key issues for supporting constant rate telemetry data delivery when using standard components such as PCI/104 processors, the Linux operating system, Internet Protocols, and synchronous serial interfaces.
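    The formatter's core step, wrapping telemetry frames in UDP datagrams, is easy to suggest in outline; the address, port, and frame layout below are placeholders, not STARS values.

    ```python
    import socket, struct, time

    GROUND_STATION = ("192.0.2.10", 5005)  # documentation-range IP, hypothetical
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send_frame(seq, payload: bytes):
        # invented header: sequence number, timestamp, payload length
        header = struct.pack("!IdH", seq, time.time(), len(payload))
        sock.sendto(header + payload, GROUND_STATION)  # one frame per datagram

    for seq in range(3):
        send_frame(seq, b"\x00" * 256)  # constant-rate delivery would pace these
    ```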

  4. High-performance computing for airborne applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Quinn, Heather M; Manuzzato, Andrea; Fairbanks, Tom

    2010-06-28

    Recently, there have been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even though the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we will present radiation data for high-performance computing devices in an accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.

  5. A high-rate PCI-based telemetry processor system

    NASA Astrophysics Data System (ADS)

    Turri, R.

    2002-07-01

    The high performance reached by satellite on-board telemetry generation and transmission will consequently require the design of ground facilities with higher processing capabilities at low cost, to allow wide deployment of such ground stations. The equipment normally used is based on complex, proprietary bus and computing architectures that prevent the systems from exploiting the continuous and rapid increase in computing power available on the market. PCI bus systems now allow processing of high-rate data streams in a standard PC system. At the same time, the Windows NT operating system supports multitasking and symmetric multiprocessing, giving the capability to process high-data-rate signals. In addition, high-speed networking, 64-bit PCI-bus technologies and the increase in processor power and software allow creating a system based on COTS products (which in future may be easily and inexpensively upgraded). In the frame of the EUCLID RTP 9.8 project, a specific work element was dedicated to developing the architecture of a system able to acquire telemetry data at up to 600 Mbps. Laben S.p.A., a Finmeccanica company, entrusted with this work, has designed a PCI-based telemetry system making possible the communication between a satellite down-link and a wide area network at the required rate.

  6. Integration, Development and Performance of the 500 TFLOPS Heterogeneous Cluster (Condor)

    DTIC Science & Technology

    2012-08-01

    PlayStation 3 for High Performance Cluster Computing,” LAPACK Working Note 185, 2007. [4] Feng, W., X. Feng, and R. Ge, “Green Supercomputing Comes of Age”... and streaming processing; the PlayStation 3 uses the IBM Cell BE processor, which adopts the multi-processor, single-instruction-multiple-data (SIMD

  7. Analog hardware for learning neural networks

    NASA Technical Reports Server (NTRS)

    Eberhardt, Silvio P. (Inventor)

    1991-01-01

    This is a recurrent or feedforward analog neural network processor having a multi-level neuron array and a synaptic matrix for storing weighted analog values of synaptic connection strengths which is characterized by temporarily changing one connection strength at a time to determine its effect on system output relative to the desired target. That connection strength is then adjusted based on the effect, whereby the processor is taught the correct response to training examples connection by connection.
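    The patent's learn-by-perturbation rule amounts to per-connection finite-difference descent; the toy below (with an invented network and data, not the patented analog circuit) nudges one weight at a time and adjusts it against the measured effect on output error.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    X, y = rng.standard_normal((20, 3)), rng.standard_normal(20)
    W = np.zeros(3)                       # "synaptic connection strengths"

    def output_error(W):
        return float(np.mean((X @ W - y) ** 2))

    delta, lr = 0.01, 5.0
    for step in range(600):
        i = step % W.size                 # one connection at a time
        base = output_error(W)
        W[i] += delta                     # temporary perturbation
        effect = output_error(W) - base   # its effect on system output error
        W[i] -= delta                     # restore ...
        W[i] -= lr * effect               # ... then adjust based on the effect
    print(output_error(W))                # error falls toward the optimum
    ```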

  8. A parallel algorithm for generation and assembly of finite element stiffness and mass matrices

    NASA Technical Reports Server (NTRS)

    Storaasli, O. O.; Carmona, E. A.; Nguyen, D. T.; Baddourah, M. A.

    1991-01-01

    A new algorithm is proposed for parallel generation and assembly of the finite element stiffness and mass matrices. The proposed assembly algorithm is based on a node-by-node approach rather than the more conventional element-by-element approach. The new algorithm's generality and computation speed-up when using multiple processors are demonstrated for several practical applications on multi-processor Cray Y-MP and Cray 2 supercomputers.
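    The node-by-node idea is that each processor owns rows of the global matrix, so writes never collide; the 1D toy below (with an invented mesh and element stiffness, not the paper's code) shows two "processors" assembling disjoint row sets.

    ```python
    import numpy as np

    n_nodes = 5
    elements = [(i, i + 1) for i in range(n_nodes - 1)]  # 1D bar elements
    ke = np.array([[1.0, -1.0], [-1.0, 1.0]])            # element stiffness

    def assemble_rows(my_nodes):
        """Rows of K for the nodes a single processor owns."""
        K = np.zeros((n_nodes, n_nodes))
        for e in elements:
            for a, na in enumerate(e):
                if na not in my_nodes:
                    continue              # another processor owns this row
                for b, nb in enumerate(e):
                    K[na, nb] += ke[a, b]
        return K

    # disjoint row blocks sum into the global matrix without write conflicts
    K = assemble_rows({0, 1, 2}) + assemble_rows({3, 4})
    print(K)
    ```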

  9. A Modified Distributed Bees Algorithm for Multi-Sensor Task Allocation.

    PubMed

    Tkach, Itshak; Jevtić, Aleksandar; Nof, Shimon Y; Edan, Yael

    2018-03-02

    Multi-sensor systems can play an important role in monitoring tasks and detecting targets. However, real-time allocation of heterogeneous sensors to dynamic targets/tasks that are unknown a priori in their locations and priorities is a challenge. This paper presents a Modified Distributed Bees Algorithm (MDBA) that is developed to allocate stationary heterogeneous sensors to upcoming unknown tasks using a decentralized, swarm intelligence approach to minimize the task detection times. Sensors are allocated to tasks based on sensors' performance, tasks' priorities, and the distances of the sensors from the locations where the tasks are being executed. The algorithm was compared to a Distributed Bees Algorithm (DBA), a Bees System, and two common multi-sensor algorithms, market-based and greedy-based algorithms, which were fitted for the specific task. Simulation analyses revealed that MDBA achieved statistically significant improved performance by 7% with respect to DBA as the second-best algorithm, and by 19% with respect to Greedy algorithm, which was the worst, thus indicating its fitness to provide solutions for heterogeneous multi-sensor systems.
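    The flavor of the decentralized weighting described above fits in a few lines: allocation probability grows with sensor performance and task priority and shrinks with distance. The exponents and values below are illustrative, not the published MDBA parameters.

    ```python
    import numpy as np

    def allocation_probs(perf, priority, dist, alpha=1.0, beta=1.0, gamma=1.0):
        w = (perf ** alpha) * (priority ** beta) / (dist ** gamma)
        return w / w.sum()               # normalize over candidate sensors

    perf = np.array([0.9, 0.6, 0.8])     # per-sensor detection performance
    dist = np.array([10.0, 2.0, 5.0])    # distance to the newly detected task
    print(allocation_probs(perf, priority=1.0, dist=dist))
    ```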

  10. Advantages of Task-Specific Multi-Objective Optimisation in Evolutionary Robotics.

    PubMed

    Trianni, Vito; López-Ibáñez, Manuel

    2015-01-01

    The application of multi-objective optimisation to evolutionary robotics is receiving increasing attention. A survey of the literature reveals the different possibilities it offers to improve the automatic design of efficient and adaptive robotic systems, and points to the successful demonstrations available for both task-specific and task-agnostic approaches (i.e., with or without reference to the specific design problem to be tackled). However, the advantages of multi-objective approaches over single-objective ones have not been clearly spelled out and experimentally demonstrated. This paper fills this gap for task-specific approaches: starting from well-known results in multi-objective optimisation, we discuss how to tackle commonly recognised problems in evolutionary robotics. In particular, we show that multi-objective optimisation (i) allows evolving a more varied set of behaviours by exploring multiple trade-offs of the objectives to optimise, (ii) supports the evolution of the desired behaviour through the introduction of objectives as proxies, (iii) avoids the premature convergence to local optima possibly introduced by multi-component fitness functions, and (iv) solves the bootstrap problem exploiting ancillary objectives to guide evolution in the early phases. We present an experimental demonstration of these benefits in three different case studies: maze navigation in a single robot domain, flocking in a swarm robotics context, and a strictly collaborative task in collective robotics.

  11. Stochastic airspace simulation tool development

    DOT National Transportation Integrated Search

    2009-10-01

    Modeling and simulation is often used to study the physical world when observation may not be practical. The overall goal of a recent and ongoing simulation tool project has been to provide a documented, lifecycle-managed, multi-processor c...

  12. Data recording and playback on video tape--a multi-channel analog interface for a digital audio processor system.

    PubMed

    Blaettler, M; Bruegger, A; Forster, I C; Lehareinger, Y

    1988-03-01

    The design of an analog interface to a digital audio signal processor (DASP)-video cassette recorder (VCR) system is described. The complete system represents a low-cost alternative to both FM instrumentation tape recorders and multi-channel chart recorders. The interface or DASP input-output unit described in this paper enables the recording and playback of up to 12 analog channels with a maximum of 12 bit resolution and a bandwidth of 2 kHz per channel. Internal control and timing in the recording component of the interface is performed using ROMs which can be reprogrammed to suit different analog-to-digital converter hardware. Improvement in the bandwidth specifications is possible by connecting channels in parallel. A parallel 16 bit data output port is provided for direct transfer of the digitized data to a computer.

  13. A manipulative instrument with simultaneous gesture and end-effector trajectory planning and controlling

    NASA Astrophysics Data System (ADS)

    Lin, Hsien-I.; Nguyen, Xuan-Anh

    2017-05-01

    To operate a redundant manipulator to accomplish end-effector trajectory planning while simultaneously controlling its gesture in online programming, incorporating human motion is a useful and flexible option. This paper focuses on a manipulative instrument that can simultaneously control its arm gesture and end-effector trajectory via human teleoperation. The instrument consists of two parts. First, for human motion capture and data processing, marker systems are proposed to capture the human gesture. Second, the manipulator kinematics control is implemented by an augmented multi-tasking method together with forward and backward reaching inverse kinematics. In particular, the local-solution and divergence problems of the multi-tasking method are resolved by the proposed augmented multi-tasking method. Computer simulations and experiments with a 7-DOF (degree-of-freedom) redundant manipulator were used to validate the proposed method. Comparisons among the single-tasking, original multi-tasking, and augmented multi-tasking algorithms were performed, and the results showed that the proposed augmented method had good end-effector position accuracy and the gesture most similar to the human gesture. Additionally, the experimental results showed that the proposed instrument was realized online.
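    For orientation, the standard two-level task-priority scheme that augmented multi-tasking methods build on projects the secondary (gesture) task into the null space of the end-effector task; the sketch below is that classical scheme with random stand-in Jacobians, not the authors' augmented method.

    ```python
    import numpy as np

    def two_task_dq(J1, dx1, J2, dx2):
        J1p = np.linalg.pinv(J1)
        N1 = np.eye(J1.shape[1]) - J1p @ J1       # null-space projector, task 1
        dq = J1p @ dx1                            # end-effector task first
        dq += np.linalg.pinv(J2 @ N1) @ (dx2 - J2 @ dq)  # gesture inside N1
        return dq

    rng = np.random.default_rng(0)
    J1, J2 = rng.standard_normal((6, 7)), rng.standard_normal((3, 7))
    dx1 = rng.standard_normal(6)
    dq = two_task_dq(J1, dx1, J2, rng.standard_normal(3))
    print(np.allclose(J1 @ dq, dx1))              # priority-1 task is preserved
    ```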

  14. The advanced receiver 2: Telemetry test results in CTA 21

    NASA Technical Reports Server (NTRS)

    Hinedi, S.; Bevan, R.; Marina, M.

    1991-01-01

    Telemetry tests with the Advanced Receiver II (ARX II) in Compatibility Test Area 21 are described. The ARX II was operated in parallel with a Block-III Receiver/baseband processor assembly combination (BLK-III/BPA) and a Block III Receiver/subcarrier demodulation assembly/symbol synchronization assembly combination (BLK-III/SDA/SSA). The telemetry simulator assembly provided the test signal for all three configurations, and the symbol signal to noise ratio as well as the symbol error rates were measured and compared. Furthermore, bit error rates were also measured by the system performance test computer for all three systems. Results indicate that the ARX-II telemetry performance is comparable and sometimes superior to the BLK-III/BPA and BLK-III/SDA/SSA combinations.

  15. Modeling and simulation of dynamic ant colony's labor division for task allocation of UAV swarm

    NASA Astrophysics Data System (ADS)

    Wu, Husheng; Li, Hao; Xiao, Renbin; Liu, Jie

    2018-02-01

    The problem of unmanned aerial vehicle (UAV) task allocation not only has intrinsically complex attributes, being highly nonlinear, dynamic, highly adversarial and multi-modal, but also has good practicability in various multi-agent systems, which has made it increasingly attractive recently. In this paper, based on the classic fixed response threshold model (FRTM), under the idea of "problem centered + evolutionary solution" and in a bottom-up way, new dynamic environmental stimulus, response threshold and transition probability are designed, and a dynamic ant colony's labor division (DACLD) model is proposed. DACLD allows a swarm of agents with a relatively low level of intelligence to perform complex tasks, and has the characteristics of a distributed framework, multiple tasks with execution order, multi-state operation, adaptive response thresholds and multi-individual response. With the proposed model, numerical simulations are performed to illustrate the effectiveness of the distributed task allocation scheme in two situations of UAV swarm combat (dynamic task allocation with a certain number of enemy targets, and task re-allocation due to unexpected threats). Results show that our model can obtain both the heterogeneous UAVs' real-time positions and states at the same time, and has a high degree of self-organization, flexibility and real-time response to dynamic environments.
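    The classic response-threshold rule underlying the FRTM is compact enough to state directly; the sketch below gives only that baseline rule (the DACLD extensions, dynamic stimulus, state-dependent thresholds and transition probabilities, are not reproduced, and the numbers are invented).

    ```python
    def response_probability(stimulus, threshold, n=2):
        """P(take task) = s^n / (s^n + theta^n), the fixed-threshold rule."""
        s, th = stimulus ** n, threshold ** n
        return s / (s + th)

    # a low-threshold UAV responds readily; a high-threshold one holds back
    print(response_probability(stimulus=0.8, threshold=0.3))  # ~0.88
    print(response_probability(stimulus=0.8, threshold=1.5))  # ~0.22
    ```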

  16. Modular countermine payload for small robots

    NASA Astrophysics Data System (ADS)

    Herman, Herman; Few, Doug; Versteeg, Roelof; Valois, Jean-Sebastien; McMahill, Jeff; Licitra, Michael; Henciak, Edward

    2010-04-01

    Payloads for small robotic platforms have historically been designed and implemented as platform and task specific solutions. A consequence of this approach is that payloads cannot be deployed on different robotic platforms without substantial re-engineering efforts. To address this issue, we developed a modular countermine payload that is designed from the ground-up to be platform agnostic. The payload consists of the multi-mission payload controller unit (PCU) coupled with the configurable mission specific threat detection, navigation and marking payloads. The multi-mission PCU has all the common electronics to control and interface to all the payloads. It also contains the embedded processor that can be used to run the navigational and control software. The PCU has a very flexible robot interface which can be configured to interface to various robot platforms. The threat detection payload consists of a two axis sweeping arm and the detector. The navigation payload consists of several perception sensors that are used for terrain mapping, obstacle detection and navigation. Finally, the marking payload consists of a dual-color paint marking system. Through the multimission PCU, all these payloads are packaged in a platform agnostic way to allow deployment on multiple robotic platforms, including Talon and Packbot.

  17. Segment Fixed Priority Scheduling for Self Suspending Real Time Tasks

    DTIC Science & Technology

    2016-08-11

    Segment-Fixed Priority Scheduling for Self-Suspending Real-Time Tasks. Junsung Kim, Department of Electrical and Computer Engineering, Carnegie... 2.1 Application of a Multi-Segment Self-Suspending Real-Time Task Model... 3 Fixed Priority Scheduling... Figure 2: A multi-segment self-suspending real-time task model

  18. A Modified Distributed Bees Algorithm for Multi-Sensor Task Allocation †

    PubMed Central

    Nof, Shimon Y.; Edan, Yael

    2018-01-01

    Multi-sensor systems can play an important role in monitoring tasks and detecting targets. However, real-time allocation of heterogeneous sensors to dynamic targets/tasks that are unknown a priori in their locations and priorities is a challenge. This paper presents a Modified Distributed Bees Algorithm (MDBA) that is developed to allocate stationary heterogeneous sensors to upcoming unknown tasks using a decentralized, swarm intelligence approach to minimize the task detection times. Sensors are allocated to tasks based on sensors’ performance, tasks’ priorities, and the distances of the sensors from the locations where the tasks are being executed. The algorithm was compared to a Distributed Bees Algorithm (DBA), a Bees System, and two common multi-sensor algorithms, market-based and greedy-based algorithms, which were fitted for the specific task. Simulation analyses revealed that MDBA achieved statistically significant improved performance by 7% with respect to DBA as the second-best algorithm, and by 19% with respect to Greedy algorithm, which was the worst, thus indicating its fitness to provide solutions for heterogeneous multi-sensor systems. PMID:29498683

  19. System properties, feedback control and effector coordination of human temperature regulation.

    PubMed

    Werner, Jürgen

    2010-05-01

    The aim of human temperature regulation is to protect body processes by establishing a relative constancy of deep body temperature (regulated variable), in spite of external and internal influences on it. This is basically achieved by a distributed multi-sensor, multi-processor, multi-effector proportional feedback control system. The paper explains why proportional control implies inherent deviations of the regulated variable from the value in the thermoneutral zone. The concept of feedback of the thermal state of the body, conveniently represented by a high-weighted core temperature (T_c) and low-weighted peripheral temperatures (T_s), is equivalent to the control concept of "auxiliary feedback control", using a main (regulated) variable (T_c), supported by an auxiliary variable (T_s). This concept implies neither regulation of T_s nor feedforward control. Steady states result in the closed control loop when the open-loop properties of the (heat transfer) process are compatible with those of the thermoregulatory processors. They are called operating points or balance points and are achieved due to the inherent property of dynamical stability of the thermoregulatory feedback loop. No set-point and no comparison of signals (e.g. actual-set value) are necessary. Metabolic heat production and sweat production, though receiving the same information about the thermal state of the body, are independent effectors with different thresholds and gains. Coordination between one of these effectors and the vasomotor effector is achieved by the fact that changes in the (heat transfer) process evoked by vasomotor control are taken into account by the metabolic/sweat processor.
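    A toy rendering of the proportional, auxiliary-feedback scheme described above, with effector drive computed from a high-weighted core and a low-weighted skin temperature; the weights, reference value, and gain are illustrative, not physiological constants from the paper.

    ```python
    def effector_drive(T_core, T_skin, w_core=0.9, w_skin=0.1,
                       T_ref=36.8, gain=50.0):
        """Positive drive -> sweating; negative -> shivering (arbitrary units)."""
        T_body = w_core * T_core + w_skin * T_skin  # weighted thermal state
        return gain * (T_body - T_ref)              # pure proportional control

    print(effector_drive(37.4, 35.0))  # heat load: positive (sweating) drive
    print(effector_drive(36.5, 28.0))  # cold stress: negative (shivering) drive
    ```

    Note the proportional structure: a nonzero effector drive requires a nonzero deviation from T_ref, which is exactly the inherent deviation from the thermoneutral value discussed above.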

  20. Distributed micro-radar system for detection and tracking of low-profile, low-altitude targets

    NASA Astrophysics Data System (ADS)

    Gorwara, Ashok; Molchanov, Pavlo

    2016-05-01

    The proposed airborne surveillance radar system can detect, locate, track, and classify low-profile, low-altitude targets: from traditional fixed- and rotary-wing aircraft to non-traditional targets like unmanned aircraft systems (drones) and even small projectiles. The distributed micro-radar system is the next step in the development of the passive monopulse direction finder proposed by Stephen E. Lipsky in the 1980s. To extend the high-frequency limit and provide high sensitivity over a broad band of frequencies, multiple angularly spaced directional antennas are coupled with front-end circuits and separately connected to a direction-finder processor by a digital interface. Integration of antennas with front-end circuits makes it possible to exclude waveguide lines, which limit system bandwidth and create frequency-dependent phase errors. Digitizing the received signals proximate to the antennas allows loose distribution of the antennas and dramatically decreases the phase errors associated with waveguides. Accuracy of direction finding in the proposed micro-radar is then determined by the time accuracy of the digital processor and the sampling frequency. Multi-band, multi-functional antennas can be distributed around the perimeter of an Unmanned Aircraft System (UAS) and connected to the processor by a digital interface, or can be distributed among a swarm/formation of mini/micro UAS and connected wirelessly. Expendable micro-radars can be distributed around the perimeter of a defended object to create a multi-static radar network. Low-profile, low-altitude, high-speed targets, like small projectiles, create a Doppler shift in a narrow frequency band. This signal can be effectively filtered and detected with high probability. The proposed micro-radar can work in passive, monostatic or bistatic regimes.
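    A back-of-envelope check of the narrowband Doppler claim (the carrier frequency and projectile speed are invented values):

    ```python
    C = 3.0e8                              # speed of light, m/s

    def doppler_shift(v_radial, f_carrier):
        return 2.0 * v_radial * f_carrier / C   # monostatic Doppler, Hz

    # ~60 kHz shift for a 900 m/s projectile at a 10 GHz (X-band) carrier
    print(doppler_shift(900.0, 10e9))
    ```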

  1. Large Margin Multi-Modal Multi-Task Feature Extraction for Image Classification.

    PubMed

    Yong Luo; Yonggang Wen; Dacheng Tao; Jie Gui; Chao Xu

    2016-01-01

    The features used in many image analysis-based applications are frequently of very high dimension. Feature extraction offers several advantages in high-dimensional cases, and many recent studies have used multi-task feature extraction approaches, which often outperform single-task feature extraction approaches. However, most of these methods are limited in that they only consider data represented by a single type of feature, even though features usually represent images from multiple modalities. We, therefore, propose a novel large margin multi-modal multi-task feature extraction (LM3FE) framework for handling multi-modal features for image classification. In particular, LM3FE simultaneously learns the feature extraction matrix for each modality and the modality combination coefficients. In this way, LM3FE not only handles correlated and noisy features, but also utilizes the complementarity of different modalities to further help reduce feature redundancy in each modality. The large margin principle employed also helps to extract strongly predictive features, so that they are more suitable for prediction (e.g., classification). An alternating algorithm is developed for problem optimization, and each subproblem can be efficiently solved. Experiments on two challenging real-world image data sets demonstrate the effectiveness and superiority of the proposed method.

  2. Multi-task transfer learning deep convolutional neural network: application to computer-aided diagnosis of breast cancer on mammograms

    NASA Astrophysics Data System (ADS)

    Samala, Ravi K.; Chan, Heang-Ping; Hadjiiski, Lubomir M.; Helvie, Mark A.; Cha, Kenny H.; Richter, Caleb D.

    2017-12-01

    Transfer learning in deep convolutional neural networks (DCNNs) is an important step in its application to medical imaging tasks. We propose a multi-task transfer learning DCNN with the aim of translating the ‘knowledge’ learned from non-medical images to medical diagnostic tasks through supervised training and increasing the generalization capabilities of DCNNs by simultaneously learning auxiliary tasks. We studied this approach in an important application: classification of malignant and benign breast masses. With Institutional Review Board (IRB) approval, digitized screen-film mammograms (SFMs) and digital mammograms (DMs) were collected from our patient files and additional SFMs were obtained from the Digital Database for Screening Mammography. The data set consisted of 2242 views with 2454 masses (1057 malignant, 1397 benign). In single-task transfer learning, the DCNN was trained and tested on SFMs. In multi-task transfer learning, SFMs and DMs were used to train the DCNN, which was then tested on SFMs. N-fold cross-validation with the training set was used for training and parameter optimization. On the independent test set, the multi-task transfer learning DCNN was found to have significantly (p  =  0.007) higher performance compared to the single-task transfer learning DCNN. This study demonstrates that multi-task transfer learning may be an effective approach for training DCNN in medical imaging applications when training samples from a single modality are limited.
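    As an illustration of the multi-task idea (a shared, transferred backbone with one output head per modality), a hedged PyTorch sketch follows; the backbone choice, layer sizes and task names are assumptions for illustration, not the authors' architecture:

        import torch.nn as nn
        from torchvision import models

        class MultiTaskMammoNet(nn.Module):
            # One ImageNet-pretrained feature extractor shared by two tasks.
            def __init__(self):
                super().__init__()
                backbone = models.resnet18(weights="IMAGENET1K_V1")  # transfer learning
                self.features = nn.Sequential(*list(backbone.children())[:-1])
                self.head_sfm = nn.Linear(512, 2)  # main task: malignant vs benign on SFM
                self.head_dm = nn.Linear(512, 2)   # auxiliary task: same decision on DM

            def forward(self, x, task="sfm"):
                z = self.features(x).flatten(1)
                return self.head_sfm(z) if task == "sfm" else self.head_dm(z)

        # During training, batches from both modalities update the shared backbone,
        # while each head is trained only on its own task's loss.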

  3. Improving multi-tasking ability through action videogames.

    PubMed

    Chiappe, Dan; Conger, Mark; Liao, Janet; Caldwell, J Lynn; Vu, Kim-Phuong L

    2013-03-01

    The present study examined whether action videogames can improve multi-tasking in high workload environments. Two groups with no action videogame experience were pre-tested using the Multi-Attribute Task Battery (MATB). It consists of two primary tasks (tracking and fuel management) and two secondary tasks (systems monitoring and communication). One group served as a control group, while the second played action videogames a minimum of 5 h a week for 10 weeks. Both groups returned for a post-assessment on the MATB. We found the videogame treatment enhanced performance on the secondary tasks without interfering with the primary tasks. Our results demonstrate action videogames can increase people's ability to take on additional tasks by increasing attentional capacity. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  4. Landsat-1 and Landsat-2 flight evaluation

    NASA Technical Reports Server (NTRS)

    1975-01-01

    The flight performance of Landsat 1 and Landsat 2 is analyzed. Flight operations of the satellites are briefly summarized. Other topics discussed include: orbital parameters; power subsystem; attitude control subsystem; command/clock subsystem; telemetry subsystem; orbit adjust subsystem; magnetic moment compensating assembly; unified s-band/premodulation processor; electrical interface subsystem; thermal subsystem; narrowband tape recorders; wideband telemetry subsystem; attitude measurement sensor; wideband video tape recorders; return beam vidicon; multispectral scanner subsystem; and data collection subsystem.

  5. Synthetic Synchronisation: From Attention and Multi-Tasking to Negative Capability and Judgment

    ERIC Educational Resources Information Center

    Stables, Andrew

    2013-01-01

    Educational literature has tended to focus, explicitly and implicitly, on two kinds of task orientation: the ability either to focus on a single task, or to multi-task. A third form of orientation characterises many highly successful people. This is the ability to combine several tasks into one: to "kill two (or more) birds with one…

  6. Parallel processing data network of master and slave transputers controlled by a serial control network

    DOEpatents

    Crosetto, D.B.

    1996-12-31

    The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor to a plurality of slave processors to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer, a digital signal processor, a parallel transfer controller, and two three-port memory devices. A communication switch within each node connects it to a fast parallel hardware channel through which all high density data arrives or leaves the node. 6 figs.

  7. On nonlinear finite element analysis in single-, multi- and parallel-processors

    NASA Technical Reports Server (NTRS)

    Utku, S.; Melosh, R.; Islam, M.; Salama, M.

    1982-01-01

    Numerical solution of nonlinear equilibrium problems of structures by means of Newton-Raphson type iterations is reviewed. Each step of the iteration is shown to correspond to the solution of a linear problem, thereby establishing the feasibility of the finite element method for nonlinear analysis. Organization and flow of data for various types of digital computers, such as single-processor/single-level-memory, single-processor/two-level-memory, vector-processor/two-level-memory, and parallel processors, with and without substructuring (i.e., partitioning), are given. The effect of the relative costs of computation, memory, and data transfer on substructuring is shown. The idea of assigning comparable-size substructures to parallel processors is exploited. Under Cholesky-type factorization schemes, the efficiency of parallel processing is shown to decrease due to occasionally shared data, just as it does due to shared facilities.
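    The point that each Newton-Raphson step reduces to one linear solve can be made concrete with a toy two-unknown residual in Python (not a finite element code; the functions are invented for illustration):

        import numpy as np

        def residual(u):                      # nonlinear "internal force minus load"
            return np.array([u[0]**3 + u[1] - 1.0,
                             u[0] + 2.0 * u[1]**3 - 2.0])

        def tangent(u):                       # Jacobian (tangent stiffness)
            return np.array([[3.0 * u[0]**2, 1.0],
                             [1.0,            6.0 * u[1]**2]])

        u = np.array([1.0, 1.0])              # initial guess
        for _ in range(20):
            r = residual(u)
            if np.linalg.norm(r) < 1e-10:
                break
            # Each iteration is one *linear* problem, K_t du = -r, so existing
            # linear-equation machinery carries over to nonlinear analysis.
            u += np.linalg.solve(tangent(u), -r)
        print(u)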

  8. State University of New York Institute of Technology (SUNYIT) Summer Scholar Program

    DTIC Science & Technology

    2009-10-01

    Report period covered: March 2007 – April 2009. Even with access to the Arctic Regional Supercomputer Center (ARSC), evolving a 9/7 wavelet with four multi-resolution levels (MRA 4) involves ... evaluated over the multiple processing elements in the Cell processor. It was tested on Cell processors in a Sony Playstation 3 and on an IBM QS20 blade.

  9. Hardware based redundant multi-threading inside a GPU for improved reliability

    DOEpatents

    Sridharan, Vilas; Gurumurthi, Sudhanva

    2015-05-05

    A system and method for verifying computation output using computer hardware are provided. Instances of computation are generated and processed on hardware-based processors. As instances of computation are processed, each instance of computation receives a load accessible to other instances of computation. Instances of output are generated by processing the instances of computation. The instances of output are verified against each other in a hardware based processor to ensure accuracy of the output.
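    A software analogue of this idea, hedged as a plain-Python sketch with threads standing in for GPU hardware: run two instances of the same computation on the same load and verify the outputs against each other:

        from concurrent.futures import ThreadPoolExecutor

        def computation(load):
            # The unit of work to be duplicated (illustrative).
            return sum(x * x for x in load)

        def verified_run(load):
            # Two redundant instances process the same input; any
            # disagreement between their outputs signals a fault.
            with ThreadPoolExecutor(max_workers=2) as pool:
                a, b = pool.map(computation, [load, load])
            if a != b:
                raise RuntimeError("redundant instances disagree: possible fault")
            return a

        print(verified_run(range(1000)))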

  10. Multi-element germanium detectors for synchrotron applications

    DOE PAGES

    Rumaiz, A. K.; Kuczewski, A. J.; Mead, J.; ...

    2018-04-27

    In this paper, we have developed a series of monolithic multi-element germanium detectors, based on sensor arrays produced by the Forschungszentrum Jülich, and on application-specific integrated circuits (ASICs) developed at Brookhaven. Devices have been made with element counts ranging from 64 to 384. These detectors are being used at NSLS-II and APS for a range of diffraction experiments, both monochromatic and energy-dispersive. Compact and powerful readout systems have been developed, based on the new generation of FPGA system-on-chip devices, which provide closely coupled multi-core processors embedded in large gate arrays. Finally, we will discuss the technical details of the systems, and present some of the results from them.

  11. Multi-element germanium detectors for synchrotron applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rumaiz, A. K.; Kuczewski, A. J.; Mead, J.

    In this paper, we have developed a series of monolithic multi-element germanium detectors, based on sensor arrays produced by the Forschungszentrum Jülich, and on application-specific integrated circuits (ASICs) developed at Brookhaven. Devices have been made with element counts ranging from 64 to 384. These detectors are being used at NSLS-II and APS for a range of diffraction experiments, both monochromatic and energy-dispersive. Compact and powerful readout systems have been developed, based on the new generation of FPGA system-on-chip devices, which provide closely coupled multi-core processors embedded in large gate arrays. Finally, we will discuss the technical details of the systems, and present some of the results from them.

  12. Multiple-function multi-input/multi-output digital control and on-line analysis

    NASA Technical Reports Server (NTRS)

    Hoadley, Sherwood T.; Wieseman, Carol D.; Mcgraw, Sandra M.

    1992-01-01

    The design and capabilities of two digital controller systems (DCSs) for aeroelastic wind-tunnel models are described. The first allowed control of flutter while performing roll maneuvers with wing load control, as well as coordinating the acquisition, storage, and transfer of data for on-line analysis. This system, which employs several digital signal processor (DSP) boards programmed in high-level software languages, is housed in a SUN workstation environment. The second DCS provides a measure of wind-tunnel safety by functioning as a trip system during testing in the event of high model dynamic response or failure of the first DCS. The second DCS uses National Instruments LabVIEW software and hardware within a Macintosh environment.

  13. Design and test of a regenerative satellite transmultiplexer

    NASA Astrophysics Data System (ADS)

    Hung, Kenny King-Ming

    1993-05-01

    In a multiple access scheme for regenerative satellite communications, the bulk frequency division multiple access (FDMA) uplink signal is demodulated on board the satellite and then remodulated for time division multiplexing (TDM) downlink transmission. Conversion from frequency- to time-division multiplex format requires that the uplink signal be frequency demultiplexed and each individual carrier be subsequently demodulated. For thin-route applications, which consist of a large number of channels with fixed data rate, multicarrier demodulation can be accomplished efficiently by a digital transmultiplexer (TMUX) using a fast Fourier transform processor followed by a bank of per-channel processors. A time domain description of the TMUX algorithm is derived which elucidates how the TMUX functions. The per-channel processor performs timing and carrier recovery for optimum and coherent data detection. Timing recovery is necessarily achieved asynchronously by filter coefficient interpolation. Carrier recovery is performed using an all-digital phase-locked loop. The combination of both timing and carrier loops is investigated for a multi-user system. The performance of the overall system is assessed over a multi-user, additive white Gaussian noise channel for bit energy to noise power spectral density ratios down to zero dB.
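    The FDM-to-TDM core of such a TMUX can be sketched in a few lines of Python: with K equally spaced carriers, an FFT over each block of K composite samples separates the channels. This is the simplest critically sampled DFT filter bank; the real design adds per-channel filtering plus the timing and carrier recovery described above:

        import numpy as np

        K, nblocks = 8, 64                     # channels and symbol blocks (illustrative)
        rng = np.random.default_rng(1)
        symbols = (rng.choice([1, -1], (K, nblocks))
                   + 1j * rng.choice([1, -1], (K, nblocks))) / np.sqrt(2)  # QPSK per channel

        # Composite FDMA uplink: channel k rides on sub-carrier k (synthesis = IFFT).
        composite = np.fft.ifft(symbols, axis=0).reshape(-1, order="F")

        # Demultiplexer: block the composite stream and FFT each block;
        # row k of the result is channel k's demodulator input.
        recovered = np.fft.fft(composite.reshape(K, nblocks, order="F"), axis=0)
        assert np.allclose(recovered, symbols)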

  14. Corrosion Prediction with Parallel Finite Element Modeling for Coupled Hygro-Chemo Transport into Concrete under Chloride-Rich Environment

    PubMed Central

    Na, Okpin; Cai, Xiao-Chuan; Xi, Yunping

    2017-01-01

    Predicting chloride-induced corrosion is very important for the durable life of concrete structures. To simulate the durability performance of concrete structures more realistically, complex scientific methods and more accurate material models are needed. To obtain robust predictions of corrosion initiation time and to resolve the thin layer between the concrete surface and the reinforcement, a large number of fine meshes is also required. The purpose of this study is to propose a more realistic physical model of coupled hygro-chemo transport and to implement the model with a parallel finite element algorithm. Furthermore, a microclimate model with environmental humidity and seasonal temperature is adopted. As a result, a prediction model of chloride diffusion under unsaturated conditions was developed with parallel algorithms and was applied to an existing bridge to validate the model with multiple boundary conditions. As the number of processors increased, the computational time decreased up to an optimal processor count; beyond that point, the computational time increased because the communication time between the processors grew. The framework of the present model can be extended to simulate multi-species de-icing salt ingress into non-saturated concrete structures in future work. PMID:28772714
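    The reported behaviour (runtime falling with processor count up to an optimum, then rising as communication dominates) matches a simple cost model; the constants below are invented for illustration:

        import numpy as np

        def runtime(p, t_work=1000.0, t_comm=2.0):
            # Toy wall-clock model: perfectly parallel work (t_work / p)
            # plus communication cost that grows with the processor count.
            return t_work / p + t_comm * p

        procs = np.arange(1, 65)
        best = procs[np.argmin(runtime(procs))]
        print(best)   # close to sqrt(t_work / t_comm), ~22 in this model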

  15. Electrical Prototype Power Processor for the 30-cm Mercury electric propulsion engine

    NASA Technical Reports Server (NTRS)

    Biess, J. J.; Frye, R. J.

    1978-01-01

    An Electrical Prototype Power Processor has been designed to the latest electrical and performance requirements for a flight-type 30-cm ion engine and includes all the necessary power, command, telemetry and control interfaces for a typical electric propulsion subsystem. The power processor was configured into seven separate mechanical modules that would allow subassembly fabrication, test and integration into a complete power processor unit assembly. The conceptual mechanical packaging of the electrical prototype power processor unit demonstrated the relative location of power, high voltage and control electronic components to minimize electrical interactions and to provide adequate thermal control in a vacuum environment. Thermal control was accomplished with a heat pipe simulator attached to the base of the modules.

  16. The 20 kilovolt rocket borne electron accelerator. [equipment specifications

    NASA Technical Reports Server (NTRS)

    Harrison, R.

    1973-01-01

    The accelerator system is a preprogrammed multi-voltage system capable of operating at a current level of 1/2 ampere at the 20 kilovolt level. The five major functional areas which comprise this system are: (1) Silver zinc battery packs; (2) the electron gun assembly; (3) gun control and opening circuits; (4) the telemetry conditioning section; and (5) the power conversion section.

  17. Development of Parallel Code for the Alaska Tsunami Forecast Model

    NASA Astrophysics Data System (ADS)

    Bahng, B.; Knight, W. R.; Whitmore, P.

    2014-12-01

    The Alaska Tsunami Forecast Model (ATFM) is a numerical model used to forecast propagation and inundation of tsunamis generated by earthquakes and other means in both the Pacific and Atlantic Oceans. At the U.S. National Tsunami Warning Center (NTWC), the model is mainly used in a pre-computed fashion. That is, results for hundreds of hypothetical events are computed before alerts, and are accessed and calibrated with observations during tsunamis to immediately produce forecasts. ATFM uses the non-linear, depth-averaged, shallow-water equations of motion with multiply nested grids in two-way communications between domains of each parent-child pair as waves get closer to coastal waters. Even with the pre-computation the task becomes non-trivial as sub-grid resolution gets finer. Currently, the finest resolution Digital Elevation Models (DEM) used by ATFM are 1/3 arc-seconds. With a serial code, large or multiple areas of very high resolution can produce run-times that are unrealistic even in a pre-computed approach. One way to increase the model performance is code parallelization used in conjunction with a multi-processor computing environment. NTWC developers have undertaken an ATFM code-parallelization effort to streamline the creation of the pre-computed database of results with the long term aim of tsunami forecasts from source to high resolution shoreline grids in real time. Parallelization will also permit timely regeneration of the forecast model database with new DEMs; and, will make possible future inclusion of new physics such as the non-hydrostatic treatment of tsunami propagation. The purpose of our presentation is to elaborate on the parallelization approach and to show the compute speed increase on various multi-processor systems.

  18. Concurrent Learning of Control in Multi agent Sequential Decision Tasks

    DTIC Science & Technology

    2018-04-17

    The overall objective of this project was to develop multi-agent reinforcement learning (MARL) approaches for intelligent agents to autonomously learn distributed control policies in decentralized partially observable ...

  19. Advantages of Task-Specific Multi-Objective Optimisation in Evolutionary Robotics

    PubMed Central

    Trianni, Vito; López-Ibáñez, Manuel

    2015-01-01

    The application of multi-objective optimisation to evolutionary robotics is receiving increasing attention. A survey of the literature reveals the different possibilities it offers to improve the automatic design of efficient and adaptive robotic systems, and points to the successful demonstrations available for both task-specific and task-agnostic approaches (i.e., with or without reference to the specific design problem to be tackled). However, the advantages of multi-objective approaches over single-objective ones have not been clearly spelled out and experimentally demonstrated. This paper fills this gap for task-specific approaches: starting from well-known results in multi-objective optimisation, we discuss how to tackle commonly recognised problems in evolutionary robotics. In particular, we show that multi-objective optimisation (i) allows evolving a more varied set of behaviours by exploring multiple trade-offs of the objectives to optimise, (ii) supports the evolution of the desired behaviour through the introduction of objectives as proxies, (iii) avoids the premature convergence to local optima possibly introduced by multi-component fitness functions, and (iv) solves the bootstrap problem exploiting ancillary objectives to guide evolution in the early phases. We present an experimental demonstration of these benefits in three different case studies: maze navigation in a single robot domain, flocking in a swarm robotics context, and a strictly collaborative task in collective robotics. PMID:26295151

  20. Design and evaluation of a fault-tolerant multiprocessor using hardware recovery blocks

    NASA Technical Reports Server (NTRS)

    Lee, Y. H.; Shin, K. G.

    1982-01-01

    A fault-tolerant multiprocessor with a rollback recovery mechanism is discussed. The rollback mechanism is based on the hardware recovery block which is a hardware equivalent to the software recovery block. The hardware recovery block is constructed by consecutive state-save operations and several state-save units in every processor and memory module. When a fault is detected, the multiprocessor reconfigures itself to replace the faulty component and then the process originally assigned to the faulty component retreats to one of the previously saved states in order to resume fault-free execution. A mathematical model is proposed to calculate both the coverage of multi-step rollback recovery and the risk of restart. A performance evaluation in terms of task execution time is also presented.
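    The recovery-block idea can be sketched in Python: save state at regular points, and on a detected fault retreat to the latest saved state and resume. This is a software analogy only; the paper implements the state-save units in hardware, and all constants here are illustrative:

        import copy, random

        def run_with_rollback(state, steps, save_every=5, fail_prob=0.1):
            saved, step = copy.deepcopy(state), 0
            while step < steps:
                if step % save_every == 0:
                    saved = copy.deepcopy(state)      # consecutive state-save operation
                try:
                    if random.random() < fail_prob:   # injected transient fault
                        raise RuntimeError("fault detected")
                    state["x"] += 1                   # one unit of useful work
                    step += 1
                except RuntimeError:
                    state = copy.deepcopy(saved)      # retreat to the saved state
                    step = (step // save_every) * save_every
            return state

        random.seed(0)
        print(run_with_rollback({"x": 0}, steps=20))  # always completes with x == 20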

  1. Development of land based radar polarimeter processor system

    NASA Technical Reports Server (NTRS)

    Kronke, C. W.; Blanchard, A. J.

    1983-01-01

    The processing subsystem of a land based radar polarimeter was designed and constructed. This subsystem is labeled the remote data acquisition and distribution system (RDADS). The radar polarimeter, an experimental remote sensor, incorporates the RDADS to control all operations of the sensor. The RDADS uses industrial standard components including an 8-bit microprocessor based single board computer, analog input/output boards, a dynamic random access memory board, and power supplies. A high-speed digital electronics board was specially designed and constructed to control range-gating for the radar. A complete system of software programs was developed to operate the RDADS. The software uses a powerful real-time, multi-tasking executive package as an operating system. The hardware and software used in the RDADS are detailed. Future system improvements are recommended.

  2. A multi-threaded version of MCFM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Campbell, John M.; Ellis, R. Keith; Giele, Walter T.

    We report on our findings modifying MCFM using OpenMP to implement multi-threading. By using OpenMP, the modified MCFM will execute on any processor, automatically adjusting to the number of available threads. We then modified the integration routine VEGAS to distribute the event evaluation over the threads, while combining all events at the end of every iteration to optimize the numerical integration. Furthermore, we took special care so that the results of the Monte Carlo integration were independent of the number of threads used, to facilitate the validation of the OpenMP version of MCFM.
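    The key reproducibility trick, results independent of the number of threads, can be sketched by tying every event's random seed to its global index rather than to the worker that evaluates it. Plain Python multiprocessing stands in for OpenMP here; this is not MCFM/VEGAS code:

        import numpy as np
        from multiprocessing import Pool

        def event_weight(i):
            # One Monte Carlo event, seeded by its global index.
            x = np.random.default_rng(i).random()
            return 4.0 / (1.0 + x * x)          # toy integrand (integrates to pi)

        def integrate(n_events, n_workers):
            with Pool(n_workers) as pool:
                weights = pool.map(event_weight, range(n_events))
            return sum(weights) / n_events      # fixed summation order

        if __name__ == "__main__":
            # Identical results for any worker count.
            print(integrate(10000, 2) == integrate(10000, 8))   # True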

  3. The Department of Defense Superconductivity Research and Development (DSRD) Options. A Study of Possible Directions for Exploitation of Superconductivity in Military Applications.

    DTIC Science & Technology

    1987-07-01

    ... transmission lines; low-noise mm-wave detectors, mixers and amplifiers; multi-GHz chirp transform processors; high-performance small antenna arrays; multi-GHz A/D ... The overall advantages for HTS mm-wave receivers are very low quantum-limited noise, wide bandwidth, and low electrical power ... ELF communication (far term): extremely low frequency communication via magnetic waves has ...

  4. Advanced graphical user interface for multi-physics simulations using AMST

    NASA Astrophysics Data System (ADS)

    Hoffmann, Florian; Vogel, Frank

    2017-07-01

    Numerical modelling of particulate matter has gained much popularity in recent decades. Advanced Multi-physics Simulation Technology (AMST) is a state-of-the-art three-dimensional numerical modelling technique combining the eXtended Discrete Element Method (XDEM) with Computational Fluid Dynamics (CFD) and Finite Element Analysis (FEA) [1]. One major limitation of this code has been the lack of a graphical user interface (GUI), meaning that all pre-processing had to be done directly in an HDF5 file. This contribution presents the first graphical pre-processor developed for AMST.

  5. Multi-processing on supercomputers for computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; Mehta, Unmeel B.

    1990-01-01

    The MIMD concept is applied, through multitasking, with relatively minor modifications to an existing code for a single processor. This approach maps the available memory to multiple processors, exploiting the C-FORTRAN-Unix interface. An existing single processor algorithm is mapped without the need for developing a new algorithm. The procedure of designing a code utilizing this approach is automated with the Unix stream editor. A Multiple Processor Multiple Grid (MPMG) code is developed as a demonstration of this approach. This code solves the three-dimensional, Reynolds-averaged, thin-layer and slender-layer Navier-Stokes equations with an implicit, approximately factored and diagonalized method. This solver is applied to a generic, oblique-wing aircraft problem on a four-processor computer using one process for data management and nonparallel computations and three processes for pseudotime advance on three different grid systems.

  6. Dynamic Task Allocation in Multi-Hop Multimedia Wireless Sensor Networks with Low Mobility

    PubMed Central

    Jin, Yichao; Vural, Serdar; Gluhak, Alexander; Moessner, Klaus

    2013-01-01

    This paper presents a task allocation-oriented framework to enable efficient in-network processing and cost-effective multi-hop resource sharing for dynamic multi-hop multimedia wireless sensor networks with low node mobility, e.g., pedestrian speeds. The proposed system incorporates a fast task reallocation algorithm to quickly recover from possible network service disruptions, such as node or link failures. An evolutional self-learning mechanism based on a genetic algorithm continuously adapts the system parameters in order to meet the desired application delay requirements, while also achieving a sufficiently long network lifetime. Since the algorithm runtime incurs considerable time delay while updating task assignments, we introduce an adaptive window size to limit the delay periods and ensure an up-to-date solution based on node mobility patterns and device processing capabilities. To the best of our knowledge, this is the first study that yields multi-objective task allocation in a mobile multi-hop wireless environment under dynamic conditions. Simulations are performed in various settings, and the results show considerable performance improvement in extending network lifetime compared to heuristic mechanisms. Furthermore, the proposed framework provides noticeable reduction in the frequency of missing application deadlines. PMID:24135992

  7. Design and Analysis of Self-Adapted Task Scheduling Strategies in Wireless Sensor Networks

    PubMed Central

    Guo, Wenzhong; Xiong, Naixue; Chao, Han-Chieh; Hussain, Sajid; Chen, Guolong

    2011-01-01

    In a wireless sensor network (WSN), the usage of resources is usually highly related to the execution of tasks which consume a certain amount of computing and communication bandwidth. Parallel processing among sensors is a promising solution to provide the demanded computation capacity in WSNs. Task allocation and scheduling is a typical problem in the area of high performance computing. Although task allocation and scheduling in wired processor networks have been well studied in the past, their counterparts for WSNs remain largely unexplored. Existing traditional high performance computing solutions cannot be directly implemented in WSNs due to the limitations of WSNs such as limited resource availability and the shared communication medium. In this paper, a self-adapted task scheduling strategy for WSNs is presented. First, a multi-agent-based architecture for WSNs is proposed and a mathematical model of dynamic alliance is constructed for the task allocation problem. Then an effective discrete particle swarm optimization (PSO) algorithm for the dynamic alliance (DPSO-DA) with a well-designed particle position code and fitness function is proposed. A mutation operator which can effectively improve the algorithm's ability of global search and population diversity is also introduced in this algorithm. Finally, the simulation results show that the proposed solution can achieve significantly better performance than other algorithms. PMID:22163971

  8. High performance, low cost, self-contained, multipurpose PC based ground systems

    NASA Technical Reports Server (NTRS)

    Forman, Michael; Nickum, William; Troendly, Gregory

    1993-01-01

    The use of embedded processors greatly enhances the capabilities of personal computers when used for telemetry processing and command control center functions. Parallel architectures based on the use of transputers are shown to be very versatile and reusable, and the synergism between the PC and the embedded processor with transputers results in single-unit, low-cost workstations of 20 < MIPS ≤ 1000.

  9. Multi-Attribute Sequential Search

    ERIC Educational Resources Information Center

    Bearden, J. Neil; Connolly, Terry

    2007-01-01

    This article describes empirical and theoretical results from two multi-attribute sequential search tasks. In both tasks, the DM sequentially encounters options described by two attributes and must pay to learn the values of the attributes. In the "continuous" version of the task the DM learns the precise numerical value of an attribute when she…

  10. Hierarchical algorithms for modeling the ocean on hierarchical architectures

    NASA Astrophysics Data System (ADS)

    Hill, C. N.

    2012-12-01

    This presentation will describe an approach to using accelerator/co-processor technology that maps hierarchical, multi-scale modeling techniques to an underlying hierarchical hardware architecture. The focus of this work is on making effective use of both CPU and accelerator/co-processor parts of a system, for large scale ocean modeling. In the work, a lower resolution basin scale ocean model is locally coupled to multiple, "embedded", limited area higher resolution sub-models. The higher resolution models execute on co-processor/accelerator hardware and do not interact directly with other sub-models. The lower resolution basin scale model executes on the system CPU(s). The result is a multi-scale algorithm that aligns with hardware designs in the co-processor/accelerator space. We demonstrate this approach being used to substitute explicit process models for standard parameterizations. Code for our sub-models is implemented through a generic abstraction layer, so that we can target multiple accelerator architectures with different programming environments. We will present two application and implementation examples. One uses the CUDA programming environment and targets GPU hardware. This example employs a simple non-hydrostatic two dimensional sub-model to represent vertical motion more accurately. The second example uses a highly threaded three-dimensional model at high resolution. This targets a MIC/Xeon Phi like environment and uses sub-models as a way to explicitly compute sub-mesoscale terms. In both cases the accelerator/co-processor capability provides extra compute cycles that allow improved model fidelity for little or no extra wall-clock time cost.

  11. Efficient computation of the phylogenetic likelihood function on multi-gene alignments and multi-core architectures.

    PubMed

    Stamatakis, Alexandros; Ott, Michael

    2008-12-27

    The continuous accumulation of sequence data, for example, due to novel wet-laboratory techniques such as pyrosequencing, coupled with the increasing popularity of multi-gene phylogenies and emerging multi-core processor architectures that face problems of cache congestion, poses new challenges with respect to the efficient computation of the phylogenetic maximum-likelihood (ML) function. Here, we propose two approaches that can significantly speed up likelihood computations that typically represent over 95 per cent of the computational effort conducted by current ML or Bayesian inference programs. Initially, we present a method and an appropriate data structure to efficiently compute the likelihood score on 'gappy' multi-gene alignments. By 'gappy' we denote sampling-induced gaps owing to missing sequences in individual genes (partitions), i.e. not real alignment gaps. A first proof-of-concept implementation in RAxML indicates that this approach can accelerate inferences on large and gappy alignments by approximately one order of magnitude. Moreover, we present insights and initial performance results on multi-core architectures obtained during the transition from an OpenMP-based to a Pthreads-based fine-grained parallelization of the ML function.

  12. Comparing the Performance of Two Dynamic Load Distribution Methods

    NASA Technical Reports Server (NTRS)

    Kale, L. V.

    1987-01-01

    Parallel processing of symbolic computations on a message-passing multi-processor presents one challenge: to effectively utilize the available processors, the load must be distributed uniformly to all the processors. However, the structure of these computations cannot be predicted in advance, so static scheduling methods are not applicable. In this paper, we compare the performance of two dynamic, distributed load balancing methods with extensive simulation studies. The two schemes are: the Contracting Within a Neighborhood (CWN) scheme proposed by us, and the Gradient Model proposed by Lin and Keller. We conclude that although simpler, the CWN is significantly more effective at distributing the work than the Gradient Model.

  13. Custom instruction set NIOS-based OFDM processor for FPGAs

    NASA Astrophysics Data System (ADS)

    Meyer-Bäse, Uwe; Sunkara, Divya; Castillo, Encarnacion; Garcia, Antonio

    2006-05-01

    The orthogonal frequency division multiplexing (OFDM) spread spectrum technique, sometimes also called multi-carrier or discrete multi-tone modulation, is used in bandwidth-efficient communication systems in the presence of channel distortion. The benefits of OFDM are high spectral efficiency, resiliency to RF interference, and lower multi-path distortion. OFDM is the basis for the European digital audio broadcasting (DAB) standard, the global asymmetric digital subscriber line (ADSL) standard, and the IEEE 802.11 5.8 GHz band standard, with ongoing development in wireless local area networks. The modulator and demodulator in an OFDM system can be implemented by use of a parallel bank of filters based on the discrete Fourier transform (DFT); in case the number of subchannels is large (e.g., K > 25), the OFDM system is efficiently implemented by use of the fast Fourier transform (FFT) to compute the DFT. We have developed a custom FPGA-based Altera NIOS system to increase performance and programmability and to lower power in mobile wireless systems. The overall gain observed for a 1024-point FFT ranges between a factor of 3 and 16, depending on the multiplier used by the NIOS processor. A careful optimization described in the appendix yields a performance gain of up to 77% when compared with our preliminary results.
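    In its simplest form, the FFT-based implementation described above reduces to a few lines of Python (a hedged sketch with illustrative sizes; a cyclic prefix is included to suggest how multi-path spread is absorbed):

        import numpy as np

        K, CP = 1024, 64                        # sub-carriers and cyclic-prefix length
        rng = np.random.default_rng(2)
        qpsk = (rng.choice([1, -1], K) + 1j * rng.choice([1, -1], K)) / np.sqrt(2)

        tx = np.fft.ifft(qpsk)                  # modulator: bank of K carriers via one IFFT
        tx = np.concatenate([tx[-CP:], tx])     # cyclic prefix against multi-path distortion

        rx = tx[CP:]                            # ideal channel here: just drop the prefix
        assert np.allclose(np.fft.fft(rx), qpsk)  # demodulator: one FFT recovers all K sub-channels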

  14. Multi-threaded ATLAS simulation on Intel Knights Landing processors

    NASA Astrophysics Data System (ADS)

    Farrell, Steven; Calafiura, Paolo; Leggett, Charles; Tsulaia, Vakhtang; Dotti, Andrea; ATLAS Collaboration

    2017-10-01

    The Knights Landing (KNL) release of the Intel Many Integrated Core (MIC) Xeon Phi line of processors is a potential game changer for HEP computing. With 72 cores and deep vector registers, the KNL cards promise significant performance benefits for highly-parallel, compute-heavy applications. Cori, the newest supercomputer at the National Energy Research Scientific Computing Center (NERSC), was delivered to its users in two phases with the first phase online at the end of 2015 and the second phase now online at the end of 2016. Cori Phase 2 is based on the KNL architecture and contains over 9000 compute nodes with 96GB DDR4 memory. ATLAS simulation with the multithreaded Athena Framework (AthenaMT) is a good potential use-case for the KNL architecture and supercomputers like Cori. ATLAS simulation jobs have a high ratio of CPU computation to disk I/O and have been shown to scale well in multi-threading and across many nodes. In this paper we will give an overview of the ATLAS simulation application with details on its multi-threaded design. Then, we will present a performance analysis of the application on KNL devices and compare it to a traditional x86 platform to demonstrate the capabilities of the architecture and evaluate the benefits of utilizing KNL platforms like Cori for ATLAS production.

  15. Multi-GPU and multi-CPU accelerated FDTD scheme for vibroacoustic applications

    NASA Astrophysics Data System (ADS)

    Francés, J.; Otero, B.; Bleda, S.; Gallego, S.; Neipp, C.; Márquez, A.; Beléndez, A.

    2015-06-01

    The Finite-Difference Time-Domain (FDTD) method is applied to the analysis of vibroacoustic problems and to study the propagation of longitudinal and transversal waves in a stratified medium. The potential of the scheme and the relevance of each acceleration strategy for massive FDTD computations are demonstrated in this work. In this paper, we propose two new specific implementations of the two-dimensional FDTD scheme using multi-CPU and multi-GPU, respectively. In the first implementation, an open source message passing interface (OMPI) has been included in order to massively exploit the resources of a biprocessor station with two Intel Xeon processors. Moreover, in the CPU code version, the streaming SIMD extensions (SSE) and the advanced vector extensions (AVX) have been included, together with shared memory approaches that take advantage of multi-core platforms. The second implementation, the multi-GPU code version, is based on peer-to-peer communications available in CUDA on two GPUs (NVIDIA GTX 670). Subsequently, this paper presents an accurate analysis of the influence of the different code versions, including shared memory approaches, vector instructions and multi-processors (both CPU and GPU), and compares them in order to delimit the degree of improvement of distributed solutions based on multi-CPU and multi-GPU. The performance of both approaches was analysed, and it is demonstrated that adding shared memory schemes to CPU computing substantially improves the performance of vector instructions, enlarging the simulation sizes that make efficient use of the CPU cache memory. In that regime, GPU computing is roughly twice as fast as the fine-tuned CPU version for both one and two nodes. However, for massive computations explicit vector instructions are not worthwhile, since memory bandwidth is the limiting factor and performance tends to equal that of the sequential version with auto-vectorisation and the shared memory approach. In this scenario GPU computing is the best option, since it provides homogeneous behaviour. More specifically, the speedup of GPU computing reaches an upper limit of 12 for both one and two GPUs, with peak performance of 80 GFlops for one GPU and 146 GFlops for two GPUs. Finally, the method is applied to an earth crust profile in order to demonstrate the potential of our approach and the necessity of applying acceleration strategies in these types of applications.
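    The inner loop that all of these acceleration strategies target is a short stencil update; a minimal 2-D scalar-wave version in Python/NumPy follows (grid size and constants are illustrative, and the paper's vibroacoustic scheme is more elaborate):

        import numpy as np

        n, steps = 256, 200
        c, dx, dt = 343.0, 0.01, 1e-5          # keep c*dt/dx < 1/sqrt(2) for stability
        coef = (c * dt / dx) ** 2

        p = np.zeros((n, n)); p_prev = np.zeros((n, n))
        p[n // 2, n // 2] = 1.0                # pressure impulse source

        for _ in range(steps):
            # Leapfrog update of the wave equation; this slice arithmetic is
            # exactly the kind of kernel that SSE/AVX or CUDA implementations vectorise.
            lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
                   - 4.0 * p[1:-1, 1:-1])
            p_next = np.zeros_like(p)
            p_next[1:-1, 1:-1] = 2.0 * p[1:-1, 1:-1] - p_prev[1:-1, 1:-1] + coef * lap
            p_prev, p = p, p_next

        print(float(np.abs(p).max()))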

  16. Performance of VPIC on Sequoia

    NASA Astrophysics Data System (ADS)

    Nystrom, William

    2014-10-01

    Sequoia is a major DOE computing resource which is characteristic of future resources in that it has many threads per compute node, 64, and the individual processor cores are simpler and less powerful than cores on previous processors like Intel's Sandy Bridge or AMD's Opteron. An effort is in progress to port VPIC to the Blue Gene Q architecture of Sequoia and evaluate its performance. Results of this work will be presented on single node performance of VPIC as well as multi-node scaling.

  17. Multi-cluster processor operating only select number of clusters during each phase based on program statistic monitored at predetermined intervals

    DOEpatents

    Balasubramonian, Rajeev [Sandy, UT; Dwarkadas, Sandhya [Rochester, NY; Albonesi, David [Ithaca, NY

    2009-02-10

    In a processor having multiple clusters which operate in parallel, the number of clusters in use can be varied dynamically. At the start of each program phase, each configuration option is run for an interval to determine the optimal configuration, which is then used until the next phase change is detected. The optimum instruction interval is determined by starting with a minimum interval and doubling it until a low stability factor is reached.

  18. The Use of a UNIX-Based Workstation in the Information Systems Laboratory

    DTIC Science & Technology

    1989-03-01

    The conclusions of the research and the resulting recommendations are presented in Chapter III. These recommendations include how to manage ... required to run the program on a new system; these should not be significant changes. 2. Processing Environment: The UNIX processing environment is ... interactive with multi-tasking and multi-user capabilities. Multi-tasking refers to the fact that many programs can be run concurrently.

  19. The place-value of a digit in multi-digit numbers is processed automatically.

    PubMed

    Kallai, Arava Y; Tzelgov, Joseph

    2012-09-01

    The automatic processing of the place-value of digits in a multi-digit number was investigated in 4 experiments. Experiment 1 and two control experiments employed a numerical comparison task in which the place-value of a non-zero digit was varied in a string composed of zeros. Experiment 2 employed a physical comparison task in which strings of digits varied in their physical sizes. In both types of tasks, the place-value of the non-zero digit in the string was irrelevant to the task performed. Interference of the place-value information was found in both tasks. When the non-zero digit occupied a lower place-value, it was recognized slower as a larger digit or as written in a larger font size. We concluded that place-value in a multi-digit number is processed automatically. These results support the notion of a decomposed representation of multi-digit numbers in memory. PsycINFO Database Record (c) 2012 APA, all rights reserved.

  20. Parallel Task Management Library for MARTe

    NASA Astrophysics Data System (ADS)

    Valcarcel, Daniel F.; Alves, Diogo; Neto, Andre; Reux, Cedric; Carvalho, Bernardo B.; Felton, Robert; Lomas, Peter J.; Sousa, Jorge; Zabeo, Luca

    2014-06-01

    The Multithreaded Application Real-Time executor (MARTe) is a real-time framework with increasing popularity and support in the thermonuclear fusion community. It allows modular code to run in a multi-threaded environment, leveraging current multi-core processor (CPU) technology. One application that relies on the MARTe framework is the Joint European Torus (JET) tokamak WAll Load Limiter System (WALLS). It calculates and monitors the temperature on metal tiles and plasma facing components (PFCs) that can melt or flake if their temperature gets too high when exposed to power loads. One of the main time-consuming tasks in WALLS is the calculation of thermal diffusion models in real time. These models tend to be described by very large state-space models, making them perfect candidates for parallelisation. MARTe's traditional approach to task parallelisation is to split the problem into several Real-Time Threads, each responsible for a self-contained sequential execution of an input-to-output chain. This is usually possible, but it might not always be practical for algorithmic or technical reasons, and it might not scale easily with an increase in the number of available CPU cores. The WorkLibrary introduces a "GPU-like approach" of splitting work among the available cores of modern CPUs that is (i) straightforward to use in an application, (ii) scalable with the availability of cores, and (iii) achieves all of this without rewriting or recompiling the source code. The first part of this article explains the motivation behind the library, its architecture and implementation. The second part presents a real application for WALLS, a parallel version of a large state-space model describing the 2D thermal diffusion in a JET tile.
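    The "GPU-like" splitting can be illustrated with a Python sketch that chops a large state-space product y = A·x into row blocks, one per available core. This is an analogy only, not the WorkLibrary API; NumPy releases the GIL inside the matrix product, so threads do scale here:

        import os
        import numpy as np
        from concurrent.futures import ThreadPoolExecutor

        rng = np.random.default_rng(3)
        A, x = rng.random((4096, 4096)), rng.random(4096)   # large state-space model

        def split_matvec(A, x, workers=os.cpu_count()):
            # Row-wise split of y = A @ x over one worker per core.
            chunks = np.array_split(np.arange(A.shape[0]), workers)
            with ThreadPoolExecutor(max_workers=workers) as pool:
                parts = pool.map(lambda idx: A[idx] @ x, chunks)
            return np.concatenate(list(parts))

        assert np.allclose(split_matvec(A, x), A @ x)
        # The worker count adapts to the machine, so the same source code
        # scales with the available cores without rewriting or recompiling.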

  1. Performance impact of mutation operators of a subpopulation-based genetic algorithm for multi-robot task allocation problems.

    PubMed

    Liu, Chun; Kroll, Andreas

    2016-01-01

    Multi-robot task allocation determines the task sequence and distribution for a group of robots in multi-robot systems; it is a constrained combinatorial optimization problem that becomes more complex in the case of cooperative tasks, because they introduce additional spatial and temporal constraints. To solve multi-robot task allocation problems with cooperative tasks efficiently, a subpopulation-based genetic algorithm, a crossover-free genetic algorithm employing mutation operators and elitism selection in each subpopulation, is developed in this paper. Moreover, the impact of mutation operators (swap, insertion, inversion, displacement, and their various combinations) is analyzed when solving several industrial plant inspection problems. The experimental results show that: (1) the proposed genetic algorithm can obtain better solutions than the tested binary tournament genetic algorithm with partially mapped crossover; (2) inversion mutation performs better than the other tested mutation operators when solving problems without cooperative tasks, and the swap-inversion combination performs better than the other tested mutation operators/combinations when solving problems with cooperative tasks. As it is difficult to produce all desired effects with a single mutation operator, using multiple mutation operators (including both inversion and swap) is suggested when solving similar combinatorial optimization problems.
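    The four basic mutation operators studied can be written down directly for a chromosome represented as a flat task sequence (a simplified representation for illustration; the paper's chromosomes also encode robot assignment):

        import random

        def swap(seq):
            # Exchange two randomly chosen tasks.
            s = seq[:]
            i, j = random.sample(range(len(s)), 2)
            s[i], s[j] = s[j], s[i]
            return s

        def insertion(seq):
            # Remove one task and re-insert it at another position.
            s = seq[:]
            i, j = random.sample(range(len(s)), 2)
            s.insert(j, s.pop(i))
            return s

        def inversion(seq):
            # Reverse the order of a random sub-sequence of tasks.
            s = seq[:]
            i, j = sorted(random.sample(range(len(s)), 2))
            s[i:j + 1] = reversed(s[i:j + 1])
            return s

        def displacement(seq):
            # Cut a random sub-sequence and splice it in elsewhere.
            s = seq[:]
            i, j = sorted(random.sample(range(len(s)), 2))
            block, rest = s[i:j + 1], s[:i] + s[j + 1:]
            k = random.randrange(len(rest) + 1)
            return rest[:k] + block + rest[k:]

        random.seed(4)
        print(inversion(list(range(10))))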

  2. Multi-task learning with group information for human action recognition

    NASA Astrophysics Data System (ADS)

    Qian, Li; Wu, Song; Pu, Nan; Xu, Shulin; Xiao, Guoqiang

    2018-04-01

    Human action recognition is an important and challenging task in computer vision research, due to variations in human motion performance, interpersonal differences and recording settings. In this paper, we propose a novel multi-task learning framework with group information (MTL-GI) for accurate and efficient human action recognition. Specifically, we first obtain group information by calculating the mutual information between Gaussian components and action categories according to their latent relationship, and by clustering similar action categories into the same group using affinity propagation clustering. Additionally, in order to exploit the relationships among related tasks, we incorporate the group information into multi-task learning. Experimental results evaluated on two popular benchmarks (the UCF50 and HMDB51 datasets) demonstrate the superiority of our proposed MTL-GI framework.

  3. Development of the Multi-Level Seismic Receiver (MLSR)

    NASA Astrophysics Data System (ADS)

    Sleefe, G. E.; Engler, B. P.; Drozda, P. M.; Franco, R. J.; Morgan, Jeff

    1995-02-01

    The Advanced Geophysical Technology Department (6114) and the Telemetry Technology Development Department (2664) have, in conjunction with the Oil Recovery Technology Partnership, developed a Multi-Level Seismic Receiver (MLSR) for use in crosswell seismic surveys. The MLSR was designed and evaluated with the significant support of many industry partners in the oil exploration industry. The unit was designed to record and process superior quality seismic data while operating in severe borehole environments, including high temperature (up to 200 C) and static pressure (10,000 psi). This development utilized state-of-the-art technology in transducers, data acquisition, and real-time data communication and processing. The mechanical design of the receiver was carefully modeled and evaluated to ensure excellent signal coupling into the receiver.

  4. Ionospheric Multi-Point Measurements Using Tethered Satellite Sensors

    NASA Technical Reports Server (NTRS)

    Gilchrist, B. E.; Heelis, R. A.; Raitt, W. J.

    1998-01-01

    Many scientific questions concerning the distribution of electromagnetic fields and plasma structures in the ionosphere require measurements over relatively small temporal and spatial scales with as little ambiguity as possible. It is also often necessary to differentiate several geophysical parameters between horizontal and vertical gradients unambiguously. The availability of multiple tethered satellites or sensors, so-called "pearls-on-a-string," may make the necessary measurements practical. In this report we provide two examples of scientific questions which could benefit from such measurements (1) high-latitude magnetospheric-ionospheric coupling; and, (2) plasma structure impact on large and small-scale electrodynamics. Space tether state-of-the-art and special technical considerations addressing mission lifetime, sensor pointing, and multi-stream telemetry are reviewed.

  5. Metacognition of Multi-Tasking: How Well Do We Predict the Costs of Divided Attention?

    PubMed Central

    Finley, Jason R.; Benjamin, Aaron S.; McCarley, Jason S.

    2014-01-01

    Risky multi-tasking, such as texting while driving, may occur because people misestimate the costs of divided attention. In two experiments, participants performed a computerized visual-manual tracking task in which they attempted to keep a mouse cursor within a small target that moved erratically around a circular track. They then separately performed an auditory n-back task. After practicing both tasks separately, participants received feedback on their single-task tracking performance and predicted their dual-task tracking performance before finally performing the two tasks simultaneously. Most participants correctly predicted reductions in tracking performance under dual-task conditions, with a majority overestimating the costs of dual-tasking. However, the between-subjects correlation between predicted and actual performance decrements was near zero. This combination of results suggests that people do anticipate costs of multi-tasking, but have little metacognitive insight on the extent to which they are personally vulnerable to the risks of divided attention, relative to other people. PMID:24490818

  6. Effects of two doses of glucose and a caffeine-glucose combination on cognitive performance and mood during multi-tasking.

    PubMed

    Scholey, Andrew; Savage, Karen; O'Neill, Barry V; Owen, Lauren; Stough, Con; Priestley, Caroline; Wetherell, Mark

    2014-09-01

    This study assessed the effects of two doses of glucose and a caffeine-glucose combination on mood and performance of an ecologically valid, computerised multi-tasking platform. Following a double-blind, placebo-controlled, randomised, parallel-groups design, 150 healthy adults (mean age 34.78 years) consumed drinks containing placebo, 25 g glucose, 60 g glucose or 60 g glucose with 40 mg caffeine. They completed a multi-tasking framework at baseline and then 30 min following drink consumption, with mood assessments immediately before and after the multi-tasking framework. Blood glucose and salivary caffeine were co-monitored. The caffeine-glucose group had significantly better total multi-tasking scores than the placebo or 60 g glucose groups and was significantly faster at mental arithmetic tasks than either glucose drink group. There were no significant treatment effects on mood. Caffeine and glucose levels confirmed compliance with overnight abstinence/fasting, respectively, and followed the predicted post-drink patterns. These data suggest that co-administration of glucose and caffeine allows greater allocation of attentional resources than placebo or glucose alone. At present, we cannot rule out the possibility that the effects are due to caffeine alone; future studies should aim at disentangling caffeine and glucose effects. © 2014 The Authors. Human Psychopharmacology: Clinical and Experimental published by John Wiley & Sons, Ltd.

  7. Effects of two doses of glucose and a caffeine–glucose combination on cognitive performance and mood during multi-tasking

    PubMed Central

    Scholey, Andrew; Savage, Karen; O'Neill, Barry V; Owen, Lauren; Stough, Con; Priestley, Caroline; Wetherell, Mark

    2014-01-01

    Background: This study assessed the effects of two doses of glucose and a caffeine–glucose combination on mood and performance of an ecologically valid, computerised multi-tasking platform. Materials and methods: Following a double-blind, placebo-controlled, randomised, parallel-groups design, 150 healthy adults (mean age 34.78 years) consumed drinks containing placebo, 25 g glucose, 60 g glucose or 60 g glucose with 40 mg caffeine. They completed a multi-tasking framework at baseline and then 30 min following drink consumption, with mood assessments immediately before and after the multi-tasking framework. Blood glucose and salivary caffeine were co-monitored. Results: The caffeine–glucose group had significantly better total multi-tasking scores than the placebo or 60 g glucose groups and was significantly faster at mental arithmetic tasks than either glucose drink group. There were no significant treatment effects on mood. Caffeine and glucose levels confirmed compliance with overnight abstinence/fasting, respectively, and followed the predicted post-drink patterns. Conclusion: These data suggest that co-administration of glucose and caffeine allows greater allocation of attentional resources than placebo or glucose alone. At present, we cannot rule out the possibility that the effects are due to caffeine alone; future studies should aim at disentangling caffeine and glucose effects. PMID:25196040

  8. Parallel processing data network of master and slave transputers controlled by a serial control network

    DOEpatents

    Crosetto, Dario B.

    1996-01-01

    The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor (100) to a plurality of slave processors (200) to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer (104), a digital signal processor (114), a parallel transfer controller (106), and two three-port memory devices. A communication switch (108) within each node (100) connects it to a fast parallel hardware channel (70) through which all high density data arrives or leaves the node.

  9. LANDSAT-2 and LANDSAT-3 Flight evaluation report

    NASA Technical Reports Server (NTRS)

    Winchester, T. W.

    1978-01-01

    Flight performance analyses of LANDSAT 2 and LANDSAT 3 are presented for the period July 1978 to October 1978. Spacecraft operations and orbital parameters are summarized for each spacecraft. Data are provided on the performance and operation of the following subsystems onboard the spacecraft: power; attitude control; command/clock; telemetry; orbit adjust; magnetic moment compensating assembly; unified S-band/premodulation processor; electrical interface; thermal; narrowband tape recorders; wideband telemetry; attitude measurement sensor; wideband video tape recorders; return beam vidicon; multispectral scanner subsystem; and data collection.

  10. A system-level view of optimizing high-channel-count wireless biosignal telemetry.

    PubMed

    Chandler, Rodney J; Gibson, Sarah; Karkare, Vaibhav; Farshchi, Shahin; Marković, Dejan; Judy, Jack W

    2009-01-01

    In this paper we perform a system-level analysis of a wireless biosignal telemetry system. We perform an analysis of each major system component (e.g., analog front end, analog-to-digital converter, digital signal processor, and wireless link), in which we consider physical, algorithmic, and design limitations. Since there are a wide range applications for wireless biosignal telemetry systems, each with their own unique set of requirements for key parameters (e.g., channel count, power dissipation, noise level, number of bits, etc.), our analysis is equally broad. The net result is a set of plots, in which the power dissipation for each component and as the system as a whole, are plotted as a function of the number of channels for different architectural strategies. These results are also compared to existing implementations of complete wireless biosignal telemetry systems.

  11. Multi-gigabit optical interconnects for next-generation on-board digital equipment

    NASA Astrophysics Data System (ADS)

    Venet, Norbert; Favaro, Henri; Sotom, Michel; Maignan, Michel; Berthon, Jacques

    2017-11-01

    Parallel optical interconnects are experimentally assessed as a technology that may offer the high-throughput data communication capabilities required by next-generation on-board digital processing units. An optical backplane interconnect was breadboarded on the basis of a digital transparent processor that provides flexible connectivity and variable bandwidth in telecom missions with multi-beam antenna coverage. The unit selected for the demonstration required that more than tens of Gbit/s be supported by the backplane. The demonstration made use of commercial parallel optical link modules at 850 nm wavelength, with 12 channels running at up to 2.5 Gbit/s. A flexible optical fibre circuit was developed to route board-to-board connections; it was plugged into the optical transmitter and receiver modules through 12-fibre MPO connectors. BER below 10^-14 and optical link budgets in excess of 12 dB were measured, which would make it possible to integrate broadcasting. Integration of the optical backplane interconnect was successfully demonstrated by validating the overall digital processor functionality.

  12. Multi-gigabit optical interconnects for next-generation on-board digital equipment

    NASA Astrophysics Data System (ADS)

    Venet, Norbert; Favaro, Henri; Sotom, Michel; Maignan, Michel; Berthon, Jacques

    2004-06-01

    Parallel optical interconnects are experimentally assessed as a technology that may offer the high-throughput data communication capabilities required by next-generation on-board digital processing units. An optical backplane interconnect was breadboarded on the basis of a digital transparent processor that provides flexible connectivity and variable bandwidth in telecom missions with multi-beam antenna coverage. The unit selected for the demonstration required that more than tens of Gbit/s be supported by the backplane. The demonstration made use of commercial parallel optical link modules at 850 nm wavelength, with 12 channels running at up to 2.5 Gbit/s. A flexible optical fibre circuit was developed to route board-to-board connections; it was plugged into the optical transmitter and receiver modules through 12-fibre MPO connectors. BER below 10^-14 and optical link budgets in excess of 12 dB were measured, which would make it possible to integrate broadcasting. Integration of the optical backplane interconnect was successfully demonstrated by validating the overall digital processor functionality.
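
    The figures quoted in the two records above fix the aggregate capacity by simple arithmetic. A minimal check, using only the numbers given (12 channels at 2.5 Gbit/s, link budgets in excess of 12 dB) plus the standard ~3 dB cost of a 1:2 passive optical split:

```python
# Aggregate capacity and broadcast headroom implied by the figures above.
channels = 12
rate_gbps = 2.5
print(f"aggregate throughput: {channels * rate_gbps:.0f} Gbit/s per module")

# A link budget in excess of 12 dB is a power ratio of 10**(12/10) ~ 15.8x.
# Ignoring excess loss, each 1:2 passive split costs ~3 dB, so the margin
# supports roughly 12/3 = 4 split stages, i.e. a 1:16 broadcast.
margin_db = 12.0
ratio = 10 ** (margin_db / 10)
stages = int(margin_db // 3)
print(f"{margin_db:.0f} dB margin = {ratio:.1f}x power ratio "
      f"-> ~{stages} split stages (1:{2 ** stages} broadcast)")
```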

  13. Multi-task Gaussian process for imputing missing data in multi-trait and multi-environment trials.

    PubMed

    Hori, Tomoaki; Montcho, David; Agbangla, Clement; Ebana, Kaworu; Futakuchi, Koichi; Iwata, Hiroyoshi

    2016-11-01

    A method based on a multi-task Gaussian process using self-measuring similarity gave increased accuracy for imputing missing phenotypic data in multi-trait and multi-environment trials. Multi-environment trial (MET) data often encounter the problem of missing data. Accurate imputation of missing data makes subsequent analysis more effective and the results easier to understand. Moreover, accurate imputation may help to reduce the cost of phenotyping for thinned-out lines tested in METs. METs are generally performed for multiple traits that are correlated to each other. Correlation among traits can be useful information for imputation, but single-trait-based methods cannot utilize information shared by traits that are correlated. In this paper, we propose imputation methods based on a multi-task Gaussian process (MTGP) using self-measuring similarity kernels reflecting relationships among traits, genotypes, and environments. This framework allows us to use genetic correlations among multi-trait multi-environment data and also to combine MET data and marker genotype data. We compared the accuracy of three MTGP methods and iterative regularized PCA using rice MET data. Two scenarios for the generation of missing data at various missing rates were considered. The MTGP methods achieved better imputation accuracy than regularized PCA, especially at high missing rates. Under the 'uniform' scenario, in which missing data arise randomly, inclusion of marker genotype data in the imputation increased the imputation accuracy at high missing rates. Under the 'fiber' scenario, in which missing data arise in all traits for some combinations of genotypes and environments, the inclusion of marker genotype data decreased the imputation accuracy for most traits while increasing the accuracy remarkably for a few traits. The proposed methods will be useful for solving the missing data problem in MET data.
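
    The core of the approach, a Gaussian process whose covariance over (genotype, trait) cells is a product of similarity kernels measured from the data itself, can be sketched compactly. The toy below is an assumption-laden illustration, not the paper's method: it uses synthetic data, drops the environment and marker-genotype kernels, and estimates the "self-measuring" similarities with plain correlations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MET-like data: 30 genotypes x 4 correlated traits, 25% missing.
G, T = 30, 4
latent = rng.normal(size=(G, 1))
Y = latent @ rng.normal(size=(1, T)) + 0.3 * rng.normal(size=(G, T))
mask = rng.random((G, T)) < 0.25
Y_obs = np.where(mask, np.nan, Y)

# "Self-measuring" similarities: estimated from the observed data itself.
Yc = np.where(mask, 0.0, Y_obs - np.nanmean(Y_obs, axis=0))
K_trait = np.nan_to_num(np.corrcoef(Yc, rowvar=False))   # trait x trait
K_geno = np.nan_to_num(np.corrcoef(Yc))                   # genotype x genotype
np.fill_diagonal(K_trait, 1.0)
np.fill_diagonal(K_geno, 1.0)

# Product (Kronecker) kernel over all (genotype, trait) cells.
K = np.kron(K_geno, K_trait)

y = Y_obs.ravel()                       # cell (g, t) sits at index g*T + t
obs = ~np.isnan(y)
mu = y[obs].mean()
K_oo = K[np.ix_(obs, obs)] + 0.1 * np.eye(obs.sum())      # observation noise
K_mo = K[np.ix_(~obs, obs)]
y_hat = mu + K_mo @ np.linalg.solve(K_oo, y[obs] - mu)    # GP posterior mean

rmse = np.sqrt(np.mean((y_hat - Y.ravel()[~obs]) ** 2))
print(f"imputation RMSE on the held-out cells: {rmse:.3f}")
```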

  14. Development and Evaluation of Vectorised and Multi-Core Event Reconstruction Algorithms within the CMS Software Framework

    NASA Astrophysics Data System (ADS)

    Hauth, T.; Innocente and, V.; Piparo, D.

    2012-12-01

    The processing of data acquired by the CMS detector at the LHC is carried out with an object-oriented C++ software framework: CMSSW. With the increasing luminosity delivered by the LHC, the treatment of recorded data requires extraordinarily large computing resources, also in terms of CPU usage. A possible solution to cope with this task is the exploitation of the features offered by the latest microprocessor architectures. Modern CPUs provide several vector units, the capacity of which is growing steadily with the introduction of new processor generations. Moreover, an increasing number of cores per die is offered by the main vendors, even on consumer hardware. Most recent C++ compilers provide facilities to take advantage of such innovations, either via explicit statements in the program sources or by automatically adapting the generated machine instructions to the available hardware, without the need to modify the existing code base. Programming techniques to implement reconstruction algorithms and optimised data structures are presented that aim at scalable vectorization and parallelization of the calculations. One of their features is the usage of new language features of the C++11 standard. Portions of the CMSSW framework are illustrated which have been found to be especially profitable for the application of vectorization and multi-threading techniques. Specific utility components have been developed to help vectorization and parallelization; they can easily become part of a larger common library. To conclude, careful measurements are described, which show the execution speedups achieved via vectorised and multi-threaded code in the context of CMSSW.
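
    The data-layout side of such optimisations translates directly into any array language. The NumPy sketch below is only an analogy to the C++ techniques discussed (it is not CMSSW code): it contrasts an array-of-structures layout, which defeats vectorization, with a structure-of-arrays layout that vector units can stream over.

```python
import numpy as np
import time

N = 200_000

# Array-of-structures: one Python dict per hit; the loop below cannot be
# vectorized and pays per-element interpreter overhead.
hits_aos = [{"x": float(i), "y": 2.0 * i, "z": 0.5 * i} for i in range(N)]
t0 = time.perf_counter()
r_aos = [(h["x"] ** 2 + h["y"] ** 2 + h["z"] ** 2) ** 0.5 for h in hits_aos]
t_aos = time.perf_counter() - t0

# Structure-of-arrays: contiguous columns that vector hardware streams over.
x = np.arange(N, dtype=np.float64)
y, z = 2.0 * x, 0.5 * x
t0 = time.perf_counter()
r_soa = np.sqrt(x * x + y * y + z * z)     # one vectorized pass
t_soa = time.perf_counter() - t0

print(f"AoS loop: {t_aos*1e3:.1f} ms   SoA vectorized: {t_soa*1e3:.1f} ms")
assert np.allclose(r_soa, r_aos)
```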

  15. Two-dimensional systolic-array architecture for pixel-level vision tasks

    NASA Astrophysics Data System (ADS)

    Vijverberg, Julien A.; de With, Peter H. N.

    2010-05-01

    This paper presents ongoing work on the design of a two-dimensional (2D) systolic array for image processing. This component is designed to operate on a multi-processor system-on-chip. In contrast with other 2D systolic-array architectures and many other hardware accelerators, we investigate the applicability of executing multiple tasks in a time-interleaved fashion on the Systolic Array (SA). This leads to a lower external memory bandwidth and better load balancing of the tasks on the different processing tiles. To enable the interleaving of tasks, we add a shadow-state register for fast task switching. To reduce the number of accesses to the external memory, we propose to share the communication assist between consecutive tasks. A preliminary, non-functional version of the SA has been synthesized for an XV4S25 FPGA device and yields a maximum clock frequency of 150 MHz, requiring 1,447 slices and 5 memory blocks. Mapping tasks from video content-analysis applications in the literature onto the SA yields reductions in execution time of 1-2 orders of magnitude compared to a software implementation. We conclude that the choice of an SA architecture is useful, but that a scaled version of the SA, featuring less logic with fewer processing and pipeline stages and a lower clock frequency, would be sufficient for a video-analysis system-on-chip.
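
    The shadow-state mechanism can be pictured with a few lines of pseudo-hardware. The sketch below is a hypothetical rendering of the idea, not the actual SA design: the next task's configuration is preloaded into a shadow copy while the current task runs, so a task switch costs a single swap rather than a reconfiguration stall.

```python
# Minimal sketch of shadow-state task switching, assuming each "task" is
# just a configuration (here, a single coefficient) applied by the array.
# Names and structure are hypothetical, not the actual SA design.

class ShadowStateArray:
    def __init__(self):
        self.active = None    # state the processing elements currently use
        self.shadow = None    # state being preloaded for the next task

    def preload(self, task_state):
        """Load the next task's state while the current task keeps running."""
        self.shadow = task_state

    def switch(self):
        """Single-cycle task switch: swap the active and shadow registers."""
        self.active, self.shadow = self.shadow, self.active

    def process(self, pixel_block):
        coeff = self.active["coeff"]
        return [coeff * p for p in pixel_block]

sa = ShadowStateArray()
sa.preload({"name": "task-A", "coeff": 2})
sa.switch()
print(sa.process([1, 2, 3]))      # task A on one block of pixels
sa.preload({"name": "task-B", "coeff": -1})
sa.switch()                        # interleave: no reconfiguration stall
print(sa.process([1, 2, 3]))      # task B on the next block
```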

  16. Robust media processing on programmable power-constrained systems

    NASA Astrophysics Data System (ADS)

    McVeigh, Jeff

    2005-03-01

    To achieve consumer-level quality, media systems must process continuous streams of audio and video data while maintaining exacting tolerances on sampling rate, jitter, synchronization, and latency. While it is relatively straightforward to design fixed-function hardware implementations to satisfy worst-case conditions, there is a growing trend to utilize programmable multi-tasking solutions for media applications. The flexibility of these systems enables support for multiple current and future media formats, which can reduce design costs and time-to-market. This paper provides practical engineering solutions to achieve robust media processing on such systems, with specific attention given to power-constrained platforms. The techniques covered in this article utilize the fundamental concepts of algorithm and software optimization, software/hardware partitioning, stream buffering, hierarchical prioritization, and system resource and power management. A novel enhancement to dynamically adjust processor voltage and frequency based on buffer fullness to reduce system power consumption is examined in detail. The application of these techniques is provided in a case study of a portable video player implementation based on a general-purpose processor running a non real-time operating system that achieves robust playback of synchronized H.264 video and MP3 audio from local storage and streaming over 802.11.
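
    The buffer-fullness enhancement lends itself to a compact illustration. The sketch below is a hypothetical controller, not the paper's implementation: the operating points, thresholds, and hysteresis policy are all made up, and dynamic power is approximated by the usual V^2*f proportionality.

```python
# Sketch of buffer-fullness-driven frequency scaling, assuming a
# hypothetical platform with discrete (frequency, voltage) operating
# points. Running just fast enough to keep the decode buffer from
# draining minimizes power, since dynamic power scales ~ V^2 * f.

OP_POINTS = [(200, 0.9), (400, 1.0), (600, 1.1), (800, 1.2)]  # (MHz, V)

def pick_operating_point(fullness, current, low=0.25, high=0.75):
    """Step frequency up when the buffer risks underflow, down when the
    buffer is comfortably full. `fullness` is in [0, 1]."""
    if fullness < low and current < len(OP_POINTS) - 1:
        current += 1          # buffer draining -> speed up
    elif fullness > high and current > 0:
        current -= 1          # buffer ample -> slow down, save power
    return current

idx = 1
for fullness in (0.8, 0.6, 0.3, 0.1, 0.05, 0.5, 0.9):
    idx = pick_operating_point(fullness, idx)
    mhz, volts = OP_POINTS[idx]
    rel_power = (volts ** 2) * mhz / ((1.2 ** 2) * 800)
    print(f"buffer {fullness:4.2f} -> {mhz} MHz @ {volts} V "
          f"(~{rel_power:4.0%} of max power)")
```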

  17. A universal computer control system for motors

    NASA Technical Reports Server (NTRS)

    Szakaly, Zoltan F. (Inventor)

    1991-01-01

    A control system for a multi-motor system, such as a space telerobot, having a remote computational node and a local computational node interconnected by a high-speed data link is described. A Universal Computer Control System (UCCS) for the telerobot is located at each node. Each node is provided with a multibus computer system characterized by a plurality of processors connected to a common bus, including at least one command processor. The command processor communicates over the bus with a plurality of joint controller cards. A plurality of direct-current torque motors, of the type used in telerobot joints and telerobot hand-held controllers, are connected to the controller cards and respond to digital control signals from the command processor. Essential motor operating parameters are sensed by analog sensing circuits, and the sensed analog signals are converted to digital signals for storage at the controller cards, where such signals can be read during an address read/write cycle of the command processor.

  18. Vascular system modeling in parallel environment - distributed and shared memory approaches

    PubMed Central

    Jurczuk, Krzysztof; Kretowski, Marek; Bezy-Wendling, Johanne

    2011-01-01

    The paper presents two approaches in parallel modeling of vascular system development in internal organs. In the first approach, new parts of tissue are distributed among processors and each processor is responsible for perfusing its assigned parts of tissue to all vascular trees. Communication between processors is accomplished by passing messages and therefore this algorithm is perfectly suited for distributed memory architectures. The second approach is designed for shared memory machines. It parallelizes the perfusion process during which individual processing units perform calculations concerning different vascular trees. The experimental results, performed on a computing cluster and multi-core machines, show that both algorithms provide a significant speedup. PMID:21550891

  19. A hybrid algorithm for parallel molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Mangiardi, Chris M.; Meyer, R.

    2017-10-01

    This article describes algorithms for the hybrid parallelization and SIMD vectorization of molecular dynamics simulations with short-range forces. The parallelization method combines domain decomposition with a thread-based parallelization approach. The goal of the work is to enable efficient simulations of very large (tens of millions of atoms) and inhomogeneous systems on many-core processors with hundreds or thousands of cores and SIMD units with large vector sizes. In order to test the efficiency of the method, simulations of a variety of configurations with up to 74 million atoms have been performed. Results are shown that were obtained on multi-core systems with Sandy Bridge and Haswell processors as well as systems with Xeon Phi many-core processors.
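
    The two-level idea, space cut into domains with the per-domain work spread across parallel workers, can be mocked up briefly. The sketch below is an illustration only, not the authors' code: it uses a Python thread pool over cells of a random toy configuration and computes a Lennard-Jones energy rather than forces, which keeps the parallel reduction race-free.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(1)

# Toy configuration: n particles in a box, cut into cells with edge >= the
# cutoff, so all interacting pairs live in neighbouring cells.
L, rc, n = 10.0, 2.5, 2000
pos = rng.random((n, 3)) * L
ncell = int(L // rc)
idx3 = np.minimum((pos / rc).astype(int), ncell - 1)
cell_id = idx3[:, 0] * ncell**2 + idx3[:, 1] * ncell + idx3[:, 2]

def cell_energy(c):
    """Pair energy of particles in cell c against all particles in range.
    (Brute force over the whole box for brevity; a real code would visit
    only the 27 neighbouring cells of c.)"""
    mine = pos[cell_id == c]
    if mine.size == 0:
        return 0.0
    d = np.linalg.norm(mine[:, None, :] - pos[None, :, :], axis=-1)
    d = d[(d > 0.8) & (d < rc)]        # skip self and unphysical overlaps
    e = 4.0 * ((1.0 / d) ** 12 - (1.0 / d) ** 6)   # Lennard-Jones pairs
    return 0.5 * e.sum()                # every pair is visited twice

# Outer level: cells (standing in for spatial domains / MPI ranks);
# inner level: a thread pool (standing in for threads within a domain).
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(cell_energy, range(ncell**3)))
print(f"total short-range energy: {total:.2f}")
```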

  20. MILC Code Performance on High End CPU and GPU Supercomputer Clusters

    NASA Astrophysics Data System (ADS)

    DeTar, Carleton; Gottlieb, Steven; Li, Ruizi; Toussaint, Doug

    2018-03-01

    With recent developments in parallel supercomputing architecture, many core, multi-core, and GPU processors are now commonplace, resulting in more levels of parallelism, memory hierarchy, and programming complexity. It has been necessary to adapt the MILC code to these new processors starting with NVIDIA GPUs, and more recently, the Intel Xeon Phi processors. We report on our efforts to port and optimize our code for the Intel Knights Landing architecture. We consider performance of the MILC code with MPI and OpenMP, and optimizations with QOPQDP and QPhiX. For the latter approach, we concentrate on the staggered conjugate gradient and gauge force. We also consider performance on recent NVIDIA GPUs using the QUDA library.

  1. Multi-kilowatt modularized spacecraft power processing system development

    NASA Technical Reports Server (NTRS)

    Andrews, R. E.; Hayden, J. H.; Hedges, R. T.; Rehmann, D. W.

    1975-01-01

    A review of existing information pertaining to spacecraft power processing systems and equipment was accomplished with a view towards applicability to the modularization of multi-kilowatt power processors. Power requirements for future spacecraft were determined from the NASA mission model-shuttle systems payload data study which provided the limits for modular power equipment capabilities. Three power processing systems were compared to evaluation criteria to select the system best suited for modularity. The shunt regulated direct energy transfer system was selected by this analysis for a conceptual design effort which produced equipment specifications, schematics, envelope drawings, and power module configurations.

  2. Scalability and Portability of Two Parallel Implementations of ADI

    NASA Technical Reports Server (NTRS)

    Phung, Thanh; VanderWijngaart, Rob F.

    1994-01-01

    Two domain decompositions for the implementation of the NAS Scalar Penta-diagonal Parallel Benchmark on MIMD systems are investigated, namely transposition and multi-partitioning. Hardware platforms considered are the Intel iPSC/860 and Paragon XP/S-15, and clusters of SGI workstations on ethernet, communicating through PVM. It is found that the multi-partitioning strategy offers the kind of coarse granularity that allows scaling up to hundreds of processors on a massively parallel machine. Moreover, efficiency is retained when the code is ported verbatim (save message passing syntax) to a PVM environment on a modest size cluster of workstations.

  3. Development of the SEASIS instrument for SEDSAT

    NASA Technical Reports Server (NTRS)

    Maier, Mark W.

    1996-01-01

    Two SEASIS experiment objectives are key: take images that allow three-axis attitude determination, and take multi-spectral images of the earth. During the tether mission it is also desirable to capture images of the recoiling tether from the endmass perspective (which has never been observed). SEASIS must store all its imagery taken during the tether mission until the earth downlink can be established. SEASIS determines attitude with a panoramic camera and performs earth observation with a telephoto-lens camera. Camera video is digitized, compressed, and stored in solid-state memory. These objectives are addressed through the following architectural choices: (1) A camera system using a Panoramic Annular Lens (PAL). This lens has a 360-degree azimuthal field of view by a +45-degree vertical field measured from a plane normal to the lens boresight axis. It has been shown in Mark Steadham's UAH M.S. thesis that this camera can determine three-axis attitude anytime the earth and one other recognizable celestial object (for example, the sun) are in the field of view. This will be essentially all the time during tether deployment. (2) A second camera system using a telephoto lens and filter wheel. The camera is a black-and-white standard video camera. The filters are chosen to cover the visible spectral bands of remote-sensing interest. (3) A processor and mass memory arrangement linked to the cameras. Video signals from the cameras are digitized, compressed in the processor, and stored in a large static RAM bank. The processor is a multi-chip module consisting of a T800 Transputer and three Zoran floating-point Digital Signal Processors. This processor module was supplied under ARPA contract by the Space Computer Corporation to demonstrate its use in space.

  4. An Updated Version of the U.S. Air Force Multi-Attribute Task Battery (AF-MATB)

    DTIC Science & Technology

    2014-08-01

    The Air Force Multi-Attribute Task Battery (AF-MATB) is a tool for assessing human performance in a controlled multitask environment. The most recent release of AF-MATB contains numerous improvements and additions, including a specific Information Throughput (IT) Mode designed to customize the task to fit the human operator's performance and multitasking strategy. Keywords: Strategic Behavior, MATB, Multitasking, Task Battery, Simulator, Multi-Attribute Task Battery, Automation.

  5. Technology transfer of military space microprocessor developments

    NASA Astrophysics Data System (ADS)

    Gorden, C.; King, D.; Byington, L.; Lanza, D.

    1999-01-01

    Over the past 13 years the Air Force Research Laboratory (AFRL) has led the development of microprocessors and computers for USAF space and strategic missile applications. As a result of these Air Force development programs, advanced computer technology is available for use by civil and commercial space customers as well. The Generic VHSIC Spaceborne Computer (GVSC) program began in 1985 at AFRL to fulfill a deficiency in the availability of space-qualified data and control processors. GVSC developed a radiation hardened multi-chip version of the 16-bit, Mil-Std 1750A microprocessor. The follow-on to GVSC, the Advanced Spaceborne Computer Module (ASCM) program, was initiated by AFRL to establish two industrial sources for complete, radiation-hardened 16-bit and 32-bit computers and microelectronic components. Development of the Control Processor Module (CPM), the first of two ASCM contract phases, concluded in 1994 with the availability of two sources for space-qualified, 16-bit Mil-Std-1750A computers, cards, multi-chip modules, and integrated circuits. The second phase of the program, the Advanced Technology Insertion Module (ATIM), was completed in December 1997. ATIM developed two single board computers based on 32-bit reduced instruction set computer (RISC) processors. GVSC, CPM, and ATIM technologies are flying or baselined into the majority of today's DoD, NASA, and commercial satellite systems.

  6. The Use of Field Programmable Gate Arrays (FPGA) in Small Satellite Communication Systems

    NASA Technical Reports Server (NTRS)

    Varnavas, Kosta; Sims, William Herbert; Casas, Joseph

    2015-01-01

    This paper will describe the use of digital Field Programmable Gate Arrays (FPGA) to contribute to advancing the state of the art in software-defined radio (SDR) transponder design for the emerging SmallSat and CubeSat industry, and to provide advances for NASA as described in the TA05 Communication and Navigation Roadmap (Ref 4). Software-defined radios have been in use for a long time. A typical SDR implementation uses a processor and software to implement all the functions of filtering, carrier recovery, error correction, framing, etc. Even with modern high-speed, low-power digital signal processors, high-speed memories, and efficient coding, the compute-intensive nature of digital filters, error correction, and other algorithms prevents modern processors from making efficient use of the available bandwidth to the ground. By using FPGAs, these compute-intensive tasks can be done in a parallel, pipelined fashion, using every clock cycle more efficiently to significantly increase throughput while maintaining low power. These methods can implement digital radios with significant data rates in the X and Ka bands. Using these state-of-the-art technologies, unprecedented uplink and downlink capabilities can be achieved in a 1/2U-sized telemetry system. Additionally, modern FPGAs have embedded processing systems, such as ARM cores, integrated inside the FPGA, allowing mundane tasks such as parameter commanding to be handled easily and flexibly. Potential partners include other NASA centers, industry, and the DoD. These assets are associated with small-satellite demonstration flights, LEO, and deep-space applications. MSFC currently has an SDR transponder test-bed using hardware-in-the-loop techniques to evaluate and improve SDR technologies.

  7. Time series modeling of human operator dynamics in manual control tasks

    NASA Technical Reports Server (NTRS)

    Biezad, D. J.; Schmidt, D. K.

    1984-01-01

    A time-series technique is presented for identifying the dynamic characteristics of the human operator in manual control tasks from relatively short records of experimental data. Control of system excitation signals used in the identification is not required. The approach is a multi-channel identification technique for modeling multi-input/multi-output situations. The method presented includes statistical tests for validity, is designed for digital computation, and yields estimates for the frequency responses of the human operator. A comprehensive relative power analysis may also be performed for validated models. This method is applied to several sets of experimental data; the results are discussed and shown to compare favorably with previous research findings. New results are also presented for a multi-input task that has not been previously modeled to demonstrate the strengths of the method.

  8. Time Series Modeling of Human Operator Dynamics in Manual Control Tasks

    NASA Technical Reports Server (NTRS)

    Biezad, D. J.; Schmidt, D. K.

    1984-01-01

    A time-series technique is presented for identifying the dynamic characteristics of the human operator in manual control tasks from relatively short records of experimental data. Control of system excitation signals used in the identification is not required. The approach is a multi-channel identification technique for modeling multi-input/multi-output situations. The method presented includes statistical tests for validity, is designed for digital computation, and yields estimates for the frequency response of the human operator. A comprehensive relative power analysis may also be performed for validated models. This method is applied to several sets of experimental data; the results are discussed and shown to compare favorably with previous research findings. New results are also presented for a multi-input task that was previously modeled to demonstrate the strengths of the method.

  9. Control system of the inspection robots group applying auctions and multi-criteria analysis for task allocation

    NASA Astrophysics Data System (ADS)

    Panfil, Wawrzyniec; Moczulski, Wojciech

    2017-10-01

    Presented in this paper is a control system for a group of mobile robots intended for carrying out inspection missions. The main research problem was to define a control system that facilitates cooperation among the robots, resulting in realization of the committed inspection tasks. Many well-known control systems use auctions for task allocation, where the subject of an auction is a task to be allocated. It seems that in the case of missions characterized by a much larger number of tasks than robots, it is better if robots (instead of tasks) are the subjects of auctions. The second identified problem concerns the one-sided robot-to-task fitness evaluation. Simultaneous assessment of robot-to-task fitness and of a task's attractiveness for a robot should positively affect the overall effectiveness of the multi-robot system's performance. The elaborated system allows tasks to be assigned to robots using various methods for evaluating the fitness between robots and tasks, and using several task allocation methods. A method for multi-criteria analysis is proposed, composed of two assessments: a robot's competitive position for a task among other robots, and a task's attractiveness for a robot among other tasks. Furthermore, task allocation methods applying this multi-criteria analysis are proposed (see the sketch below). The verification of both the elaborated system and the proposed task allocation methods was carried out with the help of simulated experiments. The object under test was a group of inspection mobile robots being a virtual counterpart of a real mobile-robot group.
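
    A minimal sketch of the two-sided idea follows: robots are the auction subjects, and each allocation combines the robot's standing for a task among all robots with the task's attractiveness to that robot among the remaining tasks. The distance-based fitness function and the greedy auction order are hypothetical stand-ins for the paper's criteria.

```python
# Two-sided auction sketch: robots (not tasks) are auctioned, and the
# winning task maximizes concurrency-position x attractiveness. The
# fitness model below is a made-up distance score, not the paper's.

import math

robots = {"r1": (0, 0), "r2": (5, 5)}
tasks = {"t1": (1, 0), "t2": (4, 6), "t3": (9, 1)}

def fitness(rpos, tpos):
    return 1.0 / (1.0 + math.dist(rpos, tpos))

assignments = {}
unassigned = dict(tasks)
for rname, rpos in robots.items():              # auction subject: the robot
    best, best_score = None, -1.0
    for tname, tpos in unassigned.items():
        f = fitness(rpos, tpos)
        # robot's standing for this task among all robots
        concurrency = f / sum(fitness(p, tpos) for p in robots.values())
        # task's attractiveness for this robot among remaining tasks
        attractiveness = f / sum(fitness(rpos, q) for q in unassigned.values())
        score = concurrency * attractiveness
        if score > best_score:
            best, best_score = tname, score
    assignments[rname] = best
    unassigned.pop(best)

print(assignments)   # e.g. {'r1': 't1', 'r2': 't2'}
```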

  10. Power processor for a 30cm ion thruster

    NASA Technical Reports Server (NTRS)

    Biess, J. J.; Inouye, L. Y.

    1974-01-01

    A thermal vacuum power processor for the NASA Lewis 30cm Mercury Ion Engine was designed, fabricated and tested to determine compliance with electrical specifications. The power processor breadboard used the silicon controlled rectifier (SCR) series resonant inverter as the basic power stage to process all the power to an ion engine. The power processor includes a digital interface unit to process all input commands and internal telemetry signals so that operation is compatible with a central computer system. The breadboard was tested in a thermal vacuum environment. Integration tests were performed with the ion engine and demonstrate operational compatibility and reliable operation without any component failures. Electromagnetic interference data were also recorded on the design to provide information on the interaction with total spacecraft.

  11. Production Task Queue Optimization Based on Multi-Attribute Evaluation for Complex Product Assembly Workshop.

    PubMed

    Li, Lian-Hui; Mo, Rong

    2015-01-01

    The production task queue has great significance for manufacturing resource allocation and scheduling decisions. Man-made, qualitative queue-optimization methods perform poorly and are difficult to apply. A production task queue optimization method is proposed based on multi-attribute evaluation. According to the task attributes, a hierarchical multi-attribute model is established and the indicator quantization methods are given. To calculate the objective indicator weights, criteria importance through intercriteria correlation (CRITIC) is selected from three common methods. To calculate the subjective indicator weights, a BP neural network is used to determine the judges' importance degrees, and a trapezoid fuzzy scale-rough AHP that accounts for the judges' importance degrees is then put forward. The balanced weight, which integrates the objective weight and the subjective weight, is calculated based on a multi-weight contribution balance model. The technique for order preference by similarity to an ideal solution (TOPSIS), improved by replacing Euclidean distance with relative entropy distance, is used to sequence the tasks and optimize the queue by the weighted indicator values. A case study is given to illustrate the method's correctness and feasibility.

  12. Production Task Queue Optimization Based on Multi-Attribute Evaluation for Complex Product Assembly Workshop

    PubMed Central

    Li, Lian-hui; Mo, Rong

    2015-01-01

    The production task queue has great significance for manufacturing resource allocation and scheduling decisions. Man-made, qualitative queue-optimization methods perform poorly and are difficult to apply. A production task queue optimization method is proposed based on multi-attribute evaluation. According to the task attributes, a hierarchical multi-attribute model is established and the indicator quantization methods are given. To calculate the objective indicator weights, criteria importance through intercriteria correlation (CRITIC) is selected from three common methods. To calculate the subjective indicator weights, a BP neural network is used to determine the judges' importance degrees, and a trapezoid fuzzy scale-rough AHP that accounts for the judges' importance degrees is then put forward. The balanced weight, which integrates the objective weight and the subjective weight, is calculated based on a multi-weight contribution balance model. The technique for order preference by similarity to an ideal solution (TOPSIS), improved by replacing Euclidean distance with relative entropy distance, is used to sequence the tasks and optimize the queue by the weighted indicator values. A case study is given to illustrate the method's correctness and feasibility. PMID:26414758
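
    The evaluation pipeline shared by the two records above (CRITIC weights feeding a TOPSIS ranking that uses relative-entropy distance) can be sketched numerically. The example below is a simplified, hypothetical instance: the indicator matrix is invented, all indicators are treated as benefit-type, and the BP-network subjective weighting and fuzzy AHP stages are omitted, so only the objective (CRITIC) weights are used.

```python
import numpy as np

# Candidate tasks (rows) scored on quantized indicators (columns); the
# data and indicator meanings are invented for illustration.
X = np.array([[0.70, 5.0, 0.20],
              [0.90, 3.0, 0.40],
              [0.40, 8.0, 0.90],
              [0.60, 6.0, 0.50]])

# Min-max normalization, treating every column as a benefit indicator.
Z = (X - X.min(0)) / (X.max(0) - X.min(0))

# CRITIC objective weights: contrast (std dev) x conflict (1 - correlation).
R = np.corrcoef(Z, rowvar=False)
info = Z.std(0, ddof=1) * (1.0 - R).sum(0)
w = info / info.sum()   # the paper balances these with subjective weights

V = Z * w               # weighted normalized decision matrix
v_pos, v_neg = V.max(0), V.min(0)   # ideal and anti-ideal solutions

def rel_entropy(a, b, eps=1e-9):
    """Binary relative-entropy 'distance' used in place of Euclidean."""
    a = np.clip(a, eps, 1 - eps)
    b = np.clip(b, eps, 1 - eps)
    return np.sum(a * np.log(a / b) + (1 - a) * np.log((1 - a) / (1 - b)))

d_pos = np.array([rel_entropy(v_pos, v) for v in V])   # to the ideal
d_neg = np.array([rel_entropy(v_neg, v) for v in V])   # to the anti-ideal
closeness = d_neg / (d_pos + d_neg)
print("task queue, best first:", np.argsort(-closeness))
```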

  13. Parallel, stochastic measurement of molecular surface area.

    PubMed

    Juba, Derek; Varshney, Amitabh

    2008-08-01

    Biochemists often wish to compute surface areas of proteins. A variety of algorithms have been developed for this task, but they are designed for traditional single-processor architectures. The current trend in computer hardware is towards increasingly parallel architectures for which these algorithms are not well suited. We describe a parallel, stochastic algorithm for molecular surface area computation that maps well to the emerging multi-core architectures. Our algorithm is also progressive, providing a rough estimate of surface area immediately and refining this estimate as time goes on. Furthermore, the algorithm generates points on the molecular surface which can be used for point-based rendering. We demonstrate a GPU implementation of our algorithm and show that it compares favorably with several existing molecular surface computation programs, giving fast estimates of the molecular surface area with good accuracy.
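
    The stochastic, progressive idea can be shown in a few lines: sample points on each atom's sphere, discard those buried inside any other atom, and scale the exposed fraction by the sphere's area. The sketch below is a CPU toy with an invented three-atom "molecule", not the paper's GPU implementation.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "molecule": three atoms with van der Waals radii (angstroms).
centers = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.7, 1.2, 0.0]])
radii = np.array([1.70, 1.52, 1.20])

def sample_sphere(n):
    """Uniform random directions on the unit sphere."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def surface_area_estimate(n_per_atom):
    """Stochastic estimate: the fraction of each sphere's sample points
    not buried inside any other atom, times that sphere's area."""
    total = 0.0
    for i, (c, r) in enumerate(zip(centers, radii)):
        pts = c + r * sample_sphere(n_per_atom)
        buried = np.zeros(n_per_atom, dtype=bool)
        for j, (c2, r2) in enumerate(zip(centers, radii)):
            if j != i:
                buried |= np.linalg.norm(pts - c2, axis=1) < r2
        total += 4.0 * np.pi * r**2 * (~buried).mean()
    return total

# Progressive refinement: a rough answer immediately, better ones later.
for n in (100, 1_000, 10_000):
    print(f"{n:6d} samples/atom -> area ~ {surface_area_estimate(n):.2f} A^2")
```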

  14. HyperForest: A high performance multi-processor architecture for real-time intelligent systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garcia, P. Jr.; Rebeil, J.P.; Pollard, H.

    1997-04-01

    Intelligent Systems are characterized by the intensive use of computer power. The computer revolution of the last few years is what has made possible the development of the first generation of Intelligent Systems. Software for second-generation Intelligent Systems will be more complex and will require more powerful computing engines in order to meet real-time constraints imposed by new robots, sensors, and applications. A multiprocessor architecture was developed that merges the advantages of message-passing and shared-memory structures: expandability and real-time compliance. The HyperForest architecture will provide an expandable real-time computing platform for computationally intensive Intelligent Systems and open the doors for the application of these systems to more complex tasks in environmental restoration and cleanup projects, flexible manufacturing systems, and DOE's own production and disassembly activities.

  15. Multi-Task Learning with Low Rank Attribute Embedding for Multi-Camera Person Re-Identification.

    PubMed

    Su, Chi; Yang, Fan; Zhang, Shiliang; Tian, Qi; Davis, Larry Steven; Gao, Wen

    2018-05-01

    We propose Multi-Task Learning with Low Rank Attribute Embedding (MTL-LORAE) to address the problem of person re-identification across multiple cameras. Re-identifications on different cameras are considered as related tasks, which allows the information shared among different tasks to be explored to improve re-identification accuracy. The MTL-LORAE framework integrates low-level features with mid-level attributes as the descriptions for persons. To improve the accuracy of such descriptions, we introduce the low-rank attribute embedding, which maps original binary attributes into a continuous space utilizing the correlative relationship between each pair of attributes. In this way, inaccurate attributes are rectified and missing attributes are recovered. The resulting objective function is constructed with an attribute embedding error and a quadratic loss concerning class labels. It is solved by an alternating optimization strategy. The proposed MTL-LORAE is tested on four datasets and is validated to outperform the existing methods with significant margins.

  16. A microcomputer based frequency-domain processor for laser Doppler anemometry

    NASA Technical Reports Server (NTRS)

    Horne, W. Clifton; Adair, Desmond

    1988-01-01

    A prototype multi-channel laser Doppler anemometry (LDA) processor was assembled using a wideband transient recorder and a microcomputer with an array processor for fast Fourier transform (FFT) computations. The prototype instrument was used to acquire, process, and record signals from a three-component wind tunnel LDA system subject to various conditions of noise and flow turbulence. The recorded data was used to evaluate the effectiveness of burst acceptance criteria, processing algorithms, and selection of processing parameters such as record length. The recorded signals were also used to obtain comparative estimates of signal-to-noise ratio between time-domain and frequency-domain signal detection schemes. These comparisons show that the FFT processing scheme allows accurate processing of signals for which the signal-to-noise ratio is 10 to 15 dB less than is practical using counter processors.
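
    Frequency-domain burst detection reduces to locating a spectral peak, which is easy to demonstrate. The sketch below synthesizes a noisy Gaussian-envelope Doppler burst and recovers its frequency from the FFT magnitude; the sampling rate, burst parameters, and noise level are all illustrative, not values from the instrument.

```python
import numpy as np

rng = np.random.default_rng(7)

fs = 50e6            # sample rate (Hz); illustrative, not the instrument's
f_true = 3.2e6       # true Doppler frequency of the burst
n = 1024             # record length
t = np.arange(n) / fs

# Gaussian-envelope Doppler burst buried in noise (per-sample SNR < 0 dB).
envelope = np.exp(-(((t - t.mean()) * fs) / (n / 6)) ** 2)
burst = envelope * np.cos(2 * np.pi * f_true * t)
record = burst + 1.5 * rng.normal(size=n)

# Frequency-domain detection: window, transform, pick the spectral peak.
spec = np.abs(np.fft.rfft(record * np.hanning(n)))
freqs = np.fft.rfftfreq(n, d=1 / fs)
f_est = freqs[spec[1:].argmax() + 1]        # skip the DC bin

print(f"true {f_true/1e6:.3f} MHz, estimated {f_est/1e6:.3f} MHz "
      f"(FFT bin width {fs/n/1e3:.1f} kHz)")
```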

  17. HD-MTL: Hierarchical Deep Multi-Task Learning for Large-Scale Visual Recognition.

    PubMed

    Fan, Jianping; Zhao, Tianyi; Kuang, Zhenzhong; Zheng, Yu; Zhang, Ji; Yu, Jun; Peng, Jinye

    2017-02-09

    In this paper, a hierarchical deep multi-task learning (HD-MTL) algorithm is developed to support large-scale visual recognition (e.g., recognizing thousands or even tens of thousands of atomic object classes automatically). First, multiple sets of multi-level deep features are extracted from different layers of deep convolutional neural networks (deep CNNs), and they are used to achieve more effective accomplishment of the coarse-to-fine tasks for hierarchical visual recognition. A visual tree is then learned by assigning the visually-similar atomic object classes with similar learning complexities into the same group, which can provide a good environment for determining the interrelated learning tasks automatically. By leveraging the inter-task relatedness (inter-class similarities) to learn more discriminative group-specific deep representations, our deep multi-task learning algorithm can train more discriminative node classifiers for distinguishing the visually-similar atomic object classes effectively. Our HD-MTL algorithm can integrate two discriminative regularization terms to control inter-level error propagation effectively, and it provides an end-to-end approach for jointly learning more representative deep CNNs (for image representation) and a more discriminative tree classifier (for large-scale visual recognition) and updating them simultaneously. Our incremental deep learning algorithms can effectively adapt both the deep CNNs and the tree classifier to new training images and new object classes. Our experimental results have demonstrated that our HD-MTL algorithm can achieve very competitive results on improving the accuracy rates for large-scale visual recognition.

  18. Multi-robot task allocation based on two dimensional artificial fish swarm algorithm

    NASA Astrophysics Data System (ADS)

    Zheng, Taixiong; Li, Xueqin; Yang, Liangyi

    2007-12-01

    The problem of task allocation for multiple robots is to allocate a relatively large number of tasks among a relatively small number of robots so as to minimize the processing time of these tasks. In order to obtain an optimal multi-robot task allocation scheme, a two-dimensional artificial fish swarm algorithm based approach is proposed in this paper. In this approach, the normal artificial fish is extended to a two-dimensional artificial fish, in which each vector of the primary artificial fish is extended to an m-dimensional vector, so that each vector can express a group of tasks. By redefining the distance between an artificial fish and the center of the artificial fish school, the behaviors of the two-dimensional fish are designed, and a task allocation algorithm based on the two-dimensional artificial fish swarm algorithm is put forward. Finally, the proposed algorithm is applied to the multi-robot task allocation problem and compared with GA- and SA-based algorithms. Simulation and comparison results show that the proposed algorithm is effective.

  19. Implementation of a level 1 trigger system using high speed serial (VXS) techniques for the 12GeV high luminosity experimental programs at Thomas Jefferson National Accelerator Facility

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    C. Cuevas, B. Raydo, H. Dong, A. Gupta, F.J. Barbosa, J. Wilson, W.M. Taylor, E. Jastrzembski, D. Abbott

    We will demonstrate a hardware and firmware solution for a complete, fully pipelined, multi-crate trigger system that takes advantage of the elegant high-speed VXS serial extensions for VME. This trigger system includes three sections, starting with the front-end crate trigger processor (CTP), a global Sub-System Processor (SSP), and a Trigger Supervisor that manages the timing, synchronization, and front-end event readout. Within a front-end crate, trigger information is gathered from each 16-channel, 12-bit flash ADC module at 4 ns intervals via the VXS backplane to a Crate Trigger Processor (CTP). Each Crate Trigger Processor receives these 500 MB/s VXS links from the 16 FADC-250 modules, aligns the skewed data inherent to the Aurora protocol, and performs real-time crate-level trigger algorithms. The algorithm results are encoded using a Reed-Solomon technique, and this Level 1 trigger data is transmitted to the SSP using a multi-fiber link. The multi-fiber link achieves an aggregate trigger data transfer rate to the global trigger of 8 Gb/s. The SSP receives and decodes the Reed-Solomon error-correcting transmission from each crate, aligns the data, and performs the global-level trigger algorithms. The entire trigger system is synchronous and operates at 250 MHz, with the Trigger Supervisor managing not only the front-end event readout but also the distribution of the critical timing clocks, synchronization signals, and global trigger signals to each front-end readout crate. These signals are distributed to the front-end crates on a separate fiber link, and each crate is synchronized using a unique encoding scheme to guarantee that each front-end crate is synchronous with a fixed latency, independent of the distance between crates. The overall trigger signal latency is less than 3 µs, and the proposed 12 GeV experiments at Jefferson Lab require up to a 200 kHz Level 1 trigger rate.

  20. Energy-Efficient Scheduling for Hybrid Tasks in Control Devices for the Internet of Things

    PubMed Central

    Gao, Zhigang; Wu, Yifan; Dai, Guojun; Xia, Haixia

    2012-01-01

    In control devices for the Internet of Things (IoT), energy is one of the critical restriction factors. Dynamic voltage scaling (DVS) has been proven to be an effective method for reducing the energy consumption of processors. This paper proposes an energy-efficient scheduling algorithm for IoT control devices with hard real-time control tasks (HRCTs) and soft real-time tasks (SRTs). The main contribution of this paper includes two parts. First, it builds the Hybrid tasks with multi-subtasks of different function Weight (HoW) task model for IoT control devices. HoW describes the structure of HRCTs and SRTs and their properties, e.g., deadlines, execution times, preemption properties, and energy-saving goals. Second, it presents the Hybrid Tasks' Dynamic Voltage Scaling (HTDVS) algorithm. HTDVS first sets the slowdown factors of subtasks while meeting the different real-time requirements of HRCTs and SRTs, and then dynamically reclaims, reserves, and reuses the slack time of the subtasks to meet their ideal energy-saving goals. Experimental results show that HTDVS can reduce energy consumption by about 10%–80% while meeting the real-time requirements of HRCTs, that HRCTs help to reduce the deadline miss ratio (DMR) of systems, and that HTDVS has performance comparable to the greedy algorithm while keeping subtasks closer to their ideal speeds. PMID:23112659
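
    The energy arithmetic underlying any such DVS scheme is worth making explicit. The sketch below is a back-of-the-envelope illustration, not the HTDVS algorithm itself: it assumes dynamic power scales as V^2*f with V roughly proportional to f, so running a subtask at normalized speed s stretches it by 1/s but cuts its energy to about s^2 of full speed; the subtask set and slack figures are invented.

```python
# Back-of-the-envelope DVS arithmetic. With power ~ V^2 * f and V ~ f,
# executing at normalized speed s (0 < s <= 1) takes wcet/s time at
# power ~ s^3, so energy ~ wcet * s^2. All figures below are invented.

def min_speed(wcet, slack):
    """Lowest speed that still finishes within the available slack."""
    return 1.0 if slack <= 0 else min(1.0, wcet / slack)

subtasks = [                 # (name, WCET at full speed (ms), slack (ms))
    ("control-loop HRCT", 2.0, 2.5),
    ("sensor-filter SRT", 4.0, 10.0),
    ("logging SRT",       1.0,  8.0),
]

e_full = sum(wcet for _, wcet, _ in subtasks)   # energy units at full speed
e_dvs = 0.0
for name, wcet, slack in subtasks:
    s = min_speed(wcet, slack)
    e_dvs += wcet * s * s
    print(f"{name:20s} speed {s:4.2f}  energy {s*s:6.1%} of full")
print(f"total energy: {e_dvs/e_full:.1%} of always-full-speed execution")
```

    With these made-up numbers the schedule uses roughly a quarter of the full-speed energy, consistent in spirit with the 10%–80% savings range reported above.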

  1. The Experimental Mathematician: The Pleasure of Discovery and the Role of Proof

    ERIC Educational Resources Information Center

    Borwein, Jonathan M.

    2005-01-01

    The emergence of powerful mathematical computing environments, the growing availability of correspondingly powerful (multi-processor) computers and the pervasive presence of the Internet allow mathematicians, students and teachers to proceed heuristically and "quasi-inductively." We may increasingly use symbolic and numeric computation,…

  2. Multi-task learning for cross-platform siRNA efficacy prediction: an in-silico study

    PubMed Central

    2010-01-01

    Background Gene silencing using exogenous small interfering RNAs (siRNAs) is now a widespread molecular tool for gene functional study and new-drug target identification. The key mechanism in this technique is to design efficient siRNAs that are incorporated into the RNA-induced silencing complexes (RISC) to bind and interact with the mRNA targets and repress their translation to proteins. Although considerable progress has been made in the computational analysis of siRNA binding efficacy, little joint analysis of different RNAi experiments conducted under different experimental scenarios has been done so far, even though such joint analysis is an important issue in cross-platform siRNA efficacy prediction. A collective analysis of RNAi mechanisms for different datasets and experimental conditions can often provide new clues on the design of potent siRNAs. Results An elegant multi-task learning paradigm for cross-platform siRNA efficacy prediction is proposed. Experimental studies were performed on a large dataset of siRNA sequences encompassing several RNAi experiments recently conducted by different research groups. By using our multi-task learning method, the synergy among different experiments is exploited and an efficient multi-task predictor for siRNA efficacy prediction is obtained. The 19 most popular biological features for siRNA were ranked according to their joint importance in multi-task learning. Furthermore, the hypothesis is validated that the siRNA binding efficacies on different messenger RNAs (mRNAs) have different conditional distributions, so multi-task learning can be conducted by viewing tasks at an "mRNA" level rather than at the "experiment" level. Such distribution diversity, derived from siRNAs bound to different mRNAs, helps indicate that the properties of the target mRNA have important implications for siRNA binding efficacy. Conclusions The knowledge gained from our study provides useful insights on how to analyze various cross-platform RNAi data to uncover their complex mechanisms. PMID:20380733

  3. Multi-task learning for cross-platform siRNA efficacy prediction: an in-silico study.

    PubMed

    Liu, Qi; Xu, Qian; Zheng, Vincent W; Xue, Hong; Cao, Zhiwei; Yang, Qiang

    2010-04-10

    Gene silencing using exogenous small interfering RNAs (siRNAs) is now a widespread molecular tool for gene functional study and new-drug target identification. The key mechanism in this technique is to design efficient siRNAs that are incorporated into the RNA-induced silencing complexes (RISC) to bind and interact with the mRNA targets and repress their translation to proteins. Although considerable progress has been made in the computational analysis of siRNA binding efficacy, little joint analysis of different RNAi experiments conducted under different experimental scenarios has been done so far, even though such joint analysis is an important issue in cross-platform siRNA efficacy prediction. A collective analysis of RNAi mechanisms for different datasets and experimental conditions can often provide new clues on the design of potent siRNAs. An elegant multi-task learning paradigm for cross-platform siRNA efficacy prediction is proposed. Experimental studies were performed on a large dataset of siRNA sequences encompassing several RNAi experiments recently conducted by different research groups. By using our multi-task learning method, the synergy among different experiments is exploited and an efficient multi-task predictor for siRNA efficacy prediction is obtained. The 19 most popular biological features for siRNA were ranked according to their joint importance in multi-task learning. Furthermore, the hypothesis is validated that the siRNA binding efficacies on different messenger RNAs (mRNAs) have different conditional distributions, so multi-task learning can be conducted by viewing tasks at an "mRNA" level rather than at the "experiment" level. Such distribution diversity, derived from siRNAs bound to different mRNAs, helps indicate that the properties of the target mRNA have important implications for siRNA binding efficacy. The knowledge gained from our study provides useful insights on how to analyze various cross-platform RNAi data to uncover their complex mechanisms.

  4. Artificial intelligence for multi-mission planetary operations

    NASA Technical Reports Server (NTRS)

    Atkinson, David J.; Lawson, Denise L.; James, Mark L.

    1990-01-01

    A brief introduction is given to an automated system called the Spacecraft Health Automated Reasoning Prototype (SHARP). SHARP is designed to demonstrate automated health and status analysis for multi-mission spacecraft and ground data systems operations. The SHARP system combines conventional computer science methodologies with artificial intelligence techniques to produce an effective method for detecting and analyzing potential spacecraft and ground systems problems. The system performs real-time analysis of spacecraft and other related telemetry, and is also capable of examining data in historical context. Telecommunications link analysis of the Voyager II spacecraft is the initial focus for evaluation of the prototype in a real-time operations setting during the Voyager spacecraft encounter with Neptune in August, 1989. The preliminary results of the SHARP project and plans for future application of the technology are discussed.

  5. Multi-agent cooperation rescue algorithm based on influence degree and state prediction

    NASA Astrophysics Data System (ADS)

    Zheng, Yanbin; Ma, Guangfu; Wang, Linlin; Xi, Pengxue

    2018-04-01

    Aiming at multi-agent cooperative rescue in disasters, a multi-agent cooperative rescue algorithm based on influence degree and state prediction is proposed. Firstly, based on the influence that information in the scene has on the collaborative task, an influence degree function is used to filter the information. Secondly, the selected information is used to predict the state of the system and the agents' behavior. Finally, according to the prediction results, the cooperative behavior of the agents is guided, improving the efficiency of individual collaboration. The simulation results show that this algorithm can effectively solve the multi-agent cooperative rescue problem and ensure efficient completion of the task.

  6. A new multi-spectral feature level image fusion method for human interpretation

    NASA Astrophysics Data System (ADS)

    Leviner, Marom; Maltz, Masha

    2009-03-01

    Various methods to perform multi-spectral image fusion have been suggested, mostly at the pixel level. However, the jury is still out on the benefits of a fused image compared to its source images. We present here a new multi-spectral image fusion method, multi-spectral segmentation fusion (MSSF), which uses a feature-level processing paradigm. To test our method, we compared human observer performance in a three-task experiment using MSSF against two established methods, averaging and principal component analysis (PCA), and against its two source bands, visible and infrared. The three tasks that we studied were: (1) simple target detection, (2) spatial orientation, and (3) camouflaged target detection. MSSF proved superior to the other fusion methods in all three tests; MSSF also outperformed the source images in the spatial orientation and camouflaged target detection tasks. Based on these findings, current speculation about the circumstances in which multi-spectral image fusion in general, and specific fusion methods in particular, would be superior to using the original image sources can be further addressed.

  7. FPGA for Power Control of MSL Avionics

    NASA Technical Reports Server (NTRS)

    Wang, Duo; Burke, Gary R.

    2011-01-01

    A PLGT FPGA (Field Programmable Gate Array) is included in the LCC (Load Control Card), GID (Guidance Interface & Drivers), TMC (Telemetry Multiplexer Card), and PFC (Pyro Firing Card) boards of the Mars Science Laboratory (MSL) spacecraft. (PLGT stands for PFC, LCC, GID, and TMC.) It provides the interface between the backside bus and the power drivers on these boards. The LCC drives power switches to switch power loads, and also relays. The GID drives the thrusters and latch valves, and provides the star-tracker and Sun-sensor interface. The PFC drives pyros, and the TMC receives digital and analog telemetry. The FPGA is implemented both in Xilinx (Spartan 3-400) and in Actel (RTSX72SU, ASX72S) parts. The Xilinx Spartan 3 part is used for the breadboard, the Actel ASX part is used for the EM (Engineering Model), and the pin-compatible, radiation-hardened RTSX part is used for the final EM and flight. The MSL spacecraft uses an FC (Flight Computer) to control power loads, relays, thrusters, latch valves, the Sun-sensor, and the star-tracker, and to read telemetry such as temperature. Commands are sent over a 1553 bus to the MREU (Multi-Mission System Architecture Platform Remote Engineering Unit). The MREU resends them over a remote serial command bus (c-bus) to the LCC, GID, TMC, and PFC. The MREU also sends out telemetry addresses via a remote serial telemetry address bus to the LCC, GID, TMC, and PFC, and the status is returned over the remote serial telemetry data bus.

  8. A queueing model of pilot decision making in a multi-task flight management situation

    NASA Technical Reports Server (NTRS)

    Walden, R. S.; Rouse, W. B.

    1977-01-01

    Allocation of decision making responsibility between pilot and computer is considered and a flight management task, designed for the study of pilot-computer interaction, is discussed. A queueing theory model of pilot decision making in this multi-task, control and monitoring situation is presented. An experimental investigation of pilot decision making and the resulting model parameters are discussed.
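
    To illustrate the kind of arithmetic such a queueing model supports (without reproducing the paper's fitted parameters), the sketch below applies the standard M/M/1 formulas to a single-server abstraction of the pilot: tasks arrive at rate lambda and are serviced at rate mu, giving utilization and waiting times. All rates are hypothetical.

```python
# Standard M/M/1 queueing formulas applied to a single-server abstraction
# of the pilot. These are textbook results, not the paper's fitted model.

def mm1(lam, mu):
    rho = lam / mu                      # fraction of time the pilot is busy
    if rho >= 1:
        return rho, float("inf"), float("inf")   # unstable: queue grows
    wq = rho / (mu - lam)               # mean wait before service (s)
    w = 1 / (mu - lam)                  # mean time in system (s)
    return rho, wq, w

for lam in (0.5, 1.0, 1.5):             # task arrival rates (tasks/s)
    rho, wq, w = mm1(lam, mu=2.0)       # pilot handles 2 tasks/s on average
    print(f"arrivals {lam:.1f}/s: utilization {rho:.0%}, "
          f"wait {wq:.2f} s, total delay {w:.2f} s")
```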

  9. REACH: Real-Time Data Awareness in Multi-Spacecraft Missions

    NASA Technical Reports Server (NTRS)

    Maks, Lori; Coleman, Jason; Obenschain, Arthur F. (Technical Monitor)

    2002-01-01

    Missions have been proposed that will use multiple spacecraft to perform scientific or commercial tasks. Indeed, in the commercial world, some spacecraft constellations already exist. Aside from the technical challenges of constructing and flying these missions, there is also the financial challenge presented by the traditional model of the flight operations team (FOT) when it is applied to a constellation mission. Proposed constellation missions range in size from three spacecraft to more than 50. If the current ratio of three-to-five FOT personnel per spacecraft is maintained, the size of the FOT becomes cost prohibitive. The Advanced Architectures and Automation Branch at the Goddard Space Flight Center (GSFC Code 588) saw the potential to reduce the cost of these missions by creating new user interfaces to the ground system health-and-safety data. The goal is to enable a smaller FOT to remain aware of, and responsive to, the increased amount of ground system information in a multi-spacecraft environment. Rather than abandon the tried and true, these interfaces were developed to run alongside existing ground system software to provide additional support to the FOT. These new user interfaces have been combined in a tool called REACH. REACH, the Real-time Evaluation and Analysis of Consolidated Health, is a software product that uses advanced visualization techniques to make spacecraft anomalies easy to spot, no matter how many spacecraft are in the constellation. REACH reads a real-time stream of data from the ground system and displays it to the FOT such that anomalies are easy to pick out and investigate. Data visualization has been used in ground system operations for many years. To provide a unique visualization tool, we developed a unique source of data to visualize: the REACH Health Model Engine. The Health Model Engine is rule-based software that receives real-time telemetry and outputs "health" information for the subsystems and spacecraft that the telemetry belongs to. The Health Engine can run out of the box or can be tailored with a scripting language. Out of the box, it uses limit violations to determine the health of subsystems and spacecraft; when tailored, it determines health using equations combining the values and limits of any telemetry in the spacecraft. The REACH visualizations then "roll up" the information from the Health Engine into high-level summary displays. These summary visualizations can be "zoomed" into for increasing levels of detail. Currently REACH is installed in the Small Explorer (SMEX) lab at GSFC and is monitoring three of their five spacecraft. We are scheduled to install REACH in the Mid-sized Explorer (MIDEX) lab, which will allow us to monitor up to six more spacecraft. The process of installing and using our "research" software in an operational environment has provided many insights into which parts of REACH are a step forward and which of our ideas are missteps. Our paper explores the new concepts in spacecraft health-and-safety visualization, the difficulties of such systems in the operational environment, and the cost and safety issues of multi-spacecraft missions.
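
    The out-of-the-box limit-violation behavior described above amounts to a simple "worst status wins" rollup from telemetry points to subsystems to the spacecraft. The sketch below is a hypothetical reconstruction of that rule; the mnemonics, limit sets, and sample values are invented, not from any mission database.

```python
# Minimal sketch of a limit-violation "health rollup". Mnemonics, limits
# and sample values are hypothetical, not from any real mission database.

LIMITS = {   # mnemonic: (red_low, yellow_low, yellow_high, red_high)
    "BATT_V":   (24.0, 26.0, 34.0, 36.0),
    "BATT_T":   (-10.0, 0.0, 40.0, 50.0),
    "GYRO_SPD": (900.0, 950.0, 1050.0, 1100.0),
}
SUBSYSTEMS = {"power": ["BATT_V", "BATT_T"], "acs": ["GYRO_SPD"]}
RANK = {"green": 0, "yellow": 1, "red": 2}

def point_health(mnemonic, value):
    rl, yl, yh, rh = LIMITS[mnemonic]
    if value < rl or value > rh:
        return "red"
    if value < yl or value > yh:
        return "yellow"
    return "green"

def rollup(sample):
    """Roll telemetry health up: point -> subsystem -> spacecraft."""
    sub = {s: max((point_health(m, sample[m]) for m in pts), key=RANK.get)
           for s, pts in SUBSYSTEMS.items()}
    craft = max(sub.values(), key=RANK.get)
    return sub, craft

sub, craft = rollup({"BATT_V": 25.1, "BATT_T": 21.0, "GYRO_SPD": 1012.0})
print(sub, "-> spacecraft:", craft)   # power yellow (low bus voltage)
```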

  10. Parallel Ada benchmarks for the SVMS

    NASA Technical Reports Server (NTRS)

    Collard, Philippe E.

    1990-01-01

    The use of the parallel processing paradigm to design and develop faster and more reliable computers appears to clearly mark the future of information processing. NASA started the development of such an architecture: the Spaceborne VHSIC Multi-processor System (SVMS). Ada will be one of the languages used to program the SVMS. One of the unique characteristics of Ada is that it supports parallel processing at the language level through its tasking constructs. It is important for the SVMS project team to assess how efficiently the SVMS architecture can be implemented, as well as how efficiently the Ada environment can be ported to the SVMS. AUTOCLASS II, a Bayesian classifier written in Common Lisp, was selected as one of the benchmarks for SVMS configurations. The purpose of the R&D effort was to provide the SVMS project team with a version of AUTOCLASS II, written in Ada, that would make use of Ada tasking constructs as much as possible so as to constitute a suitable benchmark. Additionally, a set of programs was developed to measure Ada tasking efficiency on parallel architectures and to determine the critical parameters influencing tasking efficiency. All this was designed to provide the SVMS project team with a set of suitable tools for the development of the SVMS architecture.

  11. Landsat-1 and Landsat-2 evaluation report, 23 January 1975 to 23 April 1975

    NASA Technical Reports Server (NTRS)

    1975-01-01

    A description of the work accomplished with the Landsat-1 and Landsat-2 satellites during the period 23 Jan. - 23 Apr. 1975 was presented. The following information was given for each satellite: operational summary, orbital parameters, power subsystem, attitude control subsystem, command/clock subsystem, telemetry subsystem, orbit adjust subsystem, magnetic moment compensating assembly, unified S-band/premodulation processor, electrical interface subsystem, thermal subsystem, narrowband tape recorders, wideband telemetry subsystem, attitude measurement sensor, wideband video tape recorders, return beam vidicon, multispectral scanner subsystem, and data collection subsystem.

  12. Multi-level manual and autonomous control superposition for intelligent telerobot

    NASA Technical Reports Server (NTRS)

    Hirai, Shigeoki; Sato, T.

    1989-01-01

    Space telerobots are recognized to require cooperation with human operators in various ways. Multi-level manual and autonomous control superposition in telerobot task execution is described. The object model, the structured master-slave manipulation system, and the motion understanding system are proposed to realize the concept. The object model offers interfaces for task level and object level human intervention. The structured master-slave manipulation system offers interfaces for motion level human intervention. The motion understanding system maintains the consistency of the knowledge through all the levels which supports the robot autonomy while accepting the human intervention. The superposing execution of the teleoperational task at multi-levels realizes intuitive and robust task execution for wide variety of objects and in changeful environment. The performance of several examples of operating chemical apparatuses is shown.

  13. Multi-AUV autonomous task planning based on the scroll time domain quantum bee colony optimization algorithm in uncertain environment

    PubMed Central

    Zhang, Rubo; Yang, Yu

    2017-01-01

    This paper studies a distributed task planning model for multi-autonomous underwater vehicles (MAUV). A scroll time domain quantum artificial bee colony (STDQABC) optimization algorithm is proposed to solve the multi-AUV optimal task planning problem. In the uncertain marine environment, the rolling time domain control technique is used to realize a numerical optimization over a narrowed time range. Rolling time domain control is one of the better task planning techniques, as it can greatly reduce the computational workload and realize the tradeoff between AUV dynamics, environment and cost. Finally, a simulation experiment was performed to evaluate the distributed task planning performance of the scroll time domain quantum bee colony optimization algorithm. The simulation results demonstrate that the STDQABC algorithm converges faster than the QABC and ABC algorithms in terms of both iterations and running time. The STDQABC algorithm can effectively improve MAUV distributed task planning performance, complete the task goal and obtain an approximately optimal solution. PMID:29186166

  14. Multi-AUV autonomous task planning based on the scroll time domain quantum bee colony optimization algorithm in uncertain environment.

    PubMed

    Li, Jianjun; Zhang, Rubo; Yang, Yu

    2017-01-01

    This paper studies a distributed task planning model for multi-autonomous underwater vehicles (MAUV). A scroll time domain quantum artificial bee colony (STDQABC) optimization algorithm is proposed to solve the multi-AUV optimal task planning problem. In the uncertain marine environment, the rolling time domain control technique is used to realize a numerical optimization over a narrowed time range. Rolling time domain control is one of the better task planning techniques, as it can greatly reduce the computational workload and realize the tradeoff between AUV dynamics, environment and cost. Finally, a simulation experiment was performed to evaluate the distributed task planning performance of the scroll time domain quantum bee colony optimization algorithm. The simulation results demonstrate that the STDQABC algorithm converges faster than the QABC and ABC algorithms in terms of both iterations and running time. The STDQABC algorithm can effectively improve MAUV distributed task planning performance, complete the task goal and obtain an approximately optimal solution.
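
    The two records above do not spell out the STDQABC update rules; for orientation, a minimal plain artificial bee colony (ABC) minimizer, the baseline the authors compare against, might look like the sketch below (the onlooker phase is folded into the employed phase for brevity, and all parameters are illustrative):

        import random

        def abc_minimize(f, dim, bounds, n_food=20, limit=30, iters=200):
            """Minimal artificial bee colony sketch: perturb food sources, scout when stale."""
            lo, hi = bounds
            foods = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_food)]
            trials = [0] * n_food
            best = min(foods, key=f)
            for _ in range(iters):
                for i in range(n_food):
                    # Perturb one coordinate toward a randomly chosen peer source.
                    j = random.randrange(dim)
                    k = random.choice([x for x in range(n_food) if x != i])
                    cand = foods[i][:]
                    cand[j] += random.uniform(-1, 1) * (foods[i][j] - foods[k][j])
                    cand[j] = min(max(cand[j], lo), hi)
                    if f(cand) < f(foods[i]):
                        foods[i], trials[i] = cand, 0
                    else:
                        trials[i] += 1
                # Scout phase: abandon sources that stopped improving.
                for i in range(n_food):
                    if trials[i] > limit:
                        foods[i] = [random.uniform(lo, hi) for _ in range(dim)]
                        trials[i] = 0
                best = min([best] + foods, key=f)
            return best

        print(abc_minimize(lambda x: sum(v * v for v in x), dim=3, bounds=(-5, 5)))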

  15. Long-term memory-based control of attention in multi-step tasks requires working memory: evidence from domain-specific interference

    PubMed Central

    Foerster, Rebecca M.; Carbone, Elena; Schneider, Werner X.

    2014-01-01

    Evidence for long-term memory (LTM)-based control of attention has been found during the execution of highly practiced multi-step tasks. However, does LTM control attention directly, or are working memory (WM) processes involved? In the present study, this question was investigated with a dual-task paradigm. Participants executed either a highly practiced visuospatial sensorimotor task (speed stacking) or a verbal task (high-speed poem reciting), while maintaining visuospatial or verbal information in WM. Results revealed unidirectional and domain-specific interference. Neither speed stacking nor high-speed poem reciting was influenced by WM retention. Stacking disrupted the retention of visuospatial locations, but did not modify memory performance of verbal material (letters). Reciting reduced the retention of verbal material substantially whereas it affected the memory performance of visuospatial locations to a smaller degree. We suggest that the selection of task-relevant information from LTM for the execution of overlearned multi-step tasks recruits domain-specific WM. PMID:24847304

  16. Managing coherence via put/get windows

    DOEpatents

    Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY

    2011-01-11

    A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message-passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activity required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

  17. Managing coherence via put/get windows

    DOEpatents

    Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY

    2012-02-21

    A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message-passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activity required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

  18. Software defined radio (SDR) architecture for concurrent multi-satellite communications

    NASA Astrophysics Data System (ADS)

    Maheshwarappa, Mamatha R.

    SDRs have emerged as a viable approach for space communications over the last decade by delivering low-cost hardware and flexible software solutions. The flexibility introduced by the SDR concept not only allows the realisation of multiple concurrent standards on one platform, but also promises to ease the implementation of one communication standard on differing SDR platforms by signal porting. This technology would facilitate implementing reconfigurable nodes for parallel satellite reception in Mobile/Deployable Ground Segments and Distributed Satellite Systems (DSS) for amateur radio/university satellite operations. This work outlines the recent advances in embedded technologies that can enable new communication architectures for concurrent multi-satellite or satellite-to-ground missions, which involve multi-link challenges. This research proposes a novel concept to run advanced parallelised SDR back-end technologies on a Commercial-Off-The-Shelf (COTS) embedded system that can support multi-signal processing for multi-satellite scenarios simultaneously. The initial SDR implementation could support only one receiver chain due to system saturation. However, the design was optimised to facilitate multiple signals within the limited resources available on an embedded system at any given time. This was achieved by providing a VHDL solution alongside the existing Python and C/C++ code, together with parallelisation, so as to accelerate performance whilst maintaining flexibility. The performance improvement was validated at every stage through profiling. Various cases of concurrent multiple signals with different parameters, such as frequency (with Doppler effect) and symbol rate, were simulated in order to validate the novel architecture proposed in this research. Also, the architecture allows the system to be reconfigured by changing the communication standards in soft real time. The chosen COTS solution provides a generic software methodology for both ground and space applications that will remain unaltered despite new evolutions in hardware, and supports concurrent multi-standard, multi-channel and multi-rate telemetry signals.
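
    On a general-purpose host, the idea of concurrent per-satellite receiver chains can be roughed out with one process per channel, as in this sketch; the channel offsets, sample rate, and the demodulation stub are invented for illustration (the thesis's actual design is VHDL on an embedded platform):

        import numpy as np
        from multiprocessing import Pool

        def receive_channel(args):
            """One per-satellite receiver chain: mix down, then a stand-in demodulator."""
            samples, f_offset_hz, fs = args
            t = np.arange(len(samples)) / fs
            baseband = samples * np.exp(-2j * np.pi * f_offset_hz * t)  # carrier/Doppler removal
            return np.sign(baseband.real[::8])   # placeholder for filtering + symbol decisions

        if __name__ == "__main__":
            fs = 1.0e6
            wideband = np.random.randn(80000) + 1j * np.random.randn(80000)
            channels = [(wideband, f, fs) for f in (25e3, 90e3, 140e3)]  # 3 concurrent downlinks
            with Pool(3) as pool:                     # one chain per worker process
                symbol_streams = pool.map(receive_channel, channels)
            print([s[:5] for s in symbol_streams])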

  19. Design of multi-function sensor detection system in coal mine based on ARM

    NASA Astrophysics Data System (ADS)

    Ge, Yan-Xiang; Zhang, Quan-Zhu; Deng, Yong-Hong

    2017-06-01

    A traditional coal mine sensor has a fixed number and type of channels at each measurement point, so the channel count may exceed or fall short of the number of monitoring points, wasting resources or failing to meet application requirements. To enable the sensor to adapt to the needs of different settings and to reduce cost, a multi-functional intelligent sensor was designed around multiple sensing elements and the ARM11 S3C6410 processor. It combines dust, gas, temperature and humidity sensing functions, and adds storage, display, voice, picture, data query, alarm and other new functions.

  20. Techniques in processing multi-frequency multi-polarization spaceborne SAR data

    NASA Technical Reports Server (NTRS)

    Curlander, John C.; Chang, C. Y.

    1991-01-01

    This paper presents the algorithm design of the SIR-C ground data processor, with emphasis on the unique elements involved in the production of registered multifrequency polarimetric data products. A quick-look processing algorithm used for generation of low-resolution browse image products and estimation of echo signal parameters is also presented. Specifically the discussion covers: (1) azimuth reference function generation to produce registered polarimetric imagery; (2) geometric rectification to accommodate cross-track and along-track Doppler drifts; (3) multilook filtering designed to generate output imagery with a uniform resolution; and (4) efficient coding to compress the polarimetric image data for distribution.

  1. ROOT 6 and beyond: TObject, C++14 and many cores

    DOE PAGES

    Bellenot, B.; Canal, Ph; Couet, O.; ...

    2015-12-23

    Following the release of version 6, ROOT has entered a new era of development. It will leverage the industrial-strength compiler library shipping in ROOT 6 and its support of the C++11/14 standard to significantly simplify and harden ROOT's interfaces and to clarify and substantially improve ROOT's support for multi-threaded environments. Furthermore, this talk will also recap the most important new features and enhancements in ROOT in general, focusing on those allowed by the improved interpreter and better compiler support, including I/O for smart pointers, easier type-safe access to the content of TTrees and enhanced multi-processor support.

  2. SEAWAT: A Computer Program for Simulation of Variable-Density Groundwater Flow and Multi-Species Solute and Heat Transport

    USGS Publications Warehouse

    Langevin, Christian D.

    2009-01-01

    SEAWAT is a MODFLOW-based computer program designed to simulate variable-density groundwater flow coupled with multi-species solute and heat transport. The program has been used for a wide variety of groundwater studies including saltwater intrusion in coastal aquifers, aquifer storage and recovery in brackish limestone aquifers, and brine migration within continental aquifers. SEAWAT is relatively easy to apply because it uses the familiar MODFLOW structure. Thus, most commonly used pre- and post-processors can be used to create datasets and visualize results. SEAWAT is a public domain computer program distributed free of charge by the U.S. Geological Survey.

  3. Mcqueuer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2016-09-12

    Mcqueuer is a simple tool that allows anyone from researchers to experienced developers to create multi-node/multi-core jobs by simply creating a file with a list of commands. Users simply combine tasks, which would otherwise each be their own job on the cluster, into a single file that is given to Mcqueuer. Mcqueuer then does the heavy lifting required to process the tasks in parallel in a single multi-node job. In addition, Mcqueuer provides load balancing, which frees the user from complex memory and CPU considerations and lets them focus on the processing itself.

  4. Goal-oriented robot navigation learning using a multi-scale space representation.

    PubMed

    Llofriu, M; Tejera, G; Contreras, M; Pelc, T; Fellous, J M; Weitzenfeld, A

    2015-12-01

    There has been extensive research in recent years on the multi-scale nature of hippocampal place cells and entorhinal grid cells encoding which led to many speculations on their role in spatial cognition. In this paper we focus on the multi-scale nature of place cells and how they contribute to faster learning during goal-oriented navigation when compared to a spatial cognition system composed of single scale place cells. The task consists of a circular arena with a fixed goal location, in which a robot is trained to find the shortest path to the goal after a number of learning trials. Synaptic connections are modified using a reinforcement learning paradigm adapted to the place cells multi-scale architecture. The model is evaluated in both simulation and physical robots. We find that larger scale and combined multi-scale representations favor goal-oriented navigation task learning. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. A theoretical framework for negotiating the path of emergency management multi-agency coordination.

    PubMed

    Curnin, Steven; Owen, Christine; Paton, Douglas; Brooks, Benjamin

    2015-03-01

    Multi-agency coordination represents a significant challenge in emergency management. The need for liaison officers working in strategic level emergency operations centres to play organizational boundary spanning roles within multi-agency coordination arrangements that are enacted in complex and dynamic emergency response scenarios creates significant research and practical challenges. The aim of the paper is to address a gap in the literature regarding the concept of multi-agency coordination from a human-environment interaction perspective. We present a theoretical framework for facilitating multi-agency coordination in emergency management that is grounded in human factors and ergonomics using the methodology of core-task analysis. As a result we believe the framework will enable liaison officers to cope more efficiently within the work domain. In addition, we provide suggestions for extending the theory of core-task analysis to an alternate high reliability environment. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  6. Application of convolve-multiply-convolve SAW processor for satellite communications

    NASA Technical Reports Server (NTRS)

    Lie, Y. S.; Ching, M.

    1991-01-01

    There is a need for a satellite communications receiver that can perform simultaneous multi-channel processing of single channel per carrier (SCPC) signals originating from various small (mobile or fixed) earth stations. The number of ground users can be as many as 1000. The conventional technique for simultaneously processing these signals employs as many RF bandpass filters as there are channels. Consequently, such an approach would result in a bulky receiver, which becomes impractical for satellite applications. A unique approach utilizing a real-time surface acoustic wave (SAW) chirp transform processor is presented. The application of a Convolve-Multiply-Convolve (CMC) chirp transform processor is described. The CMC processor transforms each input channel into a unique timeslot while preserving its modulation content (in this case QPSK). Subsequently, each channel is individually demodulated without the need for input channel filters. Circuit complexity is significantly reduced, because the output frequency of the CMC processor is common for all input channel frequencies. Theoretical analysis and experimental results are in good agreement.
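
    The abstract does not give the CMC mathematics; as background, the closely related multiply-convolve-multiply chirp-transform identity (Bluestein's identity), in which chirp stages turn a convolution into a Fourier transform, can be verified numerically:

        import numpy as np

        def mcm_chirp_transform(x):
            """Chirp transform via Bluestein's identity: multiply by a chirp,
            convolve with the opposite chirp, then multiply by a chirp again."""
            n = len(x)
            k = np.arange(n)
            a = x * np.exp(-1j * np.pi * k**2 / n)            # first multiply stage
            j = np.arange(2 * n - 1) - (n - 1)
            b = np.exp(1j * np.pi * j**2 / n)                 # convolution (filter) stage
            conv = np.convolve(a, b)
            return np.exp(-1j * np.pi * k**2 / n) * conv[n - 1:2 * n - 1]  # final multiply

        x = np.random.randn(256) + 1j * np.random.randn(256)
        print(np.allclose(mcm_chirp_transform(x), np.fft.fft(x)))  # True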

  7. Design and realization of the baseband processor in satellite navigation and positioning receiver

    NASA Astrophysics Data System (ADS)

    Zhang, Dawei; Hu, Xiulin; Li, Chen

    2007-11-01

    This paper focuses on the design and realization of the baseband processor in a satellite navigation and positioning receiver. The baseband processor is the most important part of the satellite positioning receiver. The design covers the baseband processor's main functions, including multi-channel digital down-conversion (DDC), acquisition, code tracking, carrier tracking, and demodulation. The realization is based on an Altera FPGA device, which allows the system to be improved and upgraded without modifying the hardware. It embodies the theory of software defined radio (SDR) and puts spread spectrum theory into practice. The paper puts emphasis on the realization of the baseband processor in the FPGA, presenting the flow in detail in the order of chip selection, design entry, debugging, and synthesis. Additionally, the paper details the realization of the digital PLL in order to explain a method of reducing FPGA resource consumption. Finally, the paper presents the synthesis results. This design has been used in BD-1, BD-2 and GPS.

  8. Adaptive Allocation of Decision Making Responsibility Between Human and Computer in Multi-Task Situations. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Chu, Y. Y.

    1978-01-01

    A unified formulation of computer-aided, multi-task decision making is presented. A strategy for the allocation of decision making responsibility between human and computer is developed. The plans of a flight management system are studied. A model based on queueing theory was implemented.

  9. Continuous Video Modeling to Prompt Completion of Multi-Component Tasks by Adults with Moderate Intellectual Disability

    ERIC Educational Resources Information Center

    Mechling, Linda C.; Ayres, Kevin M.; Purrazzella, Kaitlin; Purrazzella, Kimberly

    2014-01-01

    This investigation examined the ability of four adults with moderate intellectual disability to complete multi-component tasks using continuous video modeling. Continuous video modeling, which is a newly researched application of video modeling, presents video in a "looping" format which automatically repeats playing of the video while…

  10. Data Mining Algorithms for Classification of Complex Biomedical Data

    ERIC Educational Resources Information Center

    Lan, Liang

    2012-01-01

    In my dissertation, I will present my research which contributes to solve the following three open problems from biomedical informatics: (1) Multi-task approaches for microarray classification; (2) Multi-label classification of gene and protein prediction from multi-source biological data; (3) Spatial scan for movement data. In microarray…

  11. RoboCup: Multi-disciplinary Senior Design Project.

    ERIC Educational Resources Information Center

    Elder, Kevin Lee

    A cross-college team of educators has developed a collaborative, multi-disciplinary senior design course at Ohio University. This course offers an attractive opportunity for students from a variety of disciplines to work together in a learning community to accomplish a challenging task. It provides a novel multi-disciplinary learning environment…

  12. Improved sensitivity and specificity for resting state and task fMRI with multiband multi-echo EPI compared to multi-echo EPI at 7 T.

    PubMed

    Boyacioğlu, Rasim; Schulz, Jenni; Koopmans, Peter J; Barth, Markus; Norris, David G

    2015-10-01

    A multiband multi-echo (MBME) sequence is implemented and compared to a matched standard multi-echo (ME) protocol to investigate the potential improvement in sensitivity and spatial specificity at 7 T for resting state and task fMRI. ME acquisition is attractive because BOLD sensitivity is less affected by variation in T2*, and because of the potential for separating BOLD and non-BOLD signal components. MBME further reduces TR thus increasing the potential reduction in physiological noise. In this study we used FSL-FIX to clean ME and MBME resting state and task fMRI data (both 3.5mm isotropic). After noise correction, the detection of resting state networks improves with more non-artifactual independent components being observed. Additional activation clusters for task data are discovered for MBME data (increased sensitivity) whereas existing clusters become more localized for resting state (improved spatial specificity). The results obtained indicate that MBME is superior to ME at high field strengths. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Problems With Deployment of Multi-Domained, Multi-Homed Mobile Networks

    NASA Technical Reports Server (NTRS)

    Ivancic, William D.

    2008-01-01

    This document describes numerous problems associated with deployment of multi-homed mobile platforms consisting of multiple networks and traversing large geographical areas. The purpose of this document is to provide insight into real-world deployment issues and provide information to groups that are addressing many issues related to multi-homing, policy-based routing, route optimization and mobile security - particularly those groups within the Internet Engineering Task Force.

  14. Experiments with a Parallel Multi-Objective Evolutionary Algorithm for Scheduling

    NASA Technical Reports Server (NTRS)

    Brown, Matthew; Johnston, Mark D.

    2013-01-01

    Evolutionary multi-objective algorithms have great potential for scheduling in those situations where tradeoffs among competing objectives represent a key requirement. One challenge, however, is runtime performance, as a consequence of evolving not just a single schedule, but an entire population, while attempting to sample the Pareto frontier as accurately and uniformly as possible. The growing availability of multi-core processors in end user workstations, and even laptops, has raised the question of the extent to which such hardware can be used to speed up evolutionary algorithms. In this paper we report on early experiments in parallelizing a Generalized Differential Evolution (GDE) algorithm for scheduling long-range activities on NASA's Deep Space Network. Initial results show that significant speedups can be achieved, but that performance does not necessarily improve as more cores are utilized. We describe our preliminary results and some initial insights gained from parallelizing the GDE algorithm. Directions for future work are outlined.
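
    The paper's GDE internals are not reproduced in the abstract; a common first step, shown below as a sketch, is to parallelize only the population's fitness evaluations across cores (the objective function and all parameters are invented):

        import numpy as np
        from multiprocessing import Pool

        def fitness(v):
            return float(np.sum(v * v))   # stand-in scheduling objective

        def de_step(pop, scores, pool, f=0.8, cr=0.9):
            """One differential-evolution generation with parallel fitness evaluation."""
            n, d = pop.shape
            trials = np.empty_like(pop)
            for i in range(n):
                a, b, c = pop[np.random.choice(
                    [x for x in range(n) if x != i], 3, replace=False)]
                mutant = a + f * (b - c)
                cross = np.random.rand(d) < cr
                cross[np.random.randint(d)] = True
                trials[i] = np.where(cross, mutant, pop[i])
            trial_scores = pool.map(fitness, list(trials))   # evaluated in parallel
            for i in range(n):
                if trial_scores[i] < scores[i]:
                    pop[i], scores[i] = trials[i], trial_scores[i]
            return pop, scores

        if __name__ == "__main__":
            pop = np.random.uniform(-5, 5, (32, 10))
            with Pool() as pool:
                scores = pool.map(fitness, list(pop))
                for _ in range(50):
                    pop, scores = de_step(pop, scores, pool)
            print(min(scores))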

  15. MREG V1.1 : a multi-scale image registration algorithm for SAR applications.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eichel, Paul H.

    2013-08-01

    MREG V1.1 is the sixth generation SAR image registration algorithm developed by the Signal Processing & Technology Department for Synthetic Aperture Radar applications. Like its predecessor algorithm REGI, it employs a powerful iterative multi-scale paradigm to achieve the competing goals of sub-pixel registration accuracy and the ability to handle large initial offsets. Since it is not model based, it allows for high fidelity tracking of spatially varying terrain-induced misregistration. Since it does not rely on image domain phase, it is equally adept at coherent and noncoherent image registration. This document provides a brief history of the registration processors developed by Dept. 5962 leading up to MREG V1.1, a full description of the signal processing steps involved in the algorithm, and a user's manual with application-specific recommendations for CCD, TwoColor MultiView, and SAR stereoscopy.

  16. A Generic Framework for Real-Time Multi-Channel Neuronal Signal Analysis, Telemetry Control, and Sub-Millisecond Latency Feedback Generation

    PubMed Central

    Zrenner, Christoph; Eytan, Danny; Wallach, Avner; Thier, Peter; Marom, Shimon

    2010-01-01

    Distinct modules of the neural circuitry interact with each other and (through the motor-sensory loop) with the environment, forming a complex dynamic system. Neuro-prosthetic devices seeking to modulate or restore CNS function need to interact with the information flow at the level of neural modules electrically, bi-directionally and in real-time. A set of freely available generic tools is presented that allow computationally demanding multi-channel short-latency bi-directional interactions to be realized in in vivo and in vitro preparations using standard PC data acquisition and processing hardware and software (Mathworks Matlab and Simulink). A commercially available 60-channel extracellular multi-electrode recording and stimulation set-up connected to an ex vivo developing cortical neuronal culture is used as a model system to validate the method. We demonstrate how complex high-bandwidth (>10 MBit/s) neural recording data can be analyzed in real-time while simultaneously generating specific complex electrical stimulation feedback with deterministically timed responses at sub-millisecond resolution. PMID:21060803

  17. Parallelization of combinatorial search when solving knapsack optimization problem on computing systems based on multicore processors

    NASA Astrophysics Data System (ADS)

    Rahman, P. A.

    2018-05-01

    This scientific paper deals with a model of the knapsack optimization problem and a method of solving it based on directed combinatorial search in the Boolean space. The author's specialized mathematical model, which decomposes the search zone into separate search spheres, and the algorithm for distributing the search spheres across the cores of a multi-core processor are also discussed. The paper also provides an example of decomposing the search zone into several search spheres and distributing them across the cores of a quad-core processor. Finally, the author's formula for estimating the theoretical maximum computational acceleration, which can be achieved by parallelizing the search zone into search spheres on an unlimited number of processor cores, is given.
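
    The paper's decomposition is not given in closed form in the abstract; one simple way to realize search spheres distributed over cores is to fix the first K item bits per subspace and search each subspace in a separate worker. The sketch below is that generic idea, not the author's algorithm, and the instance data are invented:

        from itertools import product
        from multiprocessing import Pool

        WEIGHTS = [4, 7, 2, 9, 5, 6, 3, 8]
        VALUES = [5, 8, 3, 10, 6, 7, 4, 9]
        CAP, K = 20, 3   # K fixed bits define 2**K search subspaces ("spheres")

        def search_sphere(prefix):
            """Exhaustively search one subspace with the first K choices fixed."""
            best = (-1, ())
            rest = len(WEIGHTS) - len(prefix)
            for tail in product((0, 1), repeat=rest):
                pick = prefix + tail
                w = sum(wi for wi, p in zip(WEIGHTS, pick) if p)
                if w <= CAP:
                    v = sum(vi for vi, p in zip(VALUES, pick) if p)
                    best = max(best, (v, pick))
            return best

        if __name__ == "__main__":
            spheres = list(product((0, 1), repeat=K))
            with Pool() as pool:                  # one sphere per available core
                results = pool.map(search_sphere, spheres)
            print(max(results))                   # global optimum over all spheres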

  18. Processor farming in two-level analysis of historical bridge

    NASA Astrophysics Data System (ADS)

    Krejčí, T.; Kruis, J.; Koudelka, T.; Šejnoha, M.

    2017-11-01

    This contribution presents a processor farming method in connection with a multi-scale analysis. In this method, each macro-scopic integration point or each finite element is connected with a certain meso-scopic problem represented by an appropriate representative volume element (RVE). The solution of a meso-scale problem then provides the effective parameters needed on the macro-scale. Such an analysis is suitable for parallel computing because the meso-scale problems can be distributed among many processors. The application of the processor farming method to a real-world masonry structure is illustrated by an analysis of Charles Bridge in Prague. The three-dimensional numerical model simulates the coupled heat and moisture transfer of one half of arch No. 3, and it is part of a complex hygro-thermo-mechanical analysis which has been developed to determine the influence of climatic loading on the current state of the bridge.
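
    As an illustration of the processor-farming pattern described (and not of the authors' actual FEM code), a master process can farm out one meso-scale RVE problem per macro-scale integration point; the returned parameter names and values below are placeholders:

        from multiprocessing import Pool
        import random

        def solve_rve(point_id):
            """Stand-in for solving one representative volume element (RVE):
            returns effective transport parameters for one macro integration point."""
            random.seed(point_id)
            return point_id, {"k_eff": 1.5 + 0.1 * random.random(),
                              "d_eff": 2e-10 * (1 + 0.05 * random.random())}

        if __name__ == "__main__":
            macro_points = range(64)           # macro-scale integration points
            with Pool() as farm:               # the "processor farm"
                # Unordered map: results arrive as workers finish (load balancing).
                effective = dict(farm.imap_unordered(solve_rve, macro_points))
            print(effective[0])                # parameters fed back to the macro model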

  19. System Architecture For High Speed Sorting Of Potatoes

    NASA Astrophysics Data System (ADS)

    Marchant, J. A.; Onyango, C. M.; Street, M. J.

    1989-03-01

    This paper illustrates an industrial application of vision processing in which potatoes are sorted according to their size and shape at speeds of up to 40 objects per second. The result is a multi-processing approach built around the VME bus. A hardware unit has been designed and constructed to encode the boundary of the potatoes, reducing the amount of data to be processed. A master 68000 processor is used to control this unit and to handle data transfers along the bus. Boundary data is passed to one of three 68010 slave processors, each responsible for a line of potatoes across a conveyor belt. The slave processors calculate attributes such as shape, size and estimated weight of each potato, and the master processor uses this data to operate the sorting mechanism. The system has been interfaced with a commercial grading machine and performance trials are now in progress.

  20. Application of Multi-task Lasso Regression in the Stellar Parametrization

    NASA Astrophysics Data System (ADS)

    Chang, L. N.; Zhang, P. A.

    2015-01-01

    Multi-task learning approaches have attracted increasing attention in the fields of machine learning, computer vision, and artificial intelligence. By utilizing the correlations between tasks, learning multiple related tasks simultaneously is better than learning each task independently. An efficient multi-task Lasso (Least Absolute Shrinkage and Selection Operator) regression algorithm is proposed in this paper to estimate the physical parameters of stellar spectra. It not only makes different physical parameters share common features, but also effectively preserves their own peculiar features. Experiments were done on ELODIE data simulated with a stellar atmospheric model, and on data released by the Sloan Digital Sky Survey (SDSS). The precision of the model is better than those of the methods in the related literature, especially for the surface gravity (lg g) and the chemical abundance ([Fe/H]). In the experiments, we changed the resolution of the spectra and applied noise with different signal-to-noise ratios (SNR) to the spectra, so as to illustrate the stability of the model. The results show that the model is influenced by both the resolution and the noise, but the influence of the noise is larger than that of the resolution. In general, the multi-task Lasso regression algorithm is easy to operate, has strong stability, and can improve the overall accuracy of the model.
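
    The paper's exact setup is not reproduced here, but scikit-learn's MultiTaskLasso conveys the flavor of estimating several stellar parameters jointly with a shared sparsity pattern over spectral features; the data below are synthetic stand-ins:

        import numpy as np
        from sklearn.linear_model import MultiTaskLasso

        rng = np.random.default_rng(0)
        n_spectra, n_features = 200, 50
        X = rng.normal(size=(n_spectra, n_features))        # flux features per spectrum
        true_w = np.zeros((n_features, 3))
        true_w[:5] = rng.normal(size=(5, 3))                # a few shared informative features
        Y = X @ true_w + 0.01 * rng.normal(size=(n_spectra, 3))  # Teff, lg g, [Fe/H] proxies

        model = MultiTaskLasso(alpha=0.05).fit(X, Y)
        # MultiTaskLasso zeroes a feature's coefficients jointly across all tasks:
        print((model.coef_ != 0).any(axis=0).sum(), "features selected for all tasks")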

  1. Stream Processors

    NASA Astrophysics Data System (ADS)

    Erez, Mattan; Dally, William J.

    Stream processors, like other multi-core architectures, partition their functional units and storage into multiple processing elements. In contrast to typical architectures, which contain symmetric general-purpose cores and a cache hierarchy, stream processors have a significantly leaner design. Stream processors are specifically designed for the stream execution model, in which applications have large amounts of explicit parallel computation, structured and predictable control, and memory accesses that can be performed at a coarse granularity. Applications in the streaming model are expressed in a gather-compute-scatter form, yielding programs with explicit control over transferring data to and from on-chip memory. Relying on these characteristics, which are common to many media processing and scientific computing applications, stream architectures redefine the boundary between software and hardware responsibilities, with software bearing much of the complexity required to manage concurrency, locality, and latency tolerance. Thus, stream processors have minimal control, consisting of fetching medium- and coarse-grained instructions and executing them directly on the many ALUs. Moreover, the on-chip storage hierarchy of stream processors is under explicit software control, as is all communication, eliminating the need for complex reactive hardware mechanisms.
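
    The gather-compute-scatter form can be sketched in NumPy terms, purely as an illustration of the programming model (real stream programs manage explicit on-chip memories rather than host arrays):

        import numpy as np

        values = np.arange(10.0)                  # off-chip "memory"
        gather = np.array([0, 2, 4, 6, 8])        # coarse-grained gather indices
        scatter = np.array([1, 3, 5, 7, 9])

        stream = values[gather]                   # gather: bulk load into "on-chip" storage
        stream = stream * 2.0 + 1.0               # compute: dense, fully parallel kernel
        np.add.at(values, scatter, stream)        # scatter: bulk, accumulating write-back
        print(values)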

  2. Mars Observer screen display design for a multimission environment

    NASA Technical Reports Server (NTRS)

    Chafin, Roy L.

    1993-01-01

    The Multi Mission Control Team (MMCT) is responsible for supporting real-time operations of the Mars Observer mission. The team has the responsibility for monitoring the ground data system for the integrity of the telemetry and command data links. It also supports the Mars Observer Spacecraft Team in monitoring spacecraft events. The Data Monitor and Display subsystem (DMD) workstation provides the data interface with the ground data system. DMD workstation displays were developed to support the Mission Controllers in accomplishing their assigned tasks for supporting the Mars Observer mission. The display design concepts that were used in the Mars Observer MMCT displays to minimize the cognitive demands on the controllers and enhance MMCT operations are presented. The Data Monitor and Display subsystem (DMD) is the controller's window into the spacecraft and the ground data system. The DMD is a workstation that provides a variety of formatted data displays to the controller. The displays present both spacecraft telemetry data and ground system monitor data. Some displays are preplanned and developed prior to the operations in which they are used. These are called fixed displays and are quite versatile in format and content. Other displays and plots can be created in real time. These displays have limited formats but flexibility in content. These are called list or message displays. They can be rapidly generated by the controller as needed. The MMCT display repertoire provides a mix of displays appropriate to the needs of the MMCT controllers.

  3. Variations and relationship of quality and NIR spectral characteristics of cotton fibers collected from multi-location field performance trials

    USDA-ARS?s Scientific Manuscript database

    High volume instrumentation (HVI(TM)) and advanced fiber information system (AFIS) measurements are increasingly being utilized as primary and routine means of acquiring fiber quality data by cotton breeders and fiber processors. There is a wealth of information regarding fiber and yarn qualities, but l...

  4. Hierarchical parallel computer architecture defined by computational multidisciplinary mechanics

    NASA Technical Reports Server (NTRS)

    Padovan, Joe; Gute, Doug; Johnson, Keith

    1989-01-01

    The goal is to develop an architecture for parallel processors enabling optimal handling of multi-disciplinary computation of fluid-solid simulations employing finite element and difference schemes. The goals, philosophical and modeling directions, static and dynamic poly trees, example problems, interpolative reduction, and the impact on solvers are shown in viewgraph form.

  5. Multi-Rate Secure Processor Terminal Architecture Study. Volume 1. Terminal Architecture.

    DTIC Science & Technology

    1981-06-01

    together because of the intimate relationship that must be established between the KG devices and the control of those devices to satisfy security... 9.6 kilobit for the ... pass filter function because its time span is larger. The resultant loading is estimated at 260 microseconds out of 833

  6. Multi-Velocity Component LDV

    NASA Technical Reports Server (NTRS)

    Johnson, Dennis A. (Inventor)

    1996-01-01

    A laser Doppler velocimeter uses frequency shifting of a laser beam to provide signal information for each velocity component. A composite electrical signal generated by a light detector is digitized, and a processor produces a discrete Fourier transform based on the digitized electrical signal. The transform includes two peak frequencies corresponding to the two velocity components.
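
    The patent text gives no code; the essence of recovering two frequency-shifted components from the composite detector signal can be illustrated with a DFT peak search (the sample rate, component frequencies, and Hz-to-velocity scale are invented):

        import numpy as np

        rng = np.random.default_rng(0)
        fs, n = 1.0e6, 4000                      # sample rate and record length (invented)
        t = np.arange(n) / fs
        # Composite detector signal: two frequency-shifted components plus noise.
        sig = (np.cos(2 * np.pi * 120e3 * t) + 0.8 * np.cos(2 * np.pi * 310e3 * t)
               + 0.1 * rng.standard_normal(n))

        spec = np.abs(np.fft.rfft(sig * np.hanning(n)))
        freqs = np.fft.rfftfreq(n, 1 / fs)
        peak_hz = np.sort(freqs[np.argsort(spec)[-2:]])   # two dominant peak frequencies
        print(peak_hz * 2.0e-3)                           # illustrative Hz-to-m/s calibration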

  7. Investigation of Large Scale Cortical Models on Clustered Multi-Core Processors

    DTIC Science & Technology

    2013-02-01

    with the bias node (gray) denoted as ww and the weights associated with the remaining first-layer nodes (black) denoted as W. In forming the overall... Implementation of the RBF network on the GPU platform: the Cholesky decomposition algorithm is used to invert the matrix product G^T G.

  8. Real-time video compressing under DSP/BIOS

    NASA Astrophysics Data System (ADS)

    Chen, Qiu-ping; Li, Gui-ju

    2009-10-01

    This paper presents real-time MPEG-4 Simple Profile video compression based on a DSP processor. The programming framework for video compression is built from a TMS320C6416 microprocessor, a TDS510 simulator and a PC. It uses the embedded real-time operating system DSP/BIOS and its API functions to build periodic functions, tasks, interrupts and so on, realizing real-time video compression. To address data transfer within the system, the design exploits the architecture of the C64x DSP, using double buffering and the EDMA data transfer controller to move data from external to internal memory so that data transfer and processing proceed simultaneously; architecture-level optimizations are used to improve the software pipeline. The system uses DSP/BIOS for multi-thread scheduling and achieves high-speed transfer of large amounts of data. Experimental results show the encoder can encode 768x576, 25 frame/s video images in real time.
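
    The ping-pong double-buffering idea (overlap the transfer of one buffer with the processing of the other) can be sketched on a host machine with two threads; this is an analogy for illustration, not the C64x/EDMA code:

        import threading, queue

        empty, full = queue.Queue(), queue.Queue()
        for _ in range(2):                       # two buffers: the "ping" and the "pong"
            empty.put(bytearray(4096))

        def dma_transfer():                      # stands in for the EDMA controller
            for frame in range(8):
                buf = empty.get()                # take a free buffer
                buf[:] = bytes([frame]) * 4096   # "transfer" external -> internal memory
                full.put(buf)
            full.put(None)                       # end of stream

        def encoder():                           # stands in for the compression kernel
            while (buf := full.get()) is not None:
                checksum = sum(buf)              # "process" one buffer while the other fills
                empty.put(buf)                   # return it to the free pool
                print("frame processed, checksum", checksum)

        t = threading.Thread(target=dma_transfer)
        t.start(); encoder(); t.join()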

  9. A study of the parallel algorithm for large-scale DC simulation of nonlinear systems

    NASA Astrophysics Data System (ADS)

    Cortés Udave, Diego Ernesto; Ogrodzki, Jan; Gutiérrez de Anda, Miguel Angel

    Newton-Raphson DC analysis of large-scale nonlinear circuits may be an extremely time-consuming process, even if sparse matrix techniques and bypassing of nonlinear model calculations are used. A slight decrease in the time required for this task may be achieved on multi-core, multithreaded computers if the calculation of the mathematical models for the nonlinear elements, as well as the stamp management of the sparse matrix entries, is handled by concurrent processes. The numerical complexity can be further reduced via circuit decomposition and parallel solution of blocks, taking the BBD matrix structure as a departure point. This block-parallel approach may give a considerable benefit, though it is strongly dependent on the system topology and, of course, on the processor type. This contribution presents an easily parallelizable decomposition-based algorithm for DC simulation and provides a detailed study of its effectiveness.

  10. 2nd Generation QUATARA Flight Computer Project

    NASA Technical Reports Server (NTRS)

    Falker, Jay; Keys, Andrew; Fraticelli, Jose Molina; Capo-Iugo, Pedro; Peeples, Steven

    2015-01-01

    Single-core flight computer boards have been designed, developed, and tested (DD&T) to be flown in small satellites for the last few years. In this project, a prototype flight computer will be designed as a distributed multi-core system containing four microprocessors running code in parallel. This flight computer will be capable of performing multiple computationally intensive tasks such as processing digital and/or analog data, controlling actuator systems, managing cameras, operating robotic manipulators and transmitting/receiving from/to a ground station. In addition, this flight computer will be designed to be fault tolerant both by creating a robust physical hardware connection and by using a software voting scheme to determine each processor's performance. This voting scheme will leverage the work done for the Space Launch System (SLS) flight software. The prototype flight computer will be constructed with Commercial Off-The-Shelf (COTS) components which are estimated to survive for two years in a low-Earth orbit.
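
    The SLS-derived voting scheme itself is not described in the abstract; a generic majority voter over four redundant core outputs might look like this sketch (the quorum size and values are illustrative):

        from collections import Counter

        def vote(outputs, quorum=3):
            """Majority-vote redundant core outputs; flag any disagreeing cores."""
            winner, count = Counter(outputs).most_common(1)[0]
            if count < quorum:
                raise RuntimeError("no quorum: %r" % (outputs,))
            faulty = [i for i, out in enumerate(outputs) if out != winner]
            return winner, faulty

        # Core 2 produced a corrupted result; the voter masks it and reports the core.
        value, suspects = vote([0x5A3C, 0x5A3C, 0xFFFF, 0x5A3C])
        print(hex(value), "suspect cores:", suspects)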

  11. A Framework for Parallel Unstructured Grid Generation for Complex Aerodynamic Simulations

    NASA Technical Reports Server (NTRS)

    Zagaris, George; Pirzadeh, Shahyar Z.; Chrisochoides, Nikos

    2009-01-01

    A framework for parallel unstructured grid generation targeting both shared memory multi-processors and distributed memory architectures is presented. The two fundamental building-blocks of the framework consist of: (1) the Advancing-Partition (AP) method used for domain decomposition and (2) the Advancing Front (AF) method used for mesh generation. Starting from the surface mesh of the computational domain, the AP method is applied recursively to generate a set of sub-domains. Next, the sub-domains are meshed in parallel using the AF method. The recursive nature of domain decomposition naturally maps to a divide-and-conquer algorithm which exhibits inherent parallelism. For the parallel implementation, the Master/Worker pattern is employed to dynamically balance the varying workloads of each task on the set of available CPUs. Performance results by this approach are presented and discussed in detail as well as future work and improvements.

  12. Improving the performance of heterogeneous multi-core processors by modifying the cache coherence protocol

    NASA Astrophysics Data System (ADS)

    Fang, Juan; Hao, Xiaoting; Fan, Qingwen; Chang, Zeqing; Song, Shuying

    2017-05-01

    In a heterogeneous multi-core architecture, CPU and GPU processors are integrated on the same chip, which poses a new challenge for last-level cache management. In this architecture, CPU applications and GPU applications execute concurrently and access the last-level cache. CPU and GPU have different memory access characteristics, and thus differ in their sensitivity to last-level cache (LLC) capacity. For many CPU applications, a reduced share of the LLC can lead to significant performance degradation. On the contrary, GPU applications can tolerate increased memory access latency when there is sufficient thread-level parallelism. Taking into account the memory latency tolerance of GPU programs, this paper presents a method that lets GPU applications access memory directly, bypassing the LLC and leaving most of its space to CPU applications, which improves the performance of CPU applications without affecting the performance of GPU applications. When the CPU application is cache sensitive and the GPU application is insensitive to the cache, the overall performance of the system is improved significantly.

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trędak, Przemysław, E-mail: przemyslaw.tredak@fuw.edu.pl; Rudnicki, Witold R.; Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, ul. Pawińskiego 5a, 02-106 Warsaw

    The second-generation Reactive Bond Order (REBO) empirical potential is commonly used to accurately model a wide range of hydrocarbon materials. It is also extensible to other atom types and interactions. The REBO potential assumes a complex multi-body interaction model that is difficult to represent efficiently in the SIMD or SIMT programming model. Hence, despite its importance, no efficient GPGPU implementation has been developed for this potential. Here we present a detailed description of a highly efficient GPGPU implementation of a molecular dynamics algorithm using the REBO potential. The presented algorithm takes advantage of rarely used properties of the SIMT architecture of a modern GPU to solve difficult synchronization issues that arise in computations of multi-body potentials. Techniques developed for this problem may also be used to achieve efficient solutions of different problems. The performance of the proposed algorithm is assessed using a range of model systems. It is compared to a highly optimized CPU implementation (both single-core and OpenMP) available in the LAMMPS package. These experiments show up to a 6x improvement in force computation time using a single processor of the NVIDIA Tesla K80 compared to a high-end 16-core Intel Xeon processor.

  14. Image processing using Gallium Arsenide (GaAs) technology

    NASA Technical Reports Server (NTRS)

    Miller, Warner H.

    1989-01-01

    The need for increased information return from space-borne imaging systems has grown in the past decade. The use of multi-spectral data has resulted in the need for finer spatial resolution and greater spectral coverage. Onboard signal processing will be necessary in order to utilize the available Tracking and Data Relay Satellite System (TDRSS) communication channel at high efficiency. A generally recognized approach to increasing the efficiency of channel usage is data compression. The compression technique implemented is a differential pulse code modulation (DPCM) scheme with a non-uniform quantizer. The need to advance the state of the art of onboard processing was recognized, and a GaAs integrated circuit technology was chosen. An Adaptive Programmable Processor (APP) chip set was developed which is based on an 8-bit slice general processor. The reason for choosing the compression technique for the Multi-spectral Linear Array (MLA) instrument is described. Also, a description is given of the GaAs integrated circuit chip set, which will demonstrate that data compression can be performed onboard in real time at data rates on the order of 500 Mb/s.
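
    As a reference point for the DPCM-plus-nonuniform-quantizer scheme described (the quantizer levels below are invented, not the MLA design):

        import numpy as np

        # Non-uniform quantizer: fine levels near zero, coarse for large prediction errors.
        LEVELS = np.array([-48, -20, -8, -2, 2, 8, 20, 48], dtype=float)

        def quantize(e):
            return int(np.argmin(np.abs(LEVELS - e)))   # transmit just 3 bits per sample

        def dpcm_encode(samples):
            codes, pred = [], 0.0
            for s in samples:
                code = quantize(s - pred)               # quantize the prediction error
                codes.append(code)
                pred = pred + LEVELS[code]              # track the decoder's reconstruction
            return codes

        def dpcm_decode(codes):
            out, pred = [], 0.0
            for code in codes:
                pred = pred + LEVELS[code]
                out.append(pred)
            return out

        line = np.cumsum(np.random.default_rng(1).integers(-3, 4, 32)).astype(float)
        decoded = dpcm_decode(dpcm_encode(line))
        print(np.max(np.abs(line - decoded)))           # small bounded (granular) error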

  15. A dual-processor multi-frequency implementation of the FINDS algorithm

    NASA Technical Reports Server (NTRS)

    Godiwala, Pankaj M.; Caglayan, Alper K.

    1987-01-01

    This report presents a parallel processing implementation of the FINDS (Fault Inferring Nonlinear Detection System) algorithm on a dual processor configured target flight computer. First, a filter initialization scheme is presented which allows the no-fail filter (NFF) states to be initialized using the first iteration of the flight data. A modified failure isolation strategy, compatible with the new failure detection strategy reported earlier, is discussed and the performance of the new FDI algorithm is analyzed using flight recorded data from the NASA ATOPS B-737 aircraft in a Microwave Landing System (MLS) environment. The results show that low level MLS, IMU, and IAS sensor failures are detected and isolated instantaneously, while accelerometer and rate gyro failures continue to take comparatively longer to detect and isolate. The parallel implementation is accomplished by partitioning the FINDS algorithm into two parts: one based on the translational dynamics and the other based on the rotational kinematics. Finally, a multi-rate implementation of the algorithm is presented yielding significantly low execution times with acceptable estimation and FDI performance.

  16. Telerobot local-remote control architecture for space flight program applications

    NASA Technical Reports Server (NTRS)

    Zimmerman, Wayne; Backes, Paul; Steele, Robert; Long, Mark; Bon, Bruce; Beahan, John

    1993-01-01

    The JPL Supervisory Telerobotics (STELER) Laboratory has developed and demonstrated a unique local-remote robot control architecture which enables management of intermittent communication bus latencies and delays such as those expected for ground-remote operation of Space Station robotic systems via the Tracking and Data Relay Satellite System (TDRSS) communication platform. The current work at JPL in this area has focused on enhancing the technologies and transferring the control architecture to hardware and software environments which are more compatible with projected ground and space operational environments. At the local site, the operator updates the remote worksite model using stereo video and a model overlay/fitting algorithm which outputs the location and orientation of the object in free space. That information is relayed to the robot User Macro Interface (UMI) to enable programming of the robot control macros. This capability runs on a single Silicon Graphics Inc. machine. The operator can employ either manual teleoperation, shared control, or supervised autonomous control to manipulate the intended object. The remote site controller, called the Modular Telerobot Task Execution System (MOTES), runs in a multi-processor VME environment and performs the task sequencing, task execution, trajectory generation, closed loop force/torque control, task parameter monitoring, and reflex action. This paper describes the new STELER architecture implementation, and also documents the results of the recent autonomous docking task execution using the local site and MOTES.

  17. Design of Smart Multi-Functional Integrated Aviation Photoelectric Payload

    NASA Astrophysics Data System (ADS)

    Zhang, X.

    2018-04-01

    To support small UAV reconnaissance missions, we have developed a smart multi-functional integrated aviation photoelectric payload. The payload weighs only 1 kg and has a two-axis stabilized platform with a visible-light task payload, an infrared task payload, laser pointers and a video tracker. The photoelectric payload can complete reconnaissance tasks over the target area (both visible and infrared). Its light weight, small size, full feature set and high integration greatly reduce the constraints on the UAV platform carrying the payload, making it suitable for a wider range of applications. Users of this type of smart multi-functional integrated aviation photoelectric payload can thus better pinpoint ground targets, calibrate artillery, assess strike damage, and support customs surveillance and other tasks.

  18. Distributed digital signal processors for multi-body structures

    NASA Technical Reports Server (NTRS)

    Lee, Gordon K.

    1990-01-01

    Several digital filter designs were investigated which may be used to process sensor data from large space structures and to design digital hardware to implement the distributed signal processing architecture. Several experimental test articles are available at NASA Langley Research Center to evaluate these designs. A summary of some of the digital filter designs is presented, an evaluation of their characteristics relative to control design is discussed, and candidate hardware microcontroller/microcomputer components are given. Future activities include software evaluation of the digital filter designs and actual hardware implementation of some of the signal processor algorithms on an experimental testbed at NASA Langley.

  19. A miniature on-chip multi-functional ECG signal processor with 30 µW ultra-low power consumption.

    PubMed

    Liu, Xin; Zheng, Yuan Jin; Phyu, Myint Wai; Zhao, Bin; Je, Minkyu; Yuan, Xiao Jun

    2010-01-01

    In this paper, a miniature low-power Electrocardiogram (ECG) signal processing application-specific integrated circuit (ASIC) chip is proposed. This chip provides multiple critical functions for ECG analysis using a systematic wavelet transform algorithm and a novel SRAM-based ASIC architecture, while achieving low cost and high performance. Using 0.18 µm CMOS technology and a 1 V power supply, this ASIC chip consumes only 29 µW and occupies an area of 3 mm(2). This on-chip ECG processor is highly suitable for reliable real-time cardiac status monitoring applications.

  20. MARVEL: A knowledge-based productivity enhancement tool for real-time multi-mission and multi-subsystem spacecraft operations

    NASA Astrophysics Data System (ADS)

    Schwuttke, Ursula M.; Veregge, John R.; Angelino, Robert; Childs, Cynthia L.

    1990-10-01

    The Monitor/Analyzer of Real-time Voyager Engineering Link (MARVEL) is described. It is the first automation tool to be used in an online mode for telemetry monitoring and analysis in mission operations. MARVEL combines standard automation techniques with embedded knowledge base systems to simultaneously provide real time monitoring of data from subsystems, near real time analysis of anomaly conditions, and both real time and non-real time user interface functions. MARVEL is currently capable of monitoring the Computer Command Subsystem (CCS), Flight Data Subsystem (FDS), and Attitude and Articulation Control Subsystem (AACS) for both Voyager spacecraft, simultaneously, on a single workstation. The goal of MARVEL is to provide cost savings and productivity enhancement in mission operations and to reduce the need for constant availability of subsystem expertise.

  1. Bovine viral diarrhea viruses (BVDV) and their cousins the HoBi-like viruses: Multi symptom, multi host, multi tasking pathogens

    USDA-ARS?s Scientific Manuscript database

    The term bovine viral diarrhea (BVD) has come to refer to a diverse collection of clinical presentations that include respiratory, enteric and reproductive symptoms accompanied by immunosuppression. While the majority of cases are subclinical in nature two forms exist, mucosal disease and hemorrhag...

  2. Collective Machine Learning: Team Learning and Classification in Multi-Agent Systems

    ERIC Educational Resources Information Center

    Gifford, Christopher M.

    2009-01-01

    This dissertation focuses on multiple heterogeneous, intelligent agents (hardware or software) which collaborate to learn a task and are capable of sharing knowledge. The concept of collaborative learning in multi-agent and multi-robot systems is largely understudied, and represents an area where further research is needed to…

  3. Developing a Multi-Dimensional Evaluation Framework for Faculty Teaching and Service Performance

    ERIC Educational Resources Information Center

    Baker, Diane F.; Neely, Walter P.; Prenshaw, Penelope J.; Taylor, Patrick A.

    2015-01-01

    A task force was created in a small, AACSB-accredited business school to develop a more comprehensive set of standards for faculty performance. The task force relied heavily on faculty input to identify and describe key dimensions that capture effective teaching and service performance. The result is a multi-dimensional framework that will be used…

  4. Task Dependent Prefrontal Dysfunction in Persons with Asperger's Disorder Investigated with Multi-Channel Near-Infrared Spectroscopy

    ERIC Educational Resources Information Center

    Iwanami, Akira; Okajima, Yuka; Ota, Haruhisa; Tani, Masayuki; Yamada, Takashi; Hashimoro, Ryuichiro; Kanai, Chieko; Watanabe, Hiromi; Yamasue, Hidenori; Kawakubo, Yuki; Kato, Nobumasa

    2011-01-01

    Dysfunction of the prefrontal cortex has been previously reported in individuals with Asperger's disorder. In the present study, we used multi-channel near-infrared spectroscopy (NIRS) to detect changes in the oxygenated hemoglobin concentration ([oxy-Hb]) during two verbal fluency tasks. The subjects were 20 individuals with Asperger's disorder…

  5. High-performance sparse matrix-matrix products on Intel KNL and multicore architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nagasaka, Y; Matsuoka, S; Azad, A

    Sparse matrix-matrix multiplication (SpGEMM) is a computational primitive that is widely used in areas ranging from traditional numerical applications to recent big data analysis and machine learning. Although many SpGEMM algorithms have been proposed, hardware-specific optimizations for multi- and many-core processors are lacking, and a detailed analysis of their performance under various use cases and matrices is not available. We first identify and mitigate multiple bottlenecks with memory management and thread scheduling on Intel Xeon Phi (Knights Landing or KNL). Specifically targeting multi- and many-core processors, we develop a hash-table-based algorithm and optimize a heap-based shared-memory SpGEMM algorithm. We examine their performance together with other publicly available codes. Unlike earlier studies, our evaluation also includes use cases that are representative of real graph algorithms, such as multi-source breadth-first search and triangle counting. Our hash-table and heap-based algorithms show significant speedups over existing libraries in the majority of cases, while different algorithms dominate the other scenarios depending on matrix size, sparsity, compression factor, and operation type. We distill the in-depth evaluation results into a recipe for choosing the best SpGEMM algorithm for a target scenario. A critical finding is that hash-table-based SpGEMM gets a significant performance boost if the nonzeros are not required to be sorted within each row of the output matrix.
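
    The row-wise hash-accumulator strategy at the heart of the paper's fast path can be sketched in a few lines of Python. This is a minimal illustration of Gustavson-style SpGEMM with an unsorted per-row hash table, not the authors' KNL-tuned implementation; the dict-of-rows matrix format is an assumption made for brevity.

        # Hash-accumulator (Gustavson-style) SpGEMM on a dict-of-rows format:
        # each matrix maps row index -> list of (column, value) pairs.
        def spgemm_hash(A, B):
            C = {}
            for i, row_a in A.items():
                acc = {}                              # per-row hash-table accumulator
                for k, a_ik in row_a:
                    for j, b_kj in B.get(k, []):
                        acc[j] = acc.get(j, 0.0) + a_ik * b_kj
                C[i] = list(acc.items())              # left unsorted, as in the paper's fast path
            return C

        A = {0: [(0, 1.0), (1, 2.0)], 1: [(1, 3.0)]}
        B = {0: [(1, 4.0)], 1: [(0, 5.0)]}
        print(spgemm_hash(A, B))                      # {0: [(1, 4.0), (0, 10.0)], 1: [(0, 15.0)]}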

  6. Locomotor Adaptation Improves Balance Control, Multitasking Ability and Reduces the Metabolic Cost of Postural Instability

    NASA Technical Reports Server (NTRS)

    Bloomberg, J. J.; Peters, B. T.; Mulavara, A. P.; Brady, R. A.; Batson, C. D.; Miller, C. A.; Ploutz-Snyder, R. J.; Guined, J. R.; Buxton, R. E.; Cohen, H. S.

    2011-01-01

    During exploration-class missions, sensorimotor disturbances may lead to disruption in the ability to ambulate and perform functional tasks during the initial introduction to a novel gravitational environment following a landing on a planetary surface. The overall goal of our current project is to develop a sensorimotor adaptability training program to facilitate rapid adaptation to these environments. We have developed a unique training system comprised of a treadmill placed on a motion-base facing a virtual visual scene. It provides an unstable walking surface combined with incongruent visual flow designed to enhance sensorimotor adaptability. Greater metabolic cost incurred during balance instability means more physical work is required during adaptation to new environments, possibly affecting crewmembers' ability to perform mission critical tasks during early surface operations on planetary expeditions. The goal of this study was to characterize adaptation to a discordant sensory challenge across a number of performance modalities including locomotor stability, multi-tasking ability and metabolic cost. METHODS: Subjects (n=15) walked (4.0 km/h) on a treadmill for an 8-minute baseline walking period followed by 20 minutes of walking (4.0 km/h) with support surface motion (0.3 Hz, sinusoidal lateral motion, peak amplitude 25.4 cm) provided by the treadmill/motion-base system. Stride frequency and auditory reaction time were collected as measures of locomotor stability and multi-tasking ability, respectively. Metabolic data (VO2) were collected via a portable metabolic gas analysis system. RESULTS: At the onset of lateral support surface motion, subjects walking on our treadmill showed an increase in stride frequency and auditory reaction time indicating initial balance and multi-tasking disturbances. During the 20-minute adaptation period, balance control and multi-tasking performance improved. Similarly, throughout the 20-minute adaptation period, VO2 gradually decreased following an initial increase after the onset of support surface motion. DISCUSSION: Results confirmed that walking in discordant conditions not only compromises locomotor stability and the ability to multi-task, but comes at a quantifiable metabolic cost. Importantly, like locomotor stability and multi-tasking ability, metabolic expenditure while walking in discordant sensory conditions improved during adaptation. This confirms that sensorimotor adaptability training can benefit multiple performance parameters central to the successful completion of critical mission tasks.

  7. Microdot - A Four-Bit Microcontroller Designed for Distributed Low-End Computing in Satellites

    NASA Astrophysics Data System (ADS)

    2002-03-01

    Many satellites are an integrated collection of sensors and actuators that require dedicated real-time control. For single-processor systems, additional sensors require an increase in computing power and speed to provide the multi-tasking capability needed to service each sensor. Faster processors cost more and consume more power, which taxes a satellite's power resources and may lead to shorter satellite lifetimes. An alternative design approach is a distributed network of small and low-power microcontrollers designed for space that handle the computing requirements of each individual sensor and actuator. The design of Microdot, a four-bit microcontroller for distributed low-end computing, is presented. The design is based on previous research completed at the Space Electronics Branch, Air Force Research Laboratory (AFRL/VSSE) at Kirtland AFB, NM, and the Air Force Institute of Technology at Wright-Patterson AFB, OH. The Microdot has 29 instructions and a 1K x 4 instruction memory. The distributed computing architecture is based on the Philips Semiconductor I2C Serial Bus Protocol. A prototype was implemented and tested using an Altera Field Programmable Gate Array (FPGA). The prototype was operable up to 9.1 MHz. The design was targeted for fabrication in a radiation-hardened-by-design gate-array cell library for the TSMC 0.35 micrometer CMOS process.

  8. Transcranial direct current stimulation facilitates cognitive multi-task performance differentially depending on anode location and subtask

    PubMed Central

    Scheldrup, Melissa; Greenwood, Pamela M.; McKendrick, Ryan; Strohl, Jon; Bikson, Marom; Alam, Mahtab; McKinley, R. Andy; Parasuraman, Raja

    2014-01-01

    There is a need to facilitate acquisition of real world cognitive multi-tasks that require long periods of training (e.g., air traffic control, intelligence analysis, medicine). Non-invasive brain stimulation—specifically transcranial Direct Current Stimulation (tDCS)—has promise as a method to speed multi-task training. We hypothesized that during acquisition of the complex multi-task Space Fortress, subtasks that require focused attention on ship control would benefit from tDCS aimed at the dorsal attention network while subtasks that require redirection of attention would benefit from tDCS aimed at the right hemisphere ventral attention network. We compared effects of 30 min prefrontal and parietal stimulation to right and left hemispheres on subtask performance during the first 45 min of training. The strongest effects both overall and for ship flying (control and velocity subtasks) were seen with a right parietal (C4, reference to left shoulder) montage, shown by modeling to induce an electric field that includes nodes in both dorsal and ventral attention networks. This is consistent with the re-orienting hypothesis that the ventral attention network is activated along with the dorsal attention network if a new, task-relevant event occurs while visuospatial attention is focused (Corbetta et al., 2008). No effects were seen with anodes over sites that stimulated only dorsal (C3) or only ventral (F10) attention networks. The speed subtask (update memory for symbols) benefited from an F9 anode over left prefrontal cortex. These results argue for development of tDCS as a training aid in real world settings where multi-tasking is critical. PMID:25249958

  9. Transcranial direct current stimulation facilitates cognitive multi-task performance differentially depending on anode location and subtask.

    PubMed

    Scheldrup, Melissa; Greenwood, Pamela M; McKendrick, Ryan; Strohl, Jon; Bikson, Marom; Alam, Mahtab; McKinley, R Andy; Parasuraman, Raja

    2014-01-01

    There is a need to facilitate acquisition of real world cognitive multi-tasks that require long periods of training (e.g., air traffic control, intelligence analysis, medicine). Non-invasive brain stimulation-specifically transcranial Direct Current Stimulation (tDCS)-has promise as a method to speed multi-task training. We hypothesized that during acquisition of the complex multi-task Space Fortress, subtasks that require focused attention on ship control would benefit from tDCS aimed at the dorsal attention network while subtasks that require redirection of attention would benefit from tDCS aimed at the right hemisphere ventral attention network. We compared effects of 30 min prefrontal and parietal stimulation to right and left hemispheres on subtask performance during the first 45 min of training. The strongest effects both overall and for ship flying (control and velocity subtasks) were seen with a right parietal (C4, reference to left shoulder) montage, shown by modeling to induce an electric field that includes nodes in both dorsal and ventral attention networks. This is consistent with the re-orienting hypothesis that the ventral attention network is activated along with the dorsal attention network if a new, task-relevant event occurs while visuospatial attention is focused (Corbetta et al., 2008). No effects were seen with anodes over sites that stimulated only dorsal (C3) or only ventral (F10) attention networks. The speed subtask (update memory for symbols) benefited from an F9 anode over left prefrontal cortex. These results argue for development of tDCS as a training aid in real world settings where multi-tasking is critical.

  10. An Application-Based Performance Characterization of the Columbia Supercluster

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Djomehri, Jahed M.; Hood, Robert; Jin, Haoqiang; Kiris, Cetin; Saini, Subhash

    2005-01-01

    Columbia is a 10,240-processor supercluster consisting of 20 Altix nodes with 512 processors each, and currently ranked as the second-fastest computer in the world. In this paper, we present the performance characteristics of Columbia obtained on up to four computing nodes interconnected via the InfiniBand and/or NUMAlink4 communication fabrics. We evaluate floating-point performance, memory bandwidth, message passing communication speeds, and compilers using a subset of the HPC Challenge benchmarks, and some of the NAS Parallel Benchmarks including the multi-zone versions. We present detailed performance results for three scientific applications of interest to NASA, one from molecular dynamics, and two from computational fluid dynamics. Our results show that both the NUMAlink4 and the InfiniBand hold promise for application scaling to a large number of processors.

  11. Simplifying and speeding the management of intra-node cache coherence

    DOEpatents

    Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Phillip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY

    2012-04-17

    A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally, the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message-passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activities required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

  12. Managing coherence via put/get windows

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Blumrich, Matthias A; Chen, Dong; Coteus, Paul W

    A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally, the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message-passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activities required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

  13. Internet Distribution of Spacecraft Telemetry Data

    NASA Technical Reports Server (NTRS)

    Specht, Ted; Noble, David

    2006-01-01

    Remote Access Multi-mission Processing and Analysis Ground Environment (RAMPAGE) is a Java-language server computer program that enables near-real-time display of spacecraft telemetry data on any authorized client computer that has access to the Internet and is equipped with Web-browser software. In addition to providing a variety of displays of the latest available telemetry data, RAMPAGE can deliver notification of an alarm by electronic mail. Subscribers can then use RAMPAGE displays to determine the state of the spacecraft and formulate a response to the alarm, if necessary. A user can query spacecraft mission data in either binary or comma-separated-value format by use of a Web form or a Practical Extraction and Reporting Language (PERL) script to automate the query process. RAMPAGE runs on Linux and Solaris server computers in the Ground Data System (GDS) of NASA's Jet Propulsion Laboratory and includes components designed specifically to make it compatible with legacy GDS software. The client/server architecture of RAMPAGE and the use of the Java programming language make it possible to utilize a variety of competitive server and client computers, thereby also helping to minimize costs.

  14. Attention in a multi-task environment

    NASA Technical Reports Server (NTRS)

    Andre, Anthony D.; Heers, Susan T.

    1993-01-01

    Two experiments used a low-fidelity multi-task simulation to investigate the effects of cue specificity on task preparation and performance. Subjects performed a continuous compensatory tracking task and were periodically prompted to perform one of several concurrent secondary tasks. The results provide strong evidence that subjects enacted a strategy to actively divert resources towards secondary task preparation only when they had specific information about an upcoming task to be performed. However, this strategy was largely unaffected by the type of task cued (Experiment 1) or its difficulty level (Experiment 2). Overall, subjects seemed aware of both the costs (degraded primary task tracking) and benefits (improved secondary task performance) of cue information. Implications of the present results for computational human performance/workload models are discussed.

  15. Design and realization of the real-time spectrograph controller for LAMOST based on FPGA

    NASA Astrophysics Data System (ADS)

    Wang, Jianing; Wu, Liyan; Zeng, Yizhong; Dai, Songxin; Hu, Zhongwen; Zhu, Yongtian; Wang, Lei; Wu, Zhen; Chen, Yi

    2008-08-01

    A large Schmidt reflector telescope, the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST), is being built in China; it has an effective aperture of 4 meters and can observe the spectra of as many as 4000 objects simultaneously. To serve such a large number of observational objects, the dispersion part is composed of a set of 16 multipurpose fiber-fed double-beam Schmidt spectrographs, each of which has about ten movable components that are accommodated and manipulated in real time by a controller. An industrial Ethernet network connects these 16 spectrograph controllers. The light from stars is fed to the entrance slits of the spectrographs with optical fibers. In this paper, we mainly introduce the design and realization of our real-time controller for the spectrograph. Our design uses the technique of System On Programmable Chip (SOPC) based on a Field Programmable Gate Array (FPGA) and realizes control of the spectrographs through the NIOS II soft-core embedded processor. We encapsulate the stepper motor controller as a reusable intellectual property (IP) core, greatly simplifying the design process and shortening the development time. Under the embedded operating system μC/OS-II, a multi-task control program has been written to realize real-time control of the movable parts of the spectrographs. At present, a number of such controllers have been applied in the spectrographs of LAMOST.

  16. Correlation between model observers in uniform background and human observers in patient liver background for a low-contrast detection task in CT

    NASA Astrophysics Data System (ADS)

    Gong, Hao; Yu, Lifeng; Leng, Shuai; Dilger, Samantha; Zhou, Wei; Ren, Liqiang; McCollough, Cynthia H.

    2018-03-01

    Channelized Hotelling observer (CHO) models have demonstrated strong correlation with human observers (HO) in both single-slice and multi-slice viewing modes in low-contrast detection tasks with uniform background. However, it remains unknown whether the simplest single-slice CHO in a uniform background can be used to predict human observer performance in more realistic tasks that involve patient anatomical background and a multi-slice viewing mode. In this study, we aim to investigate the correlation between a CHO in a uniform water background and human observer performance in a multi-slice viewing mode on patient liver background for a low-contrast lesion detection task. The human observer study was performed on CT images from 7 abdominal CT exams. A noise insertion tool was employed to synthesize CT scans at two additional dose levels. A validated lesion insertion tool was used to numerically insert metastatic liver lesions of various sizes and contrasts into both phantom and patient images. We selected 12 conditions out of 72 possible experimental conditions to evaluate the correlation at various radiation doses, lesion sizes, lesion contrasts and reconstruction algorithms. CHO performance in both single-slice and multi-slice viewing modes was strongly correlated with HO performance; the corresponding Pearson's correlation coefficients were 0.982 (95% confidence interval (CI) [0.936, 0.995]) and 0.989 (95% CI [0.960, 0.997]) in multi-slice and single-slice viewing modes, respectively. This study therefore demonstrates the potential to use the simplest single-slice CHO to assess image quality for more realistic, clinically relevant CT detection tasks.
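
    A single-slice CHO reduces each image to a handful of channel outputs and applies a Hotelling template in channel space. The numpy sketch below assumes a given channel matrix U (Gabor or Laguerre-Gauss channels in practice) and random stand-in images; it is illustrative, not the study's code.

        import numpy as np

        # Single-slice CHO sketch with an assumed channel matrix U.
        # g_sig, g_bkg: (n_pixels, n_images) signal-present / signal-absent images.
        def cho_detectability(g_sig, g_bkg, U):
            v_s, v_b = U.T @ g_sig, U.T @ g_bkg       # channel outputs
            dv = v_s.mean(axis=1) - v_b.mean(axis=1)  # mean channel-output difference
            S = 0.5 * (np.cov(v_s) + np.cov(v_b))     # average channel covariance
            w = np.linalg.solve(S, dv)                # Hotelling template
            return np.sqrt(w @ dv)                    # detectability index d'

        rng = np.random.default_rng(0)
        n_pix, n_img, n_ch = 64, 100, 5
        U = rng.standard_normal((n_pix, n_ch))        # stand-in channels
        g_bkg = rng.standard_normal((n_pix, n_img))
        g_sig = g_bkg + 0.5                           # weak uniform "lesion" signal
        print(cho_detectability(g_sig, g_bkg, U))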

  17. Application of Multi-task Lasso Regression in the Parametrization of Stellar Spectra

    NASA Astrophysics Data System (ADS)

    Chang, Li-Na; Zhang, Pei-Ai

    2015-07-01

    Multi-task learning approaches have attracted increasing attention in the fields of machine learning, computer vision, and artificial intelligence. By utilizing the correlations between tasks, learning multiple related tasks simultaneously is better than learning each task independently. An efficient multi-task Lasso (Least Absolute Shrinkage and Selection Operator) regression algorithm is proposed in this paper to estimate the physical parameters of stellar spectra. It not only obtains information about the common features of the different physical parameters, but also effectively preserves their own peculiar features. Experiments were done based on the ELODIE synthetic spectral data simulated with a stellar atmospheric model, and on the data released by the Sloan Digital Sky Survey (SDSS). The estimation precision of our model is better than those of the methods in the related literature, especially for the estimates of the gravitational acceleration (lg g) and the chemical abundance ([Fe/H]). In the experiments we changed the spectral resolution and applied noise with different signal-to-noise ratios (SNRs) to the spectral data, so as to illustrate the stability of the model. The results show that the model is influenced by both the resolution and the noise, but the influence of the noise is larger than that of the resolution. In general, the multi-task Lasso regression algorithm is easy to operate, highly stable, and can also improve the overall prediction accuracy of the model.
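
    The joint-sparsity idea can be sketched with scikit-learn's MultiTaskLasso, which selects one shared support across all output parameters; the synthetic spectra and regularization value below are assumptions, not the paper's data or settings.

        import numpy as np
        from sklearn.linear_model import MultiTaskLasso

        rng = np.random.default_rng(1)
        n_spectra, n_pixels = 200, 500
        X = rng.standard_normal((n_spectra, n_pixels))      # stand-in flux vectors
        W = np.zeros((3, n_pixels))
        W[:, :20] = rng.standard_normal((3, 20))            # 3 parameters share 20 informative pixels
        Y = X @ W.T + 0.1 * rng.standard_normal((n_spectra, 3))  # Teff, lg g, [Fe/H] stand-ins

        model = MultiTaskLasso(alpha=0.05).fit(X, Y)        # one shared support across tasks
        print((model.coef_ != 0).any(axis=0).sum())         # number of selected pixels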

  18. Enhancing Learning Management Systems Utility for Blind Students: A Task-Oriented, User-Centered, Multi-Method Evaluation Technique

    ERIC Educational Resources Information Center

    Babu, Rakesh; Singh, Rahul

    2013-01-01

    This paper presents a novel task-oriented, user-centered, multi-method evaluation (TUME) technique and shows how it is useful in providing a more complete, practical and solution-oriented assessment of the accessibility and usability of Learning Management Systems (LMS) for blind and visually impaired (BVI) students. Novel components of TUME…

  19. Multi-Robot Coalitions Formation with Deadlines: Complexity Analysis and Solutions

    PubMed Central

    2017-01-01

    Multi-robot task allocation is one of the main problems to address when designing a multi-robot system, especially when the robots form coalitions that must carry out tasks before a deadline. Many factors affect the performance of these systems; among them, this paper focuses on the physical interference effect produced when two or more robots want to access the same point simultaneously. To the best of our knowledge, this paper presents the first formal description of multi-robot task allocation that includes a model of interference. Thanks to this description, the complexity of the allocation problem is analyzed. Moreover, the main contribution of this paper is to provide the conditions under which the optimal solution of the aforementioned allocation problem can be obtained by solving an integer linear program. The optimal results are compared to previous allocation algorithms already proposed by the first two authors of this paper and to a new method proposed in this paper. The results show that the new task allocation algorithms reach more than 80% of the median optimal solution, outperforming previous auction algorithms with a large reduction in execution time. PMID:28118384

  20. Multi-Robot Coalitions Formation with Deadlines: Complexity Analysis and Solutions.

    PubMed

    Guerrero, Jose; Oliver, Gabriel; Valero, Oscar

    2017-01-01

    Multi-robot task allocation is one of the main problems to address when designing a multi-robot system, especially when the robots form coalitions that must carry out tasks before a deadline. Many factors affect the performance of these systems; among them, this paper focuses on the physical interference effect produced when two or more robots want to access the same point simultaneously. To the best of our knowledge, this paper presents the first formal description of multi-robot task allocation that includes a model of interference. Thanks to this description, the complexity of the allocation problem is analyzed. Moreover, the main contribution of this paper is to provide the conditions under which the optimal solution of the aforementioned allocation problem can be obtained by solving an integer linear program. The optimal results are compared to previous allocation algorithms already proposed by the first two authors of this paper and to a new method proposed in this paper. The results show that the new task allocation algorithms reach more than 80% of the median optimal solution, outperforming previous auction algorithms with a large reduction in execution time.
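
    A toy version of the allocation problem can be solved by brute force for small teams. The sketch below assumes a simple interference model in which a coalition of k robots works at rate k / (1 + alpha*(k - 1)); the paper's actual integer-linear-program formulation and interference model are richer.

        from itertools import product

        # Toy coalition-to-task allocation with deadlines; the interference
        # model and all numbers below are assumptions for illustration.
        robots, alpha = ['r1', 'r2', 'r3'], 0.5
        tasks = {'t1': {'work': 4.0, 'deadline': 6.0},
                 't2': {'work': 2.0, 'deadline': 3.0}}

        def makespan(assign):                    # assign: robot -> task name
            longest = 0.0
            for t, spec in tasks.items():
                k = sum(1 for r in robots if assign[r] == t)
                if k == 0:
                    return None                  # every task needs a coalition
                duration = spec['work'] * (1 + alpha * (k - 1)) / k
                if duration > spec['deadline']:
                    return None                  # deadline violated
                longest = max(longest, duration)
            return longest

        candidates = [dict(zip(robots, combo)) for combo in product(tasks, repeat=len(robots))]
        feasible = [a for a in candidates if makespan(a) is not None]
        best = min(feasible, key=makespan)
        print(best, makespan(best))              # two robots on t1, one on t2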

  1. MetAlign 3.0: performance enhancement by efficient use of advances in computer hardware.

    PubMed

    Lommen, Arjen; Kools, Harrie J

    2012-08-01

    A new, multi-threaded version of the GC-MS and LC-MS data processing software, metAlign, has been developed which is able to utilize multiple cores on one PC. This new version was tested using three different multi-core PCs with different operating systems. The performance of noise reduction, baseline correction and peak-picking was 8-19 fold faster compared to the previous version on a single core machine from 2008. The alignment was 5-10 fold faster. Factors influencing the performance enhancement are discussed. Our observations show that performance scales with the increase in processor core numbers we currently see in consumer PC hardware development.

  2. A fast ultrasonic simulation tool based on massively parallel implementations

    NASA Astrophysics Data System (ADS)

    Lambert, Jason; Rougeron, Gilles; Lacassagne, Lionel; Chatillon, Sylvain

    2014-02-01

    This paper presents a CIVA-optimized ultrasonic inspection simulation tool which takes advantage of the power of massively parallel architectures: graphics processing units (GPU) and multi-core general-purpose processors (GPP). This tool is based on the classical approach used in CIVA: the interaction model is based on Kirchhoff, and the ultrasonic field around the defect is computed by the pencil method. The model has been adapted and parallelized for both architectures. At this stage, the configurations addressed by the tool are: multi- and mono-element probes, planar specimens made of simple isotropic materials, and planar rectangular defects or side-drilled holes of small diameter. Validations of the model accuracy and performance measurements are presented.

  3. From Wheatstone to Cameron and beyond: overview in 3-D and 4-D imaging technology

    NASA Astrophysics Data System (ADS)

    Gilbreath, G. Charmaine

    2012-02-01

    This paper reviews three-dimensional (3-D) and four-dimensional (4-D) imaging technology, from Wheatstone through today, with some prognostications for near future applications. This field is rich in variety, subject specialty, and applications. A major trend, multi-view stereoscopy, is moving the field forward to real-time wide-angle 3-D reconstruction as breakthroughs in parallel processing and multi-processor computers enable very fast processing. Real-time holography meets 4-D imaging reconstruction at the goal of achieving real-time, interactive, 3-D imaging. Applications to telesurgery and telemedicine as well as to the needs of the defense and intelligence communities are also discussed.

  4. L2 Word Recognition: Influence of L1 Orthography on Multi-syllabic Word Recognition.

    PubMed

    Hamada, Megumi

    2017-10-01

    L2 reading research suggests that L1 orthographic experience influences L2 word recognition. Nevertheless, the findings on multi-syllabic words in English are still limited despite the fact that a vast majority of words are multi-syllabic. This study investigated whether L1 orthography influences the recognition of multi-syllabic words, focusing on the position of an embedded word. The participants were Arabic ESL learners, Chinese ESL learners, and native speakers of English. The task was a word search task, in which the participants identified a target word embedded in a pseudoword at the initial, middle, or final position. The search accuracy and speed indicated that all groups showed a strong preference for the initial position. The accuracy data further indicated group differences: the Arabic group showed higher accuracy in the final position than the middle, the Chinese group showed the opposite pattern, and the native speakers showed no difference between the two positions. The findings suggest that L2 multi-syllabic word recognition involves unique processes.

  5. Design, Implementation, and Evaluation of a Virtual Shared Memory System in a Multi-Transputer Network.

    DTIC Science & Technology

    1987-12-01

    (Table-of-contents fragments: Synchronization and Data Passing Mechanism; System Shut Down.) ...high performance, fault tolerance, and extensibility. These features are attained by synchronizing and coordinating the distributed multicomputer... synchronizing all processors in the network. In a multitransputer network, processes that communicate with each other do so synchronously. This makes

  6. Software Epistemology

    DTIC Science & Technology

    2016-03-01

    ...in-vitro decision to incubate a startup, Lexumo [7], which is developing a commercial Software as a Service (SaaS) vulnerability assessment... (Acronym-list fragments: LTS, Label Transition System; MUSE, Mining and Understanding Software Enclaves; RTEMS, Real-Time Executive for Multi-processor Systems; SaaS, Software as a Service; SSA, Static Single Assignment; SWE, Software Epistemology; UD/DU, Def-Use/Use-Def Chains (Dataflow Graph).)

  7. Decrypting Information Sensitivity: Risk, Privacy, and Data Protection Law in the United States and the European Union

    ERIC Educational Resources Information Center

    Fazlioglu, Muge

    2017-01-01

    This dissertation examines the risk-based approach to privacy and data protection and the role of information sensitivity within risk management. Determining what information carries the greatest risk is a multi-layered challenge that involves balancing the rights and interests of multiple actors, including data controllers, data processors, and…

  8. Label-aligned Multi-task Feature Learning for Multimodal Classification of Alzheimer’s Disease and Mild Cognitive Impairment

    PubMed Central

    Zu, Chen; Jie, Biao; Liu, Mingxia; Chen, Songcan

    2015-01-01

    Multimodal classification methods using different modalities of imaging and non-imaging data have recently shown great advantages over traditional single-modality-based ones for diagnosis and prognosis of Alzheimer's disease (AD), as well as its prodromal stage, i.e., mild cognitive impairment (MCI). However, to the best of our knowledge, most existing methods focus on mining the relationship across multiple modalities of the same subjects, while ignoring the potentially useful relationship across different subjects. Accordingly, in this paper, we propose a novel learning method for multimodal classification of AD/MCI, by fully exploring the relationships across both modalities and subjects. Specifically, our proposed method includes two subsequent components, i.e., label-aligned multi-task feature selection and multimodal classification. In the first step, feature selection learning from the multiple modalities is treated as a set of different learning tasks, and a group sparsity regularizer is imposed to jointly select a subset of relevant features. Furthermore, to utilize the discriminative information among labeled subjects, a new label-aligned regularization term is added to the objective function of standard multi-task feature selection, where label-alignment means that all multi-modality subjects with the same class labels should be closer in the new feature-reduced space. In the second step, a multi-kernel support vector machine (SVM) is adopted to fuse the selected features from multi-modality data for final classification. To validate our method, we perform experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database using baseline MRI and FDG-PET imaging data. The experimental results demonstrate that our proposed method achieves better classification performance compared with several state-of-the-art methods for multimodal classification of AD/MCI. PMID:26572145

  9. MULTI-CORE AND OPTICAL PROCESSOR RELATED APPLICATIONS RESEARCH AT OAK RIDGE NATIONAL LABORATORY

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barhen, Jacob; Kerekes, Ryan A; ST Charles, Jesse Lee

    2008-01-01

    High-speed parallelization of common tasks holds great promise as a low-risk approach to achieving the significant increases in signal processing and computational performance required for next generation innovations in reconfigurable radio systems. Researchers at the Oak Ridge National Laboratory have been working on exploiting the parallelization offered by this emerging technology and applying it to a variety of problems. This paper will highlight recent experience with several different parallel processors applied to signal processing tasks that are directly relevant to signal processing required for SDR/CR waveforms. The first is the EnLight Optical Core Processor applied to matched filter (MF) correlation processing via fast Fourier transform (FFT) of broadband Doppler-sensitive waveforms (DSW) using active sonar arrays for target tracking. The second is the IBM CELL Broadband Engine applied to a 2-D discrete Fourier transform (DFT) kernel for image processing and frequency domain processing. And the third is the NVIDIA graphical processor applied to document feature clustering. EnLight Optical Core Processor: Optical processing is inherently capable of high parallelism that can be translated to very high performance, low power dissipation computing. The EnLight 256 is a small form factor signal processing chip (5x5 cm2) with a digital optical core that is being developed by an Israeli startup company. As part of its evaluation of foreign technology, ORNL's Center for Engineering Science Advanced Research (CESAR) had access to precursor EnLight 64 Alpha hardware for a preliminary assessment of capabilities in terms of large Fourier transforms for matched filter banks and on applications related to Doppler-sensitive waveforms. This processor is optimized for array operations, which it performs in fixed-point arithmetic at the rate of 16 TeraOPS at 8-bit precision. This is approximately 1000 times faster than the fastest DSP available today. The optical core performs the matrix-vector multiplications, where the nominal matrix size is 256x256. The system clock is 125 MHz. At each clock cycle, 128K multiply-and-add operations are carried out, which yields a peak performance of 16 TeraOPS. IBM Cell Broadband Engine: The Cell processor is the extraordinary result of 5 years of sustained, intensive R&D collaboration (involving over $400M investment) between IBM, Sony, and Toshiba. Its architecture comprises one multithreaded 64-bit PowerPC processor element (PPE) with VMX capabilities and two levels of globally coherent cache, and 8 synergistic processor elements (SPEs). Each SPE consists of a processor (SPU) designed for streaming workloads, local memory, and a globally coherent direct memory access (DMA) engine. Computations are performed in 128-bit wide single instruction multiple data streams (SIMD). An integrated high-bandwidth element interconnect bus (EIB) connects the nine processors and their ports to external memory and to system I/O. The Applied Software Engineering Research (ASER) Group at ORNL is applying the Cell to a variety of text and image analysis applications. Research on Cell-equipped PlayStation3 (PS3) consoles has led to the development of a correlation-based image recognition engine that enables a single PS3 to process images at more than 10X the speed of state-of-the-art single-core processors. NVIDIA Graphics Processing Units: The ASER group is also employing the latest NVIDIA graphical processing units (GPUs) to accelerate clustering of thousands of text documents using recently developed clustering algorithms such as document flocking and affinity propagation.
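
    The matched-filter kernel that recurs in these evaluations is ordinary FFT-based cross-correlation, sketched below in numpy (illustrative only; the ORNL work ran fixed-point variants on the EnLight and Cell hardware).

        import numpy as np

        # FFT-based matched-filter correlation of a noisy signal against a template.
        def matched_filter(signal, template):
            n = signal.size + template.size - 1
            S = np.fft.fft(signal, n)
            T = np.fft.fft(template, n)
            return np.fft.ifft(S * np.conj(T)).real   # cross-correlation via FFT

        rng = np.random.default_rng(2)
        template = rng.standard_normal(64)
        signal = np.concatenate([np.zeros(100), template, np.zeros(100)])
        signal = signal + 0.2 * rng.standard_normal(signal.size)
        print(int(np.argmax(matched_filter(signal, template))))   # peak at the 100-sample offset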

  10. Optimal processor assignment for pipeline computations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Simha, Rahul; Choudhury, Alok N.; Narahari, Bhagirath

    1991-01-01

    The availability of large-scale multitasked parallel architectures introduces the following processor assignment problem for pipelined computations. Given a set of tasks and their precedence constraints, along with their experimentally determined individual response times for different processor sizes, find an assignment of processors to tasks. Two objectives are of interest: minimal response time given a throughput requirement, and maximal throughput given a response time requirement. These assignment problems differ considerably from the classical mapping problem, in which several tasks share a processor; instead, it is assumed that a large number of processors are to be assigned to a relatively small number of tasks. Efficient assignment algorithms were developed for different classes of task structures. For a p-processor system and a series-parallel precedence graph with n constituent tasks, an O(np²) algorithm is provided that finds the optimal assignment for the response time optimization problem; an assignment optimizing the constrained throughput can be found in O(np² log p) time. Special cases of linear, independent, and tree graphs are also considered.
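
    For the series case, the response-time problem is a textbook dynamic program: choose how many of the p processors each of the n stages receives, given measured per-stage times f[i][k]. The sketch below is a minimal O(np²) illustration with made-up stage times, not the paper's algorithm for general series-parallel graphs.

        # O(n p^2) split of p processors among n pipeline stages in series,
        # minimizing total response time given per-stage times f[i][k].
        def assign(f, p):
            n = len(f)
            INF = float('inf')
            best = [[INF] * (p + 1) for _ in range(n + 1)]
            pick = [[0] * (p + 1) for _ in range(n + 1)]
            best[0][0] = 0.0
            for i in range(1, n + 1):
                for used in range(1, p + 1):
                    for k in range(1, used + 1):
                        cand = best[i - 1][used - k] + f[i - 1][k]
                        if cand < best[i][used]:
                            best[i][used], pick[i][used] = cand, k
            used = min(range(n, p + 1), key=lambda u: best[n][u])
            alloc = []
            for i in range(n, 0, -1):
                alloc.append(pick[i][used])
                used -= pick[i][used]
            return alloc[::-1]

        # Toy stages: time = work/k plus a fixed overhead, for k = 1..8 processors.
        f = [[None] + [w / k + 0.1 for k in range(1, 9)] for w in (4.0, 2.0, 1.0)]
        print(assign(f, 8))      # heavier stages receive more processors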

  11. List-mode PET image reconstruction for motion correction using the Intel XEON PHI co-processor

    NASA Astrophysics Data System (ADS)

    Ryder, W. J.; Angelis, G. I.; Bashar, R.; Gillam, J. E.; Fulton, R.; Meikle, S.

    2014-03-01

    List-mode image reconstruction with motion correction is computationally expensive, as it requires projection of hundreds of millions of rays through a 3D array. To decrease reconstruction time it is possible to use symmetric multiprocessing computers or graphics processing units. The former can have high financial costs, while the latter can require refactoring of algorithms. The Xeon Phi is a new co-processor card with a Many Integrated Core architecture that can run 4 multiple-instruction, multiple-data threads per core, with each thread having a 512-bit single-instruction, multiple-data vector register. Thus, it is possible to run in the region of 220 threads simultaneously. The aim of this study was to investigate whether the Xeon Phi co-processor card is a viable alternative to an x86 Linux server for accelerating list-mode PET image reconstruction for motion correction. An existing list-mode image reconstruction algorithm with motion correction was ported to run on the Xeon Phi coprocessor with the multi-threading implemented using pthreads. There were no differences between images reconstructed using the Phi co-processor card and images reconstructed using the same algorithm run on a Linux server. However, it was found that the reconstruction runtimes were 3 times greater for the Phi than the server. A new version of the image reconstruction algorithm was developed in C++ using OpenMP for multi-threading, and the Phi runtimes decreased to 1.67 times that of the host Linux server. Data transfer from the host to the co-processor card was found to be a rate-limiting step; this needs to be carefully considered in order to maximize runtime speeds. When considering the purchase price of a Linux workstation with a Xeon Phi co-processor card and a top-of-the-range Linux server, the former is a cost-effective computation resource for list-mode image reconstruction. A multi-Phi workstation could be a viable alternative to cluster computers at a lower cost for medical imaging applications.
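
    The computational core being accelerated is the list-mode MLEM update, in which every detected event is forward-projected and its ratio backprojected. A dense-matrix numpy sketch (the system matrix, sensitivities, and sizes are all stand-ins, not the study's code) looks like this:

        import numpy as np

        # Minimal list-mode MLEM: each detected event contributes one count.
        # A[e, v]: contribution of voxel v to event e (dense stand-in for ray projection).
        def listmode_mlem(A, sens, n_iter=20):
            lam = np.ones(A.shape[1])                # initial image estimate
            for _ in range(n_iter):
                expected = A @ lam                   # forward-project every event
                lam *= (A.T @ (1.0 / expected)) / sens
            return lam

        rng = np.random.default_rng(3)
        A = rng.random((1000, 50)) + 1e-6            # 1000 events, 50 voxels
        sens = A.sum(axis=0)                         # simple sensitivity image
        print(listmode_mlem(A, sens).round(2))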

  12. Behavior-based multi-robot collaboration for autonomous construction tasks

    NASA Technical Reports Server (NTRS)

    Stroupe, Ashley; Huntsberger, Terry; Okon, Avi; Aghazarian, Hrand; Robinson, Matthew

    2005-01-01

    The Robot Construction Crew (RCC) is a heterogeneous multi-robot system for autonomous construction of a structure through assembly of long components. A two-robot team demonstrates component placement into an existing structure in a realistic environment. The task requires component acquisition, cooperative transport, and cooperative precision manipulation. A behavior-based architecture provides adaptability. The RCC approach minimizes computation, power, communication, and sensing for applicability to space-related construction efforts, but the techniques are applicable to terrestrial construction tasks.

  13. Behavior-Based Multi-Robot Collaboration for Autonomous Construction Tasks

    NASA Technical Reports Server (NTRS)

    Stroupe, Ashley; Huntsberger, Terry; Okon, Avi; Aghazarian, Hrand; Robinson, Matthew

    2005-01-01

    We present a heterogeneous multi-robot system for autonomous construction of a structure through assembly of long components. Placement of a component within an existing structure in a realistic environment is demonstrated on a two-robot team. The task requires component acquisition, cooperative transport, and cooperative precision manipulation. For adaptability, the system is designed as a behavior-based architecture. For applicability to space-related construction efforts, computation, power, communication, and sensing are minimized, though the techniques developed are also applicable to terrestrial construction tasks.

  14. On-board fault diagnostics for fly-by-light flight control systems using neural network flight processors

    NASA Astrophysics Data System (ADS)

    Urnes, James M., Sr.; Cushing, John; Bond, William E.; Nunes, Steve

    1996-10-01

    Fly-by-Light control systems offer higher performance for fighter and transport aircraft, with efficient fiber optic data transmission, electric control surface actuation, and multi-channel high-capacity centralized processing combining to provide maximum aircraft flight control system handling qualities and safety. The key to efficient support for these vehicles is timely and accurate fault diagnostics of all control system components. These diagnostic tests are best conducted during flight, when all facts relating to the failure are present. The resulting data can be used by the ground crew for efficient repair and turnaround of the aircraft, saving time and money in support costs. Difficult-to-diagnose (Cannot Duplicate) fault indications average 40-50% of maintenance activities on today's fighter and transport aircraft, adding significantly to fleet support cost. Fiber optic data transmission can support a wealth of data for fault monitoring; the most efficient method of fault diagnostics is accurate modeling of the component response under normal and failed conditions for use in comparison with the actual component flight data. Neural network hardware processors offer an efficient and cost-effective method to install fault diagnostics in flight systems, permitting on-board diagnostic modeling of very complex subsystems. Task 2C of the ARPA FLASH program is a design demonstration of this diagnostics approach, using the very high speed computation of the Adaptive Solutions neural network processor to monitor an advanced electrohydrostatic control surface actuator linked through an AS-1773A fiber optic bus. This paper describes the design approach and projected performance of this on-line diagnostics system.

  15. Telemetry Standards, Part 1

    DTIC Science & Technology

    2015-07-01

    (Fragments of an attribute table: IMAGE FRAME RATE (R-x\IFR-n), PRE-TRIGGER FRAMES (R-x\PTG-n), TOTAL FRAMES (R-x\TOTF-n), EXPOSURE TIME (R-x\EXP-n), SENSOR ROTATION (R-x...). IMAGE FRAME RATE takes the values "0" (single frame), "1" (multi-frame), or "2" (continuous), and is allowed when R\CDT is "IMGIN"; its Chapter 10 status is RO. ...the settings that the user wishes to modify. Return value: a partial IHAL <configuration> element containing only the new settings for

  16. The Rational Behavior Model: A Multi-Paradigm, Tri-Level Software Architecture for the Control of Autonomous Vehicles

    DTIC Science & Technology

    1993-03-01

    ...possible over an RF link when surfaced and over acoustic telemetry when submerged. Lockheed Missiles and Space Company has been awarded the contract to... ACL), is purely hierarchical and consists of three major components: the Data Manager, the ACL Controller, and the Model-Based Reasoner (MBR). The... Data Manager receives, processes, and analyzes sensor and status data for use by the MBR and ACL Controller. The ACL Controller communicates commands

  17. Navigating to the Future: Understanding Common Tasks in a Multi-Campus Environment in the Dramatically Changing Acquisition World

    ERIC Educational Resources Information Center

    Bock, Julia; Burgos-Mira, Rosemary

    2010-01-01

    In the context of a diverse, multi-campus university, this study discusses historical factors and recent changes in scholarly communication and the economic impact of these changes as we seek to forge stronger cooperation among our campuses. After consulting the literature, the authors found that issues relating to acquisitions of multi-campus…

  18. Multi-Cultural Competency-Based Vocational Curricula. Machine Trades. Multi-Cultural Competency-Based Vocational/Technical Curricula Series.

    ERIC Educational Resources Information Center

    Hepburn, Larry; Shin, Masako

    This document, one of eight in a multi-cultural competency-based vocational/technical curricula series, is on machine trades. This program is designed to run 36 weeks and cover 6 instructional areas: use of measuring tools; benchwork/tool bit grinding; lathe work; milling work; precision grinding; and combination machine work. A duty-task index…

  19. Reference Models for Multi-Layer Tissue Structures

    DTIC Science & Technology

    2016-09-01

    (Report-documentation-page residue removed; keywords: simulation, finite element analysis.) Physiologically realistic, fully specimen-specific, nonlinear reference models. Tasks: finite element analysis of non-linear mechanics of cadaver... models. Tasks: finite element analysis of non-linear mechanics of multi-layer tissue regions of human subjects. Deliverables: partially subject- and

  20. Continuous Video Modeling to Assist with Completion of Multi-Step Home Living Tasks by Young Adults with Moderate Intellectual Disability

    ERIC Educational Resources Information Center

    Mechling, Linda C.; Ayres, Kevin M.; Bryant, Kathryn J.; Foster, Ashley L.

    2014-01-01

    The current study evaluated a relatively new video-based procedure, continuous video modeling (CVM), to teach multi-step cleaning tasks to high school students with moderate intellectual disability. CVM in contrast to video modeling and video prompting allows repetition of the video model (looping) as many times as needed while the user completes…

  1. Onboard Interferometric SAR Processor for the Ka-Band Radar Interferometer (KaRIn)

    NASA Technical Reports Server (NTRS)

    Esteban-Fernandez, Daniel; Rodriquez, Ernesto; Peral, Eva; Clark, Duane I.; Wu, Xiaoqing

    2011-01-01

    An interferometric synthetic aperture radar (SAR) onboard processor concept and algorithm has been developed for the Ka-band radar interferometer (KaRIn) instrument on the Surface Water and Ocean Topography (SWOT) mission. This is a mission-critical subsystem that will perform interferometric SAR processing and multi-look averaging over the oceans to decrease the data rate by three orders of magnitude, and therefore enable the downlink of the radar data to the ground. The onboard processor performs demodulation, range compression, coregistration, and re-sampling, and forms nine azimuth-squinted beams. For each of them, an interferogram is generated, including common-band spectral filtering to improve correlation, followed by averaging to the final 1 km × 1 km ground resolution pixel. The onboard processor has been prototyped on a custom FPGA-based cPCI board, which will be part of the radar's digital subsystem. The level of complexity of this technology, dictated by the implementation of interferometric SAR processing at high resolution, the extremely tight level of accuracy required, and its implementation on FPGAs, was unprecedented at the time of this reporting for an onboard processor for flight applications.
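
    The central onboard operations, interferogram formation between the two channels followed by multi-look averaging, can be sketched in numpy as below (demodulation, range compression, and coregistration omitted; the synthetic phase ramp is an assumption for illustration).

        import numpy as np

        # Interferogram formation and multi-look averaging between two SLC channels.
        def interferogram(s1, s2, looks=(4, 4)):
            ifg = s1 * np.conj(s2)                         # per-pixel interferogram
            r, c = ifg.shape
            lr, lc = looks
            ifg = ifg[:r - r % lr, :c - c % lc]            # crop to whole looks
            return ifg.reshape(r // lr, lr, -1, lc).mean(axis=(1, 3))   # multi-looking

        rng = np.random.default_rng(4)
        phase = np.linspace(0, 4 * np.pi, 64)              # synthetic across-track phase ramp
        s1 = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))
        s2 = s1 * np.exp(-1j * phase)                      # second channel carries the phase
        print(np.angle(interferogram(s1, s2))[0, :4])      # recovered, noise-reduced phase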

  2. Examining the volume efficiency of the cortical architecture in a multi-processor network model.

    PubMed

    Ruppin, E; Schwartz, E L; Yeshurun, Y

    1993-01-01

    The convoluted form of the sheet-like mammalian cortex naturally raises the question of whether there is a simple geometrical reason for the prevalence of cortical architecture in the brains of higher vertebrates. Addressing this question, we present a formal analysis of the volume occupied by a massively connected network of processors (neurons) and then consider the pertinent cortical data. Three gross macroscopic features of cortical organization are examined: the segregation of white and gray matter, the circumferential organization of the gray matter around the white matter, and the folded cortical structure. Our results testify to the efficiency of cortical architecture.

  3. A Multi-mission Event-Driven Component-Based System for Support of Flight Software Development, ATLO, and Operations first used by the Mars Science Laboratory (MSL) Project

    NASA Technical Reports Server (NTRS)

    Dehghani, Navid; Tankenson, Michael

    2006-01-01

    This paper gives an architectural description of the Mission Data Processing and Control System (MPCS), an event-driven, multi-mission set of ground data processing components providing uplink, downlink, and data management capabilities, which will support the Mars Science Laboratory (MSL) project as its first target mission. MPCS is developed from a set of small reusable components, implemented in Java, each designed with a specific function and well-defined interfaces. An industry-standard messaging bus is used to transfer information among system components. Components generate standard messages, which are used to capture system information as well as triggers to support the event-driven architecture of the system. Event-driven systems are highly desirable for processing high-rate telemetry (science and engineering) data and for supporting automation of many mission operations processes.
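
    The event-driven component style described here amounts to components exchanging standard messages over a bus. A minimal publish/subscribe sketch (purely illustrative; MPCS uses an industry-standard messaging bus, and the topic and message contents below are invented):

        from collections import defaultdict

        # Minimal publish/subscribe event bus; handlers react to standard messages.
        class EventBus:
            def __init__(self):
                self._subs = defaultdict(list)

            def subscribe(self, topic, handler):
                self._subs[topic].append(handler)

            def publish(self, topic, message):
                for handler in self._subs[topic]:
                    handler(message)

        bus = EventBus()
        bus.subscribe('telemetry.frame', lambda m: print('downlink component got', m))
        bus.publish('telemetry.frame', {'vcid': 1, 'length': 1115})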

  4. Modern multicore and manycore architectures: Modelling, optimisation and benchmarking a multiblock CFD code

    NASA Astrophysics Data System (ADS)

    Hadade, Ioan; di Mare, Luca

    2016-08-01

    Modern multicore and manycore processors exhibit multiple levels of parallelism through a wide range of architectural features such as SIMD for data-parallel execution or threads for core parallelism. The exploitation of multi-level parallelism is therefore crucial for achieving superior performance on current and future processors. This paper presents the performance tuning of a multiblock CFD solver on Intel Sandy Bridge and Haswell multicore CPUs and the Intel Xeon Phi Knights Corner coprocessor. Code optimisations have been applied to two computational kernels exhibiting different computational patterns: the update of flow variables and the evaluation of the Roe numerical fluxes. We discuss at great length the code transformations required for achieving efficient SIMD computations for both kernels across the selected devices, including SIMD shuffles and transpositions for flux stencil computations and global memory transformations. Core parallelism is expressed through threading based on a number of domain decomposition techniques, together with optimisations that alleviate the NUMA effects found in multi-socket compute nodes. Results are correlated with the Roofline performance model in order to assess their efficiency for each distinct architecture. We report significant speedups for single-thread execution across both kernels: 2-5X on the multicore CPUs and 14-23X on the Xeon Phi coprocessor. Computations at full node and chip concurrency deliver a factor of three speedup on the multicore processors and up to 24X on the Xeon Phi manycore coprocessor.
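
    The Roofline model referenced above bounds attainable performance by the lesser of peak compute and memory bandwidth times arithmetic intensity. A sketch with illustrative (assumed) machine figures, not measured values from the paper:

        # Attainable performance = min(peak compute, bandwidth x arithmetic intensity).
        def roofline(peak_gflops, bw_gb_per_s, intensity_flops_per_byte):
            return min(peak_gflops, bw_gb_per_s * intensity_flops_per_byte)

        # Assumed machine figures for illustration only.
        for name, peak, bw in [('multicore CPU', 500.0, 60.0), ('Xeon Phi', 1000.0, 170.0)]:
            for ai in (0.5, 2.0, 10.0):
                print(f'{name}: AI={ai:>4} -> {roofline(peak, bw, ai):6.0f} GFLOP/s')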

  5. Data preprocessing for determining outer/inner parallelization in the nested loop problem using OpenMP

    NASA Astrophysics Data System (ADS)

    Handhika, T.; Bustamam, A.; Ernastuti, Kerami, D.

    2017-07-01

    Multi-thread programming using OpenMP on a shared-memory architecture with hyper-threading technology allows a resource to be accessed by multiple processors simultaneously, and each processor can execute more than one thread for a certain period of time. However, the speedup depends on the processor's ability to execute only a limited number of threads, particularly for sequential algorithms containing a nested loop in which the number of outer loop iterations is greater than the maximum number of threads the processor can execute. The thread distribution technique found previously could only be applied by high-level programmers. This paper derives a parallelization procedure for low-level programmers dealing with 2-level nested loop problems in which the maximum number of threads that a processor can execute is smaller than the number of outer loop iterations. Data preprocessing, which considers the number of outer and inner loop iterations, the computational time required to execute each iteration, and the maximum number of threads that the processor can execute, is used as a strategy to determine which parallel region will produce the optimal speedup.
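
    The decision logic can be sketched as a cost comparison between parallelizing the outer and the inner loop under a thread cap; the cost model below is a simplification of the paper's procedure, and all numbers are assumptions.

        import math

        # Estimate which loop to parallelize given trip counts, per-iteration
        # cost, and the processor's thread limit (simplified cost model).
        def choose_loop(n_outer, n_inner, t_iter, max_threads):
            t_outer = math.ceil(n_outer / max_threads) * n_inner * t_iter   # outer parallel
            t_inner = n_outer * math.ceil(n_inner / max_threads) * t_iter   # inner parallel
            return ('outer', t_outer) if t_outer <= t_inner else ('inner', t_inner)

        # Outer trip count (1000) far exceeds the 8-thread limit.
        print(choose_loop(n_outer=1000, n_inner=20, t_iter=1e-6, max_threads=8))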

  6. NeuroFlow: A General Purpose Spiking Neural Network Simulation Platform using Customizable Processors.

    PubMed

    Cheung, Kit; Schultz, Simon R; Luk, Wayne

    2015-01-01

    NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation to deliver optimized performance, such as the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current or conductance based neuronal models such as integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve a real-time performance of 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times the speed of an 8-core processor, or 2.83 times the speed of GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation.
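
    For flavor, a network description of the kind NeuroFlow compiles can be written against the standard PyNN API; the sketch below targets the pyNN.nest reference backend (NeuroFlow supplies its own), and the sizes, rates, and weights are arbitrary assumptions.

        import pyNN.nest as sim

        sim.setup(timestep=0.1)                            # ms
        exc = sim.Population(80, sim.Izhikevich())         # supported neuron model
        inh = sim.Population(20, sim.IF_cond_exp())        # integrate-and-fire model
        noise = sim.Population(80, sim.SpikeSourcePoisson(rate=20.0))
        sim.Projection(noise, exc, sim.OneToOneConnector(),
                       sim.StaticSynapse(weight=0.04, delay=1.0))
        sim.Projection(exc, inh, sim.FixedProbabilityConnector(0.1),
                       sim.StaticSynapse(weight=0.02, delay=1.0))
        exc.record('spikes')
        sim.run(1000.0)                                    # ms
        sim.end()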

  7. NeuroFlow: A General Purpose Spiking Neural Network Simulation Platform using Customizable Processors

    PubMed Central

    Cheung, Kit; Schultz, Simon R.; Luk, Wayne

    2016-01-01

    NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation to deliver optimized performance, such as the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current or conductance based neuronal models such as integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve a real-time performance of 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times the speed of an 8-core processor, or 2.83 times the speed of GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation. PMID:26834542

  8. A Novel Approach to Develop the Lower Order Model of Multi-Input Multi-Output System

    NASA Astrophysics Data System (ADS)

    Rajalakshmy, P.; Dharmalingam, S.; Jayakumar, J.

    2017-10-01

    A mathematical model is a virtual entity that uses mathematical language to describe the behavior of a system. Mathematical models are used particularly in the natural sciences and engineering disciplines like physics, biology, and electrical engineering, as well as in the social sciences like economics, sociology, and political science. Physicists, engineers, computer scientists, and economists use mathematical models most extensively. With the advent of high-performance processors and advanced mathematical computation, it is possible to develop high-performing simulators for complicated Multi-Input Multi-Output (MIMO) systems like quadruple-tank systems, aircraft, boilers, etc. This paper presents the development of the mathematical model of a 500 MW utility boiler, which is a highly complex system. A synergistic combination of operational experience, system identification, and lower-order modeling philosophy has been effectively used to develop a simplified but accurate model of the circulation system of a utility boiler, which is a MIMO system. The results obtained are found to be in good agreement with the physics of the process and with the results obtained through the design procedure. The model obtained can be directly used for control system studies and to realize hardware simulators for boiler testing and operator training.

  9. Abnormal functional specialization within medial prefrontal cortex in high-functioning autism: a multi-voxel similarity analysis

    PubMed Central

    Meuwese, Julia D.I.; Towgood, Karren J.; Frith, Christopher D.; Burgess, Paul W.

    2009-01-01

    Multi-voxel pattern analyses have proved successful in ‘decoding’ mental states from fMRI data, but have not been used to examine brain differences associated with atypical populations. We investigated a group of 16 (14 males) high-functioning participants with autism spectrum disorder (ASD) and 16 non-autistic control participants (12 males) performing two tasks (spatial/verbal) previously shown to activate medial rostral prefrontal cortex (mrPFC). Each task manipulated: (i) attention towards perceptual versus self-generated information and (ii) reflection on another person's mental state (‘mentalizing’ versus ‘non-mentalizing’) in a 2 × 2 design. Behavioral performance and group-level fMRI results were similar between groups. However, multi-voxel similarity analyses revealed strong differences. In control participants, the spatial distribution of activity generalized significantly between task contexts (spatial/verbal) when examining the same function (attention/mentalizing) but not when comparing different functions. This pattern was disrupted in the ASD group, indicating abnormal functional specialization within mrPFC, and demonstrating the applicability of multi-voxel pattern analysis to investigations of atypical populations. PMID:19174370
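
    The similarity analysis can be sketched as follows; the synthetic voxel patterns below are illustrative stand-ins for the study's mrPFC data:

      import numpy as np

      rng = np.random.default_rng(0)
      n_vox = 500
      shared_attention = rng.normal(size=n_vox)   # assumed common component
      patterns = {
          ("spatial", "attention"):   shared_attention + 0.3 * rng.normal(size=n_vox),
          ("verbal",  "attention"):   shared_attention + 0.3 * rng.normal(size=n_vox),
          ("spatial", "mentalizing"): rng.normal(size=n_vox),
          ("verbal",  "mentalizing"): rng.normal(size=n_vox),
      }

      def similarity(a, b):
          # spatial-distribution similarity of two activity patterns
          return np.corrcoef(patterns[a], patterns[b])[0, 1]

      # generalization across task contexts (spatial vs. verbal):
      same_function  = similarity(("spatial", "attention"), ("verbal", "attention"))
      cross_function = similarity(("spatial", "attention"), ("verbal", "mentalizing"))
      print(f"same function r={same_function:.2f}, different function r={cross_function:.2f}")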

  10. Bayesian multi-task learning for decoding multi-subject neuroimaging data.

    PubMed

    Marquand, Andre F; Brammer, Michael; Williams, Steven C R; Doyle, Orla M

    2014-05-15

    Decoding models based on pattern recognition (PR) are becoming increasingly important tools for neuroimaging data analysis. In contrast to alternative (mass-univariate) encoding approaches that use hierarchical models to capture inter-subject variability, inter-subject differences are not typically handled efficiently in PR. In this work, we propose to overcome this problem by recasting the decoding problem in a multi-task learning (MTL) framework. In MTL, a single PR model is used to learn different but related "tasks" simultaneously. The primary advantage of MTL is that it makes more efficient use of the data available and leads to more accurate models by making use of the relationships between tasks. In this work, we construct MTL models where each subject is modelled by a separate task. We use a flexible covariance structure to model the relationships between tasks and induce coupling between them using Gaussian process priors. We present an MTL method for classification problems and demonstrate a novel mapping method suitable for PR models. We apply these MTL approaches to classifying many different contrasts in a publicly available fMRI dataset and show that the proposed MTL methods produce higher decoding accuracy and more consistent discriminative activity patterns than currently used techniques. Our results demonstrate that MTL provides a promising method for multi-subject decoding studies by focusing on the commonalities between a group of subjects rather than the idiosyncratic properties of different subjects. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
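
    The subject-coupling idea can be sketched with a coregionalization-style kernel, K((s, x), (s', x')) = B[s, s'] * k(x, x'); the regression-flavored code and the fixed task covariance B below are simplified stand-ins for the paper's GP classifier with a learned covariance:

      import numpy as np

      def rbf(X1, X2, ell=1.0):
          d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
          return np.exp(-0.5 * d2 / ell ** 2)

      def multitask_kernel(X1, s1, X2, s2, B):
          # B couples subjects (tasks); inputs from correlated subjects
          # get correlated function values
          return B[np.ix_(s1, s2)] * rbf(X1, X2)

      rng = np.random.default_rng(1)
      X = rng.normal(size=(20, 5)); s = rng.integers(0, 3, size=20)  # 3 subjects
      y = rng.normal(size=20)
      B = np.array([[1.0, 0.7, 0.7],
                    [0.7, 1.0, 0.7],
                    [0.7, 0.7, 1.0]])                 # assumed subject covariance

      K = multitask_kernel(X, s, X, s, B) + 1e-3 * np.eye(len(X))
      alpha = np.linalg.solve(K, y)                   # GP posterior weights
      Xq = rng.normal(size=(4, 5)); sq = np.array([0, 1, 2, 0])
      y_pred = multitask_kernel(Xq, sq, X, s, B) @ alpha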

  11. Functional consequences of experience-dependent plasticity on tactile perception following perceptual learning.

    PubMed

    Trzcinski, Natalie K; Gomez-Ramirez, Manuel; Hsiao, Steven S

    2016-09-01

    Continuous training enhances perceptual discrimination and promotes neural changes in areas encoding the experienced stimuli. This type of experience-dependent plasticity has been demonstrated in several sensory and motor systems. Particularly, non-human primates trained to detect consecutive tactile bar indentations across multiple digits showed expanded excitatory receptive fields (RFs) in somatosensory cortex. However, the perceptual implications of these anatomical changes remain undetermined. Here, we trained human participants for 9 days on a tactile task that promoted expansion of multi-digit RFs. Participants were required to detect consecutive indentations of bar stimuli spanning multiple digits. Throughout the training regime we tracked participants' discrimination thresholds on spatial (grating orientation) and temporal tasks on the trained and untrained hands in separate sessions. We hypothesized that training on the multi-digit task would decrease perceptual thresholds on tasks that require stimulus processing across multiple digits, while also increasing thresholds on tasks requiring discrimination on single digits. We observed an increase in orientation thresholds on a single digit. Importantly, this effect was selective for the stimulus orientation and hand used during multi-digit training. We also found that temporal acuity between digits improved across trained digits, suggesting that discriminating the temporal order of multi-digit stimuli can transfer to temporal discrimination of other tactile stimuli. These results suggest that experience-dependent plasticity following perceptual learning improves and interferes with tactile abilities in manners predictive of the task and stimulus features used during training. © 2016 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.

  12. Functional consequences of experience-dependent plasticity on tactile perception following perceptual learning

    PubMed Central

    Trzcinski, Natalie K; Gomez-Ramirez, Manuel; Hsiao, Steven S.

    2016-01-01

    Continuous training enhances perceptual discrimination and promotes neural changes in areas encoding the experienced stimuli. This type of experience-dependent plasticity has been demonstrated in several sensory and motor systems. Particularly, non-human primates trained to detect consecutive tactile bar indentations across multiple digits showed expanded excitatory receptive fields (RFs) in somatosensory cortex. However, the perceptual implications of these anatomical changes remain undetermined. Here, we trained human participants for nine days on a tactile task that promoted expansion of multi-digit RFs. Participants were required to detect consecutive indentations of bar stimuli spanning multiple digits. Throughout the training regime we tracked participants’ discrimination thresholds on spatial (grating orientation) and temporal tasks on the trained and untrained hands in separate sessions. We hypothesized that training on the multi-digit task would decrease perceptual thresholds on tasks that require stimulus processing across multiple digits, while also increasing thresholds on tasks requiring discrimination on single digits. We observed an increase in orientation thresholds on a single digit. Importantly, this effect was selective for the stimulus orientation and hand used during multi-digit training. We also found that temporal acuity between digits improved across trained digits, suggesting that discriminating the temporal order of multi-digit stimuli can transfer to temporal discrimination of other tactile stimuli. These results suggest that experience-dependent plasticity following perceptual learning improves and interferes with tactile abilities in manners predictive of the task and stimulus features used during training. PMID:27422224

  13. Application of Multi-task Sparse Lasso Feature Extraction and Support Vector Machine Regression in the Stellar Atmospheric Parameterization

    NASA Astrophysics Data System (ADS)

    Gao, Wei; Li, Xiang-ru

    2017-07-01

    Multi-task learning analyzes multiple tasks together so as to dig out the correlations among them and thereby improve the accuracy of the analyzed results. Methods of this kind have been widely applied in machine learning, pattern recognition, computer vision, and other related fields. This paper investigates the application of multi-task learning to estimating the stellar atmospheric parameters, including the surface temperature (Teff), surface gravitational acceleration (lg g), and chemical abundance ([Fe/H]). Firstly, the spectral features of the three stellar atmospheric parameters are extracted by using the multi-task sparse group Lasso algorithm; then the support vector machine is used to estimate the atmospheric physical parameters. The proposed scheme is evaluated on both the Sloan stellar spectra and the theoretical spectra computed from Kurucz's New Opacity Distribution Function (NEWODF) model. The mean absolute errors (MAEs) on the Sloan spectra are: 0.0064 for lg (Teff /K), 0.1622 for lg (g/(cm · s-2)), and 0.1221 dex for [Fe/H]; the MAEs on the synthetic spectra are 0.0006 for lg (Teff /K), 0.0098 for lg (g/(cm · s-2)), and 0.0082 dex for [Fe/H]. Experimental results show that the proposed scheme has a rather high accuracy for the estimation of stellar atmospheric parameters.
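
    A rough two-stage sketch in scikit-learn; since scikit-learn has no multi-task sparse group Lasso, MultiTaskLasso stands in for the joint feature selection, and the spectra and labels are mock data:

      import numpy as np
      from sklearn.linear_model import MultiTaskLasso
      from sklearn.svm import SVR

      rng = np.random.default_rng(2)
      X = rng.normal(size=(200, 500))   # mock spectra: 200 stars x 500 pixels
      Y = rng.normal(size=(200, 3))     # mock [lg Teff, lg g, [Fe/H]]

      # stage 1: joint sparse feature selection across the three parameters
      selector = MultiTaskLasso(alpha=0.1).fit(X, Y)
      keep = np.any(selector.coef_ != 0, axis=0)
      X_sel = X[:, keep]

      # stage 2: one support vector regressor per atmospheric parameter
      models = [SVR(C=10.0, gamma="scale").fit(X_sel, Y[:, k]) for k in range(3)]
      teff_pred = models[0].predict(X_sel[:5])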

  14. Symplectic multi-particle tracking on GPUs

    NASA Astrophysics Data System (ADS)

    Liu, Zhicong; Qiang, Ji

    2018-05-01

    A symplectic multi-particle tracking model is implemented on Graphic Processing Units (GPUs) using the Compute Unified Device Architecture (CUDA) language. The symplectic tracking model can preserve phase-space structure and reduce non-physical effects in long-term simulation, which is important for beam property evaluation in particle accelerators. Though this model is computationally expensive, it is very suitable for parallelization and can be accelerated significantly by using GPUs. In this paper, we optimized the implementation of the symplectic tracking model on both a single GPU and multiple GPUs. Using a single GPU processor, the code achieves a factor of 2-10 speedup for a range of problem sizes compared with the time on a single state-of-the-art Central Processing Unit (CPU) node with similar power consumption and semiconductor technology. It also shows good scalability on a multi-GPU cluster at the Oak Ridge Leadership Computing Facility. In an application to beam dynamics simulation, the GPU implementation reduces total computing time by more than a factor of two in comparison to the CPU implementation.
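
    A toy kick-drift loop in NumPy shows the symplectic splitting such a model relies on; the linear restoring kick and all parameters are invented, and the paper's accelerator-specific GPU maps are not reproduced:

      import numpy as np

      def track(x, p, n_turns, dt=0.01, k=1.0):
          for _ in range(n_turns):
              p -= dt * k * x   # kick: momentum update from the force
              x += dt * p       # drift: position update with new momentum
          return x, p           # each half-step is area-preserving

      rng = np.random.default_rng(3)
      x = rng.normal(size=100_000)      # particle coordinates
      p = rng.normal(size=100_000)      # conjugate momenta
      x, p = track(x, p, n_turns=1000)  # per-particle work is independent,
                                        # which is why GPUs suit the model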

  15. Results of SEI Independent Research and Development Projects

    DTIC Science & Technology

    2008-12-01

    contained there. When laptops with dual-core processors came out, iTunes crashed. iTunes was designed as a multi-threaded application, but until...involving product portfolio, in-bound technical marketing, research and development, product engineering, supply chain, and out-bound sales and marketing...of quality and process improvement professionals to the marketing, product engineering, supply chain, product test and sales professionals.

  16. Photonic-Networks-on-Chip for High Performance Radiation Survivable Multi-Core Processor Systems

    DTIC Science & Technology

    2013-12-01

    Loss Spectra” Proceedings of SPIE 8255, (2012) and in a journal publication: M. T. Crowley, D. Murrell, N. Patel, M. Breivik, C.-Y. Lin, Y. Li, B.-O...Crowley, D. Murrell, N. Patel, M. Breivik, C.-Y. Lin, Y. Li, B.-O. Fimland and L. F. Lester, "Analytical Modeling of the Temperature Performance of

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dmitriev, Alexander S.; Yemelyanov, Ruslan Yu.; Moscow Institute of Physics and Technology

    The paper deals with a new multi-element processor platform designed for modelling the behaviour of interacting dynamical systems, i.e., an active wireless network. Experimentally, this ensemble is implemented in an active network, the active nodes of which include direct chaotic transceivers and special actuator boards containing microcontrollers for modelling the dynamical systems and an information display unit (colored LEDs). The modelling technique and experimental results are described and analyzed.

  18. Accelerating the Gillespie Exact Stochastic Simulation Algorithm using hybrid parallel execution on graphics processing units.

    PubMed

    Komarov, Ivan; D'Souza, Roshan M

    2012-01-01

    The Gillespie Stochastic Simulation Algorithm (GSSA) and its variants are cornerstone techniques to simulate reaction kinetics in situations where the concentration of the reactant is too low to allow deterministic techniques such as differential equations. The inherent limitations of the GSSA include the time required for executing a single run and the need for multiple runs for parameter sweep exercises due to the stochastic nature of the simulation. Even very efficient variants of GSSA are prohibitively expensive to compute and perform parameter sweeps. Here we present a novel variant of the exact GSSA that is amenable to acceleration by using graphics processing units (GPUs). We parallelize the execution of a single realization across threads in a warp (fine-grained parallelism). A warp is a collection of threads that are executed synchronously on a single multi-processor. Warps executing in parallel on different multi-processors (coarse-grained parallelism) simultaneously generate multiple trajectories. Novel data-structures and algorithms reduce memory traffic, which is the bottleneck in computing the GSSA. Our benchmarks show an 8×-120× performance gain over various state-of-the-art serial algorithms when simulating different types of models.
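
    A serial direct-method sketch of the algorithm being parallelized (the paper maps one realization to a GPU warp and runs many warps at once); the two-reaction system and rate constants are invented:

      import numpy as np

      def gillespie(x0, stoich, rates, t_end, rng):
          t, x, path = 0.0, np.array(x0, float), [(0.0, list(x0))]
          while t < t_end:
              a = np.array([r(x) for r in rates])   # propensities
              a0 = a.sum()
              if a0 <= 0:
                  break
              t += rng.exponential(1.0 / a0)        # time to next reaction
              j = rng.choice(len(rates), p=a / a0)  # which reaction fires
              x += stoich[j]
              path.append((t, list(x)))
          return path

      rng = np.random.default_rng(4)
      stoich = np.array([[-1, +1],   # A -> B
                         [+1, -1]])  # B -> A
      rates = [lambda x: 1.0 * x[0], lambda x: 0.5 * x[1]]
      trajectory = gillespie([100, 0], stoich, rates, t_end=5.0, rng=rng)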

  19. An ultra-compact processor module based on the R3000

    NASA Astrophysics Data System (ADS)

    Mullenhoff, D. J.; Kaschmitter, J. L.; Lyke, J. C.; Forman, G. A.

    1992-08-01

    Viable high-density packaging is of critical importance for future military systems, particularly spaceborne systems, which require minimum weight and size and high mechanical integrity. A leading, emerging technology for high-density packaging is multi-chip modules (MCM). During the 1980s, a number of different MCM technologies emerged. In support of Strategic Defense Initiative Organization (SDIO) programs, Lawrence Livermore National Laboratory (LLNL) has developed, utilized, and evaluated several different MCM technologies. Prior LLNL efforts include modules developed in 1986, using hybrid wafer scale packaging, which are still operational in an Air Force satellite mission. More recent efforts have included very high density cache memory modules, developed using laser pantography. As part of the demonstration effort, LLNL and Phillips Laboratory began collaborating in 1990 in the Phase 3 Multi-Chip Module (MCM) technology demonstration project. The goal of this program was to demonstrate the feasibility of General Electric's (GE) High Density Interconnect (HDI) MCM technology. The design chosen for this demonstration was the processor core for a MIPS R3000-based reduced instruction set computer (RISC), which has been described previously. It consists of the R3000 microprocessor, R3010 floating point coprocessor and 128 Kbytes of cache memory.

  20. Efficient implementation of the many-body Reactive Bond Order (REBO) potential on GPU

    NASA Astrophysics Data System (ADS)

    Trędak, Przemysław; Rudnicki, Witold R.; Majewski, Jacek A.

    2016-09-01

    The second-generation Reactive Bond Order (REBO) empirical potential is commonly used to accurately model a wide range of hydrocarbon materials. It is also extensible to other atom types and interactions. The REBO potential assumes a complex multi-body interaction model that is difficult to represent efficiently in the SIMD or SIMT programming model. Hence, despite its importance, no efficient GPGPU implementation has been developed for this potential. Here we present a detailed description of a highly efficient GPGPU implementation of a molecular dynamics algorithm using the REBO potential. The presented algorithm takes advantage of rarely used properties of the SIMT architecture of a modern GPU to solve difficult synchronization issues that arise in computations of the multi-body potential. Techniques developed for this problem may also be used to achieve efficient solutions of different problems. The performance of the proposed algorithm is assessed using a range of model systems and compared to a highly optimized CPU implementation (both single-core and OpenMP) available in the LAMMPS package. These experiments show up to a 6x improvement in force computation time using a single processor of the NVIDIA Tesla K80 compared to a high-end 16-core Intel Xeon processor.

  1. Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Secchi, Simone; Tumeo, Antonino; Villa, Oreste

    Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has been proved to be a promising approach to achieve good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation is performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.

  2. NASA Tech Briefs, December 2012

    NASA Technical Reports Server (NTRS)

    2012-01-01

    The topics include: Pattern Generator for Bench Test of Digital Boards; 670-GHz Down- and Up-Converting HEMT-Based Mixers; Lidar Electro-Optic Beam Switch with a Liquid Crystal Variable Retarder; Feedback Augmented Sub-Ranging (FASR) Quantizer; Real-Time Distributed Embedded Oscillator Operating Frequency Monitoring; Software Modules for the Proximity-1 Space Link Interleaved Time Synchronization (PITS) Protocol; Description and User Instructions for the Quaternion to Orbit v3 Software; AdapChem; Mars Relay Lander and Orbiter Overflight Profile Estimation; Extended Testability Analysis Tool; Interactive 3D Mars Visualization; Rapid Diagnostics of Onboard Sequences; MER Telemetry Processor; pyam: Python Implementation of YaM; Process for Patterning Indium for Bump Bonding; Archway for Radiation and Micrometeorite Occurrence Resistance; 4D Light Field Imaging System Using Programmable Aperture; Device and Container for Reheating and Sterilization; Radio Frequency Plasma Discharge Lamps for Use as Stable Calibration Light Sources; Membrane Shell Reflector Segment Antenna; High-Speed Transport of Fluid Drops and Solid Particles via Surface Acoustic Waves; Compact Autonomous Hemispheric Vision System; A Distributive, Non-Destructive, Real-Time Approach to Snowpack Monitoring; Wideband Single-Crystal Transducer for Bone Characterization; Numerical Simulation of Rocket Exhaust Interaction With Lunar Soil; Motion Imagery and Robotics Application (MIRA): Standards-Based Robotics; Particle Filtering for Model-Based Anomaly Detection in Sensor Networks; Ka-band Digitally Beamformed Airborne Radar Using SweepSAR Technique; Composite With In Situ Plenums; Multi-Beam Approach for Accelerating Alignment and Calibration of HyspIRI-Like Imaging Spectrometers; JWST Lifting System; Next-Generation Tumbleweed Rover; Pneumatic System for Concentration of Micrometer-Size Lunar Soil.

  3. Multi-Modal Traveler Information System - Gateway Functional Requirements

    DOT National Transportation Integrated Search

    1997-11-17

    The Multi-Modal Traveler Information System (MMTIS) project involves a large number of Intelligent Transportation System (ITS) related tasks. It involves research of all ITS initiatives in the Gary-Chicago-Milwaukee (GCM) Corridor which are currently...

  4. Multi-Modal Traveler Information System - Gateway Interface Control Requirements

    DOT National Transportation Integrated Search

    1997-10-30

    The Multi-Modal Traveler Information System (MMTIS) project involves a large number of Intelligent Transportation System (ITS) related tasks. It involves research of all ITS initiatives in the Gary-Chicago-Milwaukee (GCM) Corridor which are currently...

  5. Center for Technology for Advanced Scientific Componet Software (TASCS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Govindaraju, Madhusudhan

    Advanced Scientific Computing Research Computer Science FY 2010 Report Center for Technology for Advanced Scientific Component Software: Distributed CCA State University of New York, Binghamton, NY, 13902 Summary The overall objective of Binghamton's involvement is to work on enhancements of the CCA environment, motivated by the applications and research initiatives discussed in the proposal. This year we are working on re-focusing our design and development efforts to develop proof-of-concept implementations that have the potential to significantly impact scientific components. We worked on developing parallel implementations for non-hydrostatic code and worked on a model coupling interface for biogeochemical computations coded in MATLAB. We also worked on the design and implementation of modules that will be required for the emerging MapReduce model to be effective for scientific applications. Finally, we focused on optimizing the processing of scientific datasets on multi-core processors. Research Details We worked on the following research projects, which we are applying to CCA-based scientific applications. 1. Non-Hydrostatic Hydrodynamics: Non-hydrostatic hydrodynamics are significantly more accurate at modeling internal waves that may be important in lake ecosystems. Non-hydrostatic codes, however, are significantly more computationally expensive, often prohibitively so. We have worked with Chin Wu at the University of Wisconsin to parallelize non-hydrostatic code. We have obtained a maximum speedup of about 26 times. Although this is significant progress, we hope to improve the performance further, such that it becomes a practical alternative to hydrostatic codes. 2. Model-coupling for water-based ecosystems: Answering pressing questions about water resources requires that physical models (hydrodynamics) be coupled with biological and chemical models. Most hydrodynamics codes are written in Fortran, however, while most ecologists work in MATLAB. This disconnect creates a great barrier. To address this, we are working on a model coupling interface that will allow biogeochemical computations written in MATLAB to couple with Fortran codes. This will greatly improve the productivity of ecosystem scientists. 3. Low-overhead and Elastic MapReduce Implementation Optimized for Memory- and CPU-Intensive Applications: Since its inception, MapReduce has frequently been associated with Hadoop and large-scale datasets. Its deployment at Amazon in the cloud, and its applications at Yahoo! for large-scale distributed document indexing and database building, among other tasks, have thrust MapReduce to the forefront of the data processing application domain. The applicability of the paradigm, however, extends far beyond its use with data-intensive applications and disk-based systems, and can also be brought to bear in processing small but CPU-intensive distributed applications. MapReduce, however, carries its own burdens. Through experiments using Hadoop in the context of diverse applications, we uncovered latencies and delay conditions potentially inhibiting the expected performance of a parallel execution in CPU-intensive applications. Furthermore, as it currently stands, MapReduce is favored for data-centric applications, and as such tends to be solely applied to disk-based applications. The paradigm falls short in bringing its novelty to diskless systems dedicated to in-memory applications, and to compute-intensive programs processing much smaller data but requiring intensive computations.
In this project, we focused both on the performance of processing large-scale hierarchical data in distributed scientific applications and on the processing of smaller but demanding input sizes primarily used in diskless, memory-resident I/O systems. We designed LEMO-MR [1], a low-overhead, elastic, optimized implementation of MapReduce that is configurable for in-memory applications and offers on-demand fault tolerance, for both on-disk and in-memory applications. We conducted experiments to identify not only the necessary components of this model, but also the trade-offs and factors to be considered. We have initial results showing the efficacy of our implementation in terms of the potential speedup that can be achieved for representative data sets used by cloud applications. We have quantified the performance gains exhibited by our MapReduce implementation over Apache Hadoop in a compute-intensive environment. 4. Cache Performance Optimization for Processing XML and HDF-based Application Data on Multi-core Processors: It is important to design and develop scientific middleware libraries to harness the opportunities presented by emerging multi-core processors. Implementations of scientific middleware and applications that do not adapt to the programming paradigm when executing on emerging processors can severely impact the overall performance. In this project, we focused on the utilization of the L2 cache, which is a critical shared resource on chip multiprocessors (CMP). The access pattern of the shared L2 cache, which is dependent on how the application schedules and assigns processing work to each thread, can either enhance or hurt the ability to hide memory latency on a multi-core processor. Therefore, while processing scientific datasets such as HDF5, it is essential to conduct fine-grained analysis of cache utilization to inform scheduling decisions in multi-threaded programming. In this project, using the TAU toolkit for performance feedback from dual- and quad-core machines, we conducted performance analysis and made recommendations on how processing threads can be scheduled on multi-core nodes to enhance the performance of a class of scientific applications that requires processing of HDF5 data. In particular, we quantified the gains associated with the use of the adaptations we have made to the Cache-Affinity and Balanced-Set scheduling algorithms to improve L2 cache performance, and hence the overall application execution time [2]. References: 1. Zacharia Fadika, Madhusudhan Govindaraju, ``MapReduce Implementation for Memory-Based and Processing Intensive Applications'', accepted in 2nd IEEE International Conference on Cloud Computing Technology and Science, Indianapolis, USA, Nov 30 - Dec 3, 2010. 2. Rajdeep Bhowmik, Madhusudhan Govindaraju, ``Cache Performance Optimization for Processing XML-based Application Data on Multi-core Processors'', in proceedings of The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 17-20, 2010, Melbourne, Victoria, Australia. Contact Information: Madhusudhan Govindaraju, Binghamton University, State University of New York (SUNY), mgovinda@cs.binghamton.edu, Phone: 607-777-4904
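
A toy in-memory MapReduce (word count) illustrating the paradigm discussed above; this is a minimal Python sketch, not LEMO-MR itself:

      from collections import Counter
      from functools import reduce
      from multiprocessing import Pool

      def map_phase(chunk):
          # map task: local word counts for one in-memory chunk
          return Counter(chunk.split())

      def reduce_phase(c1, c2):
          # reduce task: merge two partial counts
          c1.update(c2)
          return c1

      if __name__ == "__main__":
          chunks = ["map reduce map", "in memory map reduce", "cpu intensive reduce"]
          with Pool(3) as pool:
              partials = pool.map(map_phase, chunks)  # parallel map phase
          totals = reduce(reduce_phase, partials)     # single reduce phase
          print(totals.most_common(3))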

  6. An enhanced Ada run-time system for real-time embedded processors

    NASA Technical Reports Server (NTRS)

    Sims, J. T.

    1991-01-01

    An enhanced Ada run-time system has been developed to support real-time embedded processor applications. The primary focus of this development effort has been on the tasking system and the memory management facilities of the run-time system. The tasking system has been extended to support efficient and precise periodic task execution as required for control applications. Event-driven task execution providing a means of task-asynchronous control and communication among Ada tasks is supported in this system. Inter-task control is even provided among tasks distributed on separate physical processors. The memory management system has been enhanced to provide object allocation and protected access support for memory shared between disjoint processors, each of which is executing a distinct Ada program.

  7. Cross-domain and multi-task transfer learning of deep convolutional neural network for breast cancer diagnosis in digital breast tomosynthesis

    NASA Astrophysics Data System (ADS)

    Samala, Ravi K.; Chan, Heang-Ping; Hadjiiski, Lubomir; Helvie, Mark A.; Richter, Caleb; Cha, Kenny

    2018-02-01

    We propose a cross-domain, multi-task transfer learning framework to transfer knowledge learned from non-medical images by a deep convolutional neural network (DCNN) to a medical image recognition task, while improving generalization by multi-task learning of auxiliary tasks. A first-stage cross-domain transfer learning was initiated from an ImageNet-trained DCNN to a mammography-trained DCNN. 19,632 regions of interest (ROIs) from 2,454 mass lesions were collected from two imaging modalities: digitized screen-film mammography (SFM) and full-field digital mammography (DM), and split into training and test sets. In the multi-task transfer learning, the DCNN learned the mass classification task simultaneously from the training sets of SFM and DM. The best transfer network for mammography was selected from three transfer networks with different numbers of convolutional layers frozen. The performance of single-task and multi-task transfer learning on an independent SFM test set, in terms of the area under the receiver operating characteristic curve (AUC), was 0.78+/-0.02 and 0.82+/-0.02, respectively. In the second-stage cross-domain transfer learning, a set of 12,680 ROIs from 317 mass lesions on DBT were split into validation and independent test sets. We first studied the data requirements for the first-stage mammography-trained DCNN by varying the mammography training data from 1% to 100% and evaluated its learning on the DBT validation set in inference mode. We found that the entire available mammography set provided the best generalization. The DBT validation set was then used to train only the last four fully connected layers, resulting in an AUC of 0.90+/-0.04 on the independent DBT test set.
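
    The staged freezing can be sketched in PyTorch; the small network, the frozen boundary, and the mock batch below are illustrative assumptions, not the paper's DCNN or data:

      import torch
      import torch.nn as nn

      model = nn.Sequential(
          nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
          nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
          nn.Flatten(),
          nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
          nn.Linear(64, 1),
      )

      # freeze the convolutional stage learned on the source domain
      for layer in list(model.children())[:6]:
          for p in layer.parameters():
              p.requires_grad = False

      # train only the remaining (fully connected) layers on the target data
      optimizer = torch.optim.Adam(
          (p for p in model.parameters() if p.requires_grad), lr=1e-4)
      criterion = nn.BCEWithLogitsLoss()

      x = torch.randn(8, 1, 64, 64)            # mock ROI batch
      y = torch.randint(0, 2, (8, 1)).float()  # mock labels
      loss = criterion(model(x), y)
      loss.backward()
      optimizer.step()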

  8. On the role of cost-sensitive learning in multi-class brain-computer interfaces.

    PubMed

    Devlaminck, Dieter; Waegeman, Willem; Wyns, Bart; Otte, Georges; Santens, Patrick

    2010-06-01

    Brain-computer interfaces (BCIs) present an alternative way of communication for people with severe disabilities. One of the shortcomings in current BCI systems, recently put forward in the fourth BCI competition, is the asynchronous detection of motor imagery versus resting state. We investigated this extension to the three-class case, in which the resting state is considered virtually lying between two motor classes, resulting in a large penalty when one motor task is misclassified into the other motor class. We particularly focus on the behavior of different machine-learning techniques and on the role of multi-class cost-sensitive learning in such a context. To this end, four different kernel methods are empirically compared, namely pairwise multi-class support vector machines (SVMs), two cost-sensitive multi-class SVMs and kernel-based ordinal regression. The experimental results illustrate that ordinal regression performs better than the other three approaches when a cost-sensitive performance measure such as the mean-squared error is considered. By contrast, multi-class cost-sensitive learning enables us to control the number of large errors made between two motor tasks.
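
    Cost-sensitive decisions of this kind can be sketched by minimizing expected cost under a probability model; the 3 x 3 cost matrix below, penalizing motor-to-motor confusions most, is an assumed example, not the paper's:

      import numpy as np
      from sklearn.svm import SVC

      rng = np.random.default_rng(5)
      X = rng.normal(size=(300, 10))
      y = rng.integers(0, 3, size=300)   # 0 = left motor, 1 = rest, 2 = right motor

      # cost[true, predicted]: confusing the two motor classes costs most
      cost = np.array([[0, 1, 4],
                       [1, 0, 1],
                       [4, 1, 0]])

      clf = SVC(probability=True).fit(X, y)
      proba = clf.predict_proba(X)   # columns follow clf.classes_
      expected_cost = proba @ cost   # expected cost of predicting each class
      y_pred = clf.classes_[np.argmin(expected_cost, axis=1)]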

  9. Unsupervised learning in probabilistic neural networks with multi-state metal-oxide memristive synapses

    NASA Astrophysics Data System (ADS)

    Serb, Alexander; Bill, Johannes; Khiat, Ali; Berdan, Radu; Legenstein, Robert; Prodromakis, Themis

    2016-09-01

    In an increasingly data-rich world, the need for developing computing systems that can not only process, but ideally also interpret, big data is becoming continuously more pressing. Brain-inspired concepts have shown great promise towards addressing this need. Here we demonstrate unsupervised learning in a probabilistic neural network that utilizes metal-oxide memristive devices as multi-state synapses. Our approach can be exploited for processing unlabelled data and can adapt to time-varying clusters that underlie incoming data by supporting the capability of reversible unsupervised learning. The potential of this work is showcased through the demonstration of successful learning in the presence of corrupted input data and probabilistic neurons, thus paving the way towards robust big-data processors.

  10. Load Balancing Strategies for Multi-Block Overset Grid Applications

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Biswas, Rupak; Lopez-Benitez, Noe; Biegel, Bryan (Technical Monitor)

    2002-01-01

    The multi-block overset grid method is a powerful technique for high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process uses a grid system that discretizes the problem domain by using separately generated but overlapping structured grids that periodically update and exchange boundary information through interpolation. For efficient high-performance computation of large-scale realistic applications using this methodology, the individual grids must be properly partitioned among the parallel processors. Overall performance, therefore, largely depends on the quality of load balancing. In this paper, we present three different load balancing strategies for overset grids and analyze their effects on the parallel efficiency of a Navier-Stokes CFD application running on an SGI Origin2000 machine.
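
    One plausible baseline of this kind is greedy largest-first assignment (the paper's three strategies are not reproduced here, and the block sizes are invented): sort grid blocks by cell count and always give the next block to the least loaded processor:

      import heapq

      def balance(block_sizes, n_procs):
          heap = [(0, p, []) for p in range(n_procs)]  # (load, proc id, blocks)
          heapq.heapify(heap)
          for b, size in sorted(enumerate(block_sizes), key=lambda t: -t[1]):
              load, p, blocks = heapq.heappop(heap)    # least loaded processor
              heapq.heappush(heap, (load + size, p, blocks + [b]))
          return sorted(heap, key=lambda t: t[1])

      for load, proc, blocks in balance([120, 80, 300, 45, 150, 95], n_procs=3):
          print(f"proc {proc}: blocks {blocks}, {load} cells")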

  11. Estimation of surface soil moisture and roughness from multi-angular ASAR imagery in the Watershed Allied Telemetry Experimental Research (WATER)

    NASA Astrophysics Data System (ADS)

    Wang, S. G.; Li, X.; Han, X. J.; Jin, R.

    2010-06-01

    Radar remote sensing has demonstrated its applicability to the retrieval of basin-scale soil moisture. The mechanism of radar backscattering from soils is complicated and strongly influenced by surface roughness. Furthermore, retrieval of soil moisture using AIEM-like models is a classic example of the underdetermined problem due to a lack of credible known soil roughness distributions at a regional scale. Characterization of this roughness is therefore crucial for an accurate derivation of soil moisture based on backscattering models. This study aims to directly obtain surface roughness information along with soil moisture from multi-angular ASAR images. The method first used a semi-empirical relationship that connects the roughness slope (Zs) and the difference in backscattering coefficient (Δσ) from ASAR data at different incidence angles, in combination with an optimal calibration form consisting of two roughness parameters (the standard deviation of surface height and the correlation length), to estimate the roughness parameters. The deduced surface roughness was then used in the AIEM model for the retrieval of soil moisture. An evaluation of the proposed method was performed at a grassland site in the middle stream of the Heihe River Basin, where the Watershed Allied Telemetry Experimental Research (WATER) took place. The results demonstrate that the method achieves reliable estimates of soil water content. The key challenge to surface soil moisture retrieval is the presence of vegetation cover, which significantly impacts the estimates of surface roughness and soil moisture.

  12. The planning coordinator: A design architecture for autonomous error recovery and on-line planning of intelligent tasks

    NASA Technical Reports Server (NTRS)

    Farah, Jeffrey J.

    1992-01-01

    Developing a robust, task-level, error recovery and on-line planning architecture is an open research area. There is previously published work on both error recovery and on-line planning; however, none incorporates error recovery and on-line planning into one integrated platform. The integration of these two functionalities requires an architecture that possesses the following characteristics. The architecture must provide for the inclusion of new information without the destruction of existing information. The architecture must provide for the relating of pieces of information, old and new, to one another in a non-trivial rather than trivial manner (e.g., object one is related to object two under the following constraints, versus, yes, they are related; no, they are not related). Finally, the architecture must be not only a stand-alone architecture, but also one that can be easily integrated as a supplement to some existing architecture. This thesis proposal addresses architectural development. Its intent is to integrate error recovery and on-line planning onto a single, integrated, multi-processor platform. This intelligent x-autonomous platform, called the Planning Coordinator, will be used initially to supplement existing x-autonomous systems and eventually replace them.

  13. Smart active pilot-in-the-loop systems

    NASA Astrophysics Data System (ADS)

    Thomas, Segun

    1995-04-01

    Representation of the on-orbit microgravity environment in a 1-g environment is a continuing problem in space engineering analysis, procedures development and crew training. A way of adequately depicting weightlessness in the performance of on-orbit tasks is by a realistic (or real-time) computer-based representation that provides the look, touch, and feel of on-orbit operation. This paper describes how a facility, the Systems Engineering Simulator at the Johnson Space Center, is utilizing recent advances in computer processing power and multi-processing capability to intelligently represent all systems, sub-systems and environmental elements associated with space flight operations. It first describes the computer hardware and interconnection between processors; the computer software responsible for task scheduling, health monitoring, and sub-system and environment representation; and the control room and crew station. It then describes the mathematical models that represent the dynamics of contact between the Mir and the Space Shuttle during the upcoming US and Russian Shuttle/Mir space mission. Results are presented comparing the response of the smart, active pilot-in-the-loop system to a non-time-critical CRAY model. A final example of how these systems are utilized is given in the development that supported the highly successful Hubble Space Telescope repair mission.

  14. Modeling Alzheimer's disease cognitive scores using multi-task sparse group lasso.

    PubMed

    Liu, Xiaoli; Goncalves, André R; Cao, Peng; Zhao, Dazhe; Banerjee, Arindam

    2018-06-01

    Alzheimer's disease (AD) is a severe neurodegenerative disorder characterized by loss of memory and reduction in cognitive functions due to progressive degeneration of neurons and their connections, eventually leading to death. In this paper, we consider the problem of simultaneously predicting several different cognitive scores associated with categorizing subjects as normal, mild cognitive impairment (MCI), or Alzheimer's disease (AD) in a multi-task learning framework using features extracted from brain images obtained from ADNI (Alzheimer's Disease Neuroimaging Initiative). To solve the problem, we present a multi-task sparse group lasso (MT-SGL) framework, which estimates sparse features coupled across tasks, and can work with loss functions associated with any Generalized Linear Model. Through comparisons with a variety of baseline models using multiple evaluation metrics, we illustrate the promising predictive performance of MT-SGL on ADNI, along with its ability to identify brain regions more likely to help characterize Alzheimer's disease progression. Copyright © 2017 Elsevier Ltd. All rights reserved.
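
    The group-sparse coupling can be sketched with a proximal-gradient step that shrinks each feature's coefficients across all cognitive scores jointly; this toy least-squares loop with invented data is not the full MT-SGL solver:

      import numpy as np

      def prox_group_rows(W, lam):
          # row-wise soft threshold: a feature is zeroed for all tasks at once
          norms = np.linalg.norm(W, axis=1, keepdims=True)
          return W * np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))

      rng = np.random.default_rng(6)
      X = rng.normal(size=(100, 50))   # mock imaging features
      Y = rng.normal(size=(100, 4))    # mock cognitive scores (4 tasks)

      W, step = np.zeros((50, 4)), 1e-3
      for _ in range(200):
          grad = X.T @ (X @ W - Y) / len(X)            # least-squares gradient
          W = prox_group_rows(W - step * grad, lam=step * 1.0)
      selected = np.flatnonzero(np.linalg.norm(W, axis=1))  # features kept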

  15. Genotype-phenotype association study via new multi-task learning model

    PubMed Central

    Huo, Zhouyuan; Shen, Dinggang

    2018-01-01

    Research on the associations between genetic variations and imaging phenotypes is developing with the advance in high-throughput genotype and brain image techniques. Regression analysis of single nucleotide polymorphisms (SNPs) and imaging measures as quantitative traits (QTs) has been proposed to identify the quantitative trait loci (QTL) via multi-task learning models. Recent studies consider the interlinked structures within SNPs and imaging QTs through group lasso, e.g. the ℓ2,1-norm, leading to better predictive results and insights into SNPs. However, group sparsity is not enough for representing the correlation between multiple tasks, and ℓ2,1-norm regularization is not robust either. In this paper, we propose a new multi-task learning model to analyze the associations between SNPs and QTs. We suppose that a low-rank structure is also beneficial to uncover the correlation between genetic variations and imaging phenotypes. Finally, we conduct regression analysis of SNPs and QTs. Experimental results show that our model is more accurate in prediction than the compared methods and presents new insights into SNPs. PMID:29218896

  16. Subjective scaling of mental workload in a multi-task environment

    NASA Technical Reports Server (NTRS)

    Daryanian, B.

    1982-01-01

    Those factors in a multi-task environment that contribute to the operators' "sense" of mental workload were identified. Subjective judgment, as the conscious experience of mental effort, was chosen as the appropriate method of measurement. Thurstone's law of comparative judgment was employed in order to construct interval scales of subjective mental workload from paired-comparison data. An experimental paradigm (Simulated Multi-Task Decision-Making Environment) was employed to represent the ideal experimentally controlled environment in which human operators were asked to "attend" to different cases of Tulga's decision-making tasks. Through various statistical analyses it was found that, in general, a lower number of tasks-to-be-processed per unit time (a condition associated with longer interarrival times) results in a lower mental workload, a higher consistency of judgments within a subject, a higher degree of agreement among the subjects, and larger distances between the cases on the Thurstone scale of subjective mental workload. The effects of various control variables and their interactions, and the different characteristics of the subjects, on the variation of subjective mental workload are demonstrated.
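
    The scale construction can be sketched directly (Thurstone Case V): probit-transform the paired-comparison proportions and average across columns; the proportion matrix below is invented:

      import numpy as np
      from scipy.stats import norm

      # P[i, j] = proportion of judgments "condition i has higher workload than j"
      P = np.array([[0.50, 0.70, 0.90],
                    [0.30, 0.50, 0.75],
                    [0.10, 0.25, 0.50]])

      Z = norm.ppf(np.clip(P, 0.01, 0.99))  # z-scores, clipped for stability
      scale = Z.mean(axis=1)                # Case V interval-scale values
      scale -= scale.min()                  # anchor the lowest workload at zero
      print(scale)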

  17. Demonstrating Acquisition of Real-Time Thermal Data Over Fires Utilizing UAVs

    NASA Technical Reports Server (NTRS)

    Ambrosia, Vincent G.; Wegener, Steven S.; Brass, James A.; Buechel, Sally W.; Peterson, David L. (Technical Monitor)

    2002-01-01

    A disaster mitigation demonstration, designed to integrate remote-piloted aerial platforms, a thermal infrared imaging payload, over-the-horizon (OTH) data telemetry and advanced image geo-rectification technologies, was initiated in 2001. Project FiRE incorporates the use of a remotely piloted Uninhabited Aerial Vehicle (UAV), thermal imagery, and over-the-horizon satellite data telemetry to provide geo-corrected data over a controlled burn to a fire management community in near real-time. The experiment demonstrated the use of a thermal multi-spectral scanner, integrated on a large-payload-capacity UAV, distributing data over-the-horizon via satellite communication telemetry equipment, and precision geo-rectification of the resultant data on the ground for data distribution to the Internet. The use of the UAV allowed remote-piloted flight (thereby reducing the potential for loss of human life during hazardous missions), and the ability to "linger and stare" over the fire for extended periods of time (beyond the capabilities of human-pilot endurance). Improved bit-rate telemetry capacity increased the amount, structure, and information content of the image data relayed to the ground. The integration of precision navigation instrumentation allowed improved accuracies in geo-rectification of the resultant imagery, easing data ingestion and overlay in a GIS framework. We focus on these technological advances and demonstrate how these emerging technologies can be readily integrated to support disaster mitigation and monitoring strategies regionally and nationally.

  18. Multi-Modal Traveler Information System - GCM Corridor Architecture Interface Control Requirements

    DOT National Transportation Integrated Search

    1997-10-31

    The Multi-Modal Traveler Information System (MMTIS) project involves a large number of Intelligent Transportation System (ITS) related tasks. It involves research of all ITS initiatives in the Gary-Chicago-Milwaukee (GCM) Corridor which are currently...

  19. Multi-Modal Traveler Information System - GCM Corridor Architecture Functional Requirements

    DOT National Transportation Integrated Search

    1997-11-17

    The Multi-Modal Traveler Information System (MMTIS) project involves a large number of Intelligent Transportation System (ITS) related tasks. It involves research of all ITS initiatives in the Gary-Chicago-Milwaukee (GCM) Corridor which are currently...

  20. Study of multi-LLID technology to support multi-service carrying in EPONs

    NASA Astrophysics Data System (ADS)

    Li, Wang; Yi, Benshun; Cheng, Chuanqing

    2006-09-01

    The Ethernet Passive Optical Network (EPON) has recently attracted more and more research attention, since it is a strong candidate for next-generation access networks. EPON uses a PON structure to carry Ethernet data, combining the advantages of PON and Ethernet devices. Traditionally, EPON has been regarded only as an Ethernet service access platform and considered weak in supporting multiple services, especially real-time services. Clearly, if EPON is designed only to carry data services, it is difficult for EPON devices to meet service providers' demand that EPON serve as an integrated service access platform. Discussing multi-service carrying technology in EPONs is therefore a significant task. This paper presents a novel multi-LLID method to support multi-service carrying in EPONs.

  1. A New Multi-Sensor Track Fusion Architecture for Multi-Sensor Information Integration

    DTIC Science & Technology

    2004-09-01

    Performing organization: Lockheed Martin Aeronautical Systems Company, Marietta, GA. ...tracking process and degrades the track accuracy. ARCHITECTURE OF MULTI-SENSOR TRACK FUSION MODEL: The Alpha

  2. Modulation frequency discrimination with single and multiple channels in cochlear implant users

    PubMed Central

    Galvin, John J.; Oba, Sandy; Başkent, Deniz; Fu, Qian-Jie

    2015-01-01

    Temporal envelope cues convey important speech information for cochlear implant (CI) users. Many studies have explored CI users’ single-channel temporal envelope processing. However, in clinical CI speech processors, temporal envelope information is processed by multiple channels. Previous studies have shown that amplitude modulation frequency discrimination (AMFD) thresholds are better when temporal envelopes are delivered to multiple rather than single channels. In clinical fitting, current levels on single channels must often be reduced to accommodate multi-channel loudness summation. As such, it is unclear whether the multi-channel advantage in AMFD observed in previous studies was due to coherent envelope information distributed across the cochlea or to greater loudness associated with multi-channel stimulation. In this study, single- and multi-channel AMFD thresholds were measured in CI users. Multi-channel component electrodes were either widely or narrowly spaced to vary the degree of overlap between neural populations. The reference amplitude modulation (AM) frequency was 100 Hz, and coherent modulation was applied to all channels. In Experiment 1, single- and multi-channel AMFD thresholds were measured at similar loudness. In this case, current levels on component channels were higher for single- than for multi-channel AM stimuli, and the modulation depth was approximately 100% of the perceptual dynamic range (i.e., between threshold and maximum acceptable loudness). Results showed no significant difference in AMFD thresholds between similarly loud single- and multi-channel modulated stimuli. In Experiment 2, single- and multi-channel AMFD thresholds were compared at substantially different loudness. In this case, current levels on component channels were the same for single- and multi-channel stimuli (“summation-adjusted” current levels) and the same range of modulation (in dB) was applied to the component channels for both single- and multi-channel testing. With the summation-adjusted current levels, loudness was lower with single than with multiple channels and the AM depth resulted in substantial stimulation below single-channel audibility, thereby reducing the perceptual range of AM. Results showed that AMFD thresholds were significantly better with multiple channels than with any of the single component channels. There was no significant effect of the distribution of electrodes on multi-channel AMFD thresholds. The results suggest that increased loudness due to multi-channel summation may contribute to the multi-channel advantage in AMFD, and that overall loudness may matter more than the distribution of envelope information in the cochlea. PMID:25746914

  3. Multi-view L2-SVM and its multi-view core vector machine.

    PubMed

    Huang, Chengquan; Chung, Fu-lai; Wang, Shitong

    2016-03-01

    In this paper, a novel L2-SVM based classifier, Multi-view L2-SVM, is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier does not have any bias in its objective function and hence has flexibility like ν-SVC, in the sense that the number of the yielded support vectors can be controlled by a pre-specified parameter. The proposed Multi-view L2-SVM classifier can make full use of the coherence and the difference of different views through imposing consensus among multiple views to improve the overall classification performance. Besides, based on the generalized core vector machine (GCVM), the proposed Multi-view L2-SVM classifier is extended into its GCVM version, MvCVM, which enables fast training on large-scale multi-view datasets, with asymptotic time complexity linear in the sample size and space complexity independent of the sample size. Our experimental results demonstrated the effectiveness of the proposed Multi-view L2-SVM classifier for small-scale multi-view datasets and the proposed MvCVM classifier for large-scale multi-view datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. A label distance maximum-based classifier for multi-label learning.

    PubMed

    Liu, Xiaoli; Bao, Hang; Zhao, Dazhe; Cao, Peng

    2015-01-01

    Multi-label classification is useful in many bioinformatics tasks such as gene function prediction and protein site localization. This paper presents an improved neural network algorithm, Max Label Distance Back Propagation Algorithm for Multi-Label Classification. The method was formulated by modifying the total error function of the standard BP by adding a penalty term, which was realized by maximizing the distance between the positive and negative labels. Extensive experiments were conducted to compare this method against state-of-the-art multi-label methods on three popular bioinformatic benchmark datasets. The results illustrated that this proposed method is more effective for bioinformatic multi-label classification compared to commonly used techniques.
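
    The penalty idea can be sketched as an extra term rewarding a wide gap between the lowest-scored positive label and the highest-scored negative label; the margin, weight, and example scores are illustrative only:

      import numpy as np

      def sigmoid(z):
          return 1.0 / (1.0 + np.exp(-z))

      def penalized_loss(scores, labels, lam=0.5, margin=1.0):
          # standard per-label cross-entropy error
          p = sigmoid(scores)
          bce = -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))
          # distance between positive and negative label scores
          gap = scores[labels == 1].min() - scores[labels == 0].max()
          penalty = max(0.0, margin - gap)  # zero once positives clear negatives
          return bce + lam * penalty

      loss = penalized_loss(np.array([2.0, -1.0, 0.5]), np.array([1, 0, 1]))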

  5. Radio-nuclide mixture identification using medium energy resolution detectors

    DOEpatents

    Nelson, Karl Einar

    2013-09-17

    According to one embodiment, a method for identifying radio-nuclides includes receiving spectral data, extracting a feature set from the spectral data comparable to a plurality of templates in a template library, and using a branch and bound method to determine a probable template match based on the feature set and templates in the template library. In another embodiment, a device for identifying unknown radio-nuclides includes a processor, a multi-channel analyzer, and a memory operatively coupled to the processor, the memory having computer readable code stored thereon. The computer readable code is configured, when executed by the processor, to receive spectral data, to extract a feature set from the spectral data comparable to a plurality of templates in a template library, and to use a branch and bound method to determine a probable template match based on the feature set and templates in the template library.
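
    In the spirit of the claim, a branch-and-bound sketch: choose at most k library templates whose non-negative combination best fits the measured spectrum, pruning with a relaxed fit; the synthetic library, feature model, and bound here are assumptions, not the patented method:

      import numpy as np
      from scipy.optimize import nnls

      def residual(T, y, allowed):
          if not allowed:
              return float(np.linalg.norm(y))
          _, r = nnls(T[:, sorted(allowed)], y)
          return r

      def branch_and_bound(T, y, k):
          n, best = T.shape[1], (np.inf, frozenset())
          stack = [(frozenset(), frozenset())]          # (included, excluded)
          while stack:
              inc, exc = stack.pop()
              undecided = set(range(n)) - inc - exc
              # adding templates never raises the residual, so this relaxes
              # any completion of the node: prune if it cannot beat the best
              if residual(T, y, inc | undecided) >= best[0]:
                  continue
              r = residual(T, y, inc)
              if r < best[0]:
                  best = (r, inc)
              if len(inc) < k and undecided:
                  j = min(undecided)
                  stack.append((inc | {j}, exc))        # branch: include j
                  stack.append((inc, exc | {j}))        # branch: exclude j
          return best

      rng = np.random.default_rng(7)
      T = np.abs(rng.normal(size=(64, 8)))  # 8 templates x 64 spectral channels
      y = 2.0 * T[:, 1] + 0.5 * T[:, 5]     # true mixture: templates 1 and 5
      score, chosen = branch_and_bound(T, y, k=3)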

  6. Accelerating Climate Simulations Through Hybrid Computing

    NASA Technical Reports Server (NTRS)

    Zhou, Shujia; Sinno, Scott; Cruz, Carlos; Purcell, Mark

    2009-01-01

    Unconventional multi-core processors (e.g., IBM Cell B/E and NVIDIA GPU) have emerged as accelerators in climate simulation. However, climate models typically run on parallel computers with conventional processors (e.g., Intel and AMD) using MPI. Connecting accelerators to this architecture efficiently and easily becomes a critical issue. When using MPI for connection, we identified two challenges: (1) an identical MPI implementation is required in both systems, and (2) existing MPI code must be modified to accommodate the accelerators. In response, we have extended and deployed IBM Dynamic Application Virtualization (DAV) in a hybrid computing prototype system (one blade with two Intel quad-core processors, two IBM QS22 Cell blades, connected with InfiniBand), allowing for seamlessly offloading compute-intensive functions to remote, heterogeneous accelerators in a scalable, load-balanced manner. Currently, a climate solar radiation model running with multiple MPI processes has been offloaded to multiple Cell blades with approx. 10% network overhead.

  7. Reconfigurable L-Band Radar

    NASA Technical Reports Server (NTRS)

    Rincon, Rafael F.

    2008-01-01

    The reconfigurable L-Band radar is an ongoing development at NASA/GSFC that exploits the capability inherent in phased array radar systems, together with a state-of-the-art data acquisition and real-time processor, to enable multi-mode measurement techniques in a single radar architecture. The development leverages the L-Band Imaging Scatterometer, a radar system designed for the development and testing of new radar techniques, and the custom-built DBSAR processor, a highly reconfigurable, high speed data acquisition and processing system. The radar modes currently implemented include scatterometer, synthetic aperture radar, and altimetry; plans to add new modes such as radiometry and bi-static GNSS signals are being formulated. This development is aimed at enhancing radar remote sensing capabilities for airborne and spaceborne applications in support of Earth science and planetary exploration. This paper describes the design of the radar and processor systems, explains the operational modes, and discusses preliminary measurements and future plans.

  8. Cross-Layer Resilience Exploration

    DTIC Science & Technology

    2015-03-31

    complex 563 server-class systems) and any arbitrary fault model (permanent, transient, multi-bit, etc.) System Design Analysis Using flip-flop-level fault injection, we rank the vulnerability of each flip-flop in the processor in terms of its likelihood to propagate faults [3]. This allows the...hardened flip-flops, which are flip-flops designed to uphold the bit representation of their output circuit even under particle strikes [1, 6, 10

  9. A Testbed Processor for Embedded Multicomputing

    DTIC Science & Technology

    1990-04-01

    Gajski 85]. These two problems of parallel expression and performance impact the real-time response of a vehicle system and, consequently, what models...and memory access. The following discussion of these problems is primarily from Gajski and Peir [Gajski 85]. Multi-computers are Multiple Instruction...International Symposium on Unmanned Untethered Submersible Technology, University of New Hampshire, Durham, NH, June 22-24 1987, pp. 33-43. [Gajski 85

  10. Superconducting Qubit with Integrated Single Flux Quantum Controller Part I: Theory and Fabrication

    NASA Astrophysics Data System (ADS)

    Beck, Matthew; Leonard, Edward, Jr.; Thorbeck, Ted; Zhu, Shaojiang; Howington, Caleb; Nelson, Jj; Plourde, Britton; McDermott, Robert

    As the size of quantum processors grows, so do the classical control requirements. The single flux quantum (SFQ) Josephson digital logic family offers an attractive route to proximal classical control of multi-qubit processors. Here we describe coherent control of qubits via trains of SFQ pulses. We discuss the fabrication of an SFQ-based pulse generator and a superconducting transmon qubit on a single chip. Sources of excess microwave loss stemming from the complex multilayer fabrication of the SFQ circuit are discussed. We show how to mitigate this loss through judicious choice of process workflow and appropriate use of sacrificial protection layers. Present address: IBM T.J. Watson Research Center.

  11. Co-Labeling for Multi-View Weakly Labeled Learning.

    PubMed

    Xu, Xinxing; Li, Wen; Xu, Dong; Tsang, Ivor W

    2016-06-01

    It is often expensive and time consuming to collect labeled training samples in many real-world applications. To reduce human effort on annotating training samples, many machine learning techniques (e.g., semi-supervised learning (SSL), multi-instance learning (MIL), etc.) have been studied to exploit weakly labeled training samples. Meanwhile, when the training data is represented with multiple types of features, many multi-view learning methods have shown that classifiers trained on different views can help each other to better utilize the unlabeled training samples for the SSL task. In this paper, we study a new learning problem called multi-view weakly labeled learning, in which we aim to develop a unified approach to learn robust classifiers by effectively utilizing different types of weakly labeled multi-view data from a broad range of tasks including SSL, MIL and relative outlier detection (ROD). We propose an effective approach called co-labeling to solve the multi-view weakly labeled learning problem. Specifically, we model the learning problem on each view as a weakly labeled learning problem, which aims to learn an optimal classifier from a set of pseudo-label vectors generated by using the classifiers trained from other views. Unlike traditional co-training approaches using a single pseudo-label vector for training each classifier, our co-labeling approach explores different strategies to utilize the predictions from different views, biases and iterations for generating the pseudo-label vectors, making our approach more robust for real-world applications. Moreover, to further improve the weakly labeled learning on each view, we also exploit the inherent group structure in the pseudo-label vectors generated from different strategies, which leads to a new multi-layer multiple kernel learning problem. Promising results for text-based image retrieval on the NUS-WIDE dataset as well as news classification and text categorization on several real-world multi-view datasets clearly demonstrate that our proposed co-labeling approach achieves state-of-the-art performance for various multi-view weakly labeled learning problems including multi-view SSL, multi-view MIL and multi-view ROD.
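
    A bare-bones sketch of the co-labeling loop, under the simplifying assumptions of two-class labels, a single pseudo-labeling strategy (majority vote of the other views), and plain logistic regression standing in for the paper's multi-layer multiple kernel learning stage; all names are hypothetical.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def co_label(views_lab, y, views_unlab, n_iters=3):
            clfs = [LogisticRegression(max_iter=1000).fit(X, y) for X in views_lab]
            for _ in range(n_iters):
                for v in range(len(clfs)):
                    # Pseudo-label vector: majority vote of the *other* views.
                    votes = [clfs[u].predict(views_unlab[u])
                             for u in range(len(clfs)) if u != v]
                    pseudo = (np.mean(votes, axis=0) > 0.5).astype(int)
                    X_aug = np.vstack([views_lab[v], views_unlab[v]])
                    y_aug = np.concatenate([y, pseudo])
                    clfs[v] = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
            return clfs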

  12. LOSITAN: a workbench to detect molecular adaptation based on a Fst-outlier method.

    PubMed

    Antao, Tiago; Lopes, Ana; Lopes, Ricardo J; Beja-Pereira, Albano; Luikart, Gordon

    2008-07-28

    Testing for selection is becoming one of the most important steps in the analysis of multilocus population genetics data sets. Existing applications are difficult to use, leaving many non-trivial, error-prone tasks to the user. Here we present LOSITAN, a selection detection workbench based on a well-evaluated Fst-outlier detection method. LOSITAN greatly facilitates correct approximation of model parameters (e.g., genome-wide average, neutral Fst), and provides data import and export functions, iterative contour smoothing, and generation of graphics in an easy-to-use graphical user interface. LOSITAN is able to exploit modern multi-core processor architectures by locally parallelizing fdist, reducing computation time by half on current dual-core machines, with almost linear performance gains on machines with more cores. LOSITAN makes selection detection feasible for a much wider range of users, even for large population genomic datasets, by providing both an easy-to-use interface and the essential functionality to complete the whole selection detection process.

  13. A Study about Kalman Filters Applied to Embedded Sensors

    PubMed Central

    Valade, Aurélien; Acco, Pascal; Grabolosa, Pierre; Fourniols, Jean-Yves

    2017-01-01

    Over the last decade, smart sensors have grown in complexity and can now handle multiple measurement sources. This work establishes a methodology for achieving better estimates of physical values by processing raw measurements within a sensor using multi-physical models and Kalman filters for data fusion. With production cost and power consumption as driving constraints, the methodology focuses on algorithmic complexity while meeting real-time constraints and improving both precision and reliability despite the limitations of low-power processors. Consequently, the processing time available for other tasks is maximized. The known problem of estimating a 2D orientation using an inertial measurement unit with automatic gyroscope bias compensation is used to illustrate the proposed methodology, applied to a low-power STM32L053 microcontroller. This application shows promising results, with a processing time of 1.18 ms at 32 MHz and a 3.8% CPU usage for computation at a 26 Hz measurement and estimation rate. PMID:29206187
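
    As a hedged sketch of the kind of filter the paper describes, the fragment below fuses gyroscope integration with an accelerometer-derived tilt angle in a two-state Kalman filter (angle plus gyro bias) at the 26 Hz rate mentioned above; the noise covariances are invented for illustration.

        import numpy as np

        dt = 1.0 / 26.0                       # 26 Hz rate, as in the abstract
        F = np.array([[1.0, -dt],             # angle -= bias * dt
                      [0.0, 1.0]])            # bias modeled as a random walk
        B = np.array([[dt], [0.0]])           # gyro rate drives the angle
        H = np.array([[1.0, 0.0]])            # accelerometer observes the angle
        Q = np.diag([1e-5, 1e-7])             # process noise (assumed values)
        R = np.array([[1e-2]])                # measurement noise (assumed)

        x = np.zeros((2, 1))                  # state: [angle, gyro_bias]
        P = np.eye(2)

        def step(gyro_rate, accel_angle):
            global x, P
            x = F @ x + B * gyro_rate                 # predict
            P = F @ P @ F.T + Q
            y = np.array([[accel_angle]]) - H @ x     # innovation
            K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
            x = x + K @ y                             # correct
            P = (np.eye(2) - K @ H) @ P
            return float(x[0, 0])                     # bias-compensated angle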

  14. Compilation time analysis to minimize run-time overhead in preemptive scheduling on multiprocessors

    NASA Astrophysics Data System (ADS)

    Wauters, Piet; Lauwereins, Rudy; Peperstraete, J.

    1994-10-01

    This paper describes a scheduling method for hard real-time Digital Signal Processing (DSP) applications implemented on a multi-processor. Due to the very high operating frequencies of DSP applications (typically hundreds of kHz), run-time overhead should be kept as small as possible. Because static scheduling introduces very little run-time overhead, it is used as much as possible. Dynamic pre-emption of tasks is allowed if and only if it leads to better performance in spite of the extra run-time overhead. We essentially combine static scheduling with dynamic pre-emption using static priorities. Since we are dealing with hard real-time applications, we must be able to guarantee at compile time that all timing requirements will be satisfied at run time. We show that our method performs at least as well as any static scheduling method. It also reduces the total number of dynamic pre-emptions compared with run-time methods such as deadline monotonic scheduling.

  15. Resting-state brain activity in the motor cortex reflects task-induced activity: A multi-voxel pattern analysis.

    PubMed

    Kusano, Toshiki; Kurashige, Hiroki; Nambu, Isao; Moriguchi, Yoshiya; Hanakawa, Takashi; Wada, Yasuhiro; Osu, Rieko

    2015-08-01

    It has been suggested that resting-state brain activity reflects task-induced brain activity patterns. In this study, we examined whether neural representations of specific movements can be observed in the resting-state brain activity patterns of motor areas. First, we defined two regions of interest (ROIs) to examine brain activity associated with two different behavioral tasks. Using multi-voxel pattern analysis with regularized logistic regression, we designed a decoder to detect voxel-level neural representations corresponding to the tasks in each ROI. Next, we applied the decoder to resting-state brain activity. We found that the decoder discriminated resting-state neural activity with accuracy comparable to that associated with task-induced neural activity. The distribution of learned weighted parameters for each ROI was similar for resting-state and task-induced activities. Large weighted parameters were mainly located on conjunctive areas. Moreover, the accuracy of detection was higher than that for a decoder whose weights were randomly shuffled, indicating that the resting-state brain activity includes multi-voxel patterns similar to the neural representation for the tasks. Therefore, these results suggest that the neural representation of resting-state brain activity is more finely organized and more complex than conventionally considered.
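
    The decoding step lends itself to a short sketch: an L2-regularized logistic regression trained on task-run voxel patterns and then applied to resting-state patterns. The synthetic arrays below are placeholders standing in for the study's fMRI data.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X_task = rng.normal(size=(80, 500))    # placeholder: 80 trials x 500 voxels
        y_task = rng.integers(0, 2, size=80)   # placeholder: two behavioral tasks
        X_rest = rng.normal(size=(40, 500))    # placeholder resting-state patterns

        decoder = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
        acc = cross_val_score(decoder, X_task, y_task, cv=5).mean()

        decoder.fit(X_task, y_task)
        rest_labels = decoder.predict(X_rest)  # label resting-state patterns
        weights = decoder.coef_                # weight map, as examined in the study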

  16. Parallelizing ATLAS Reconstruction and Simulation: Issues and Optimization Solutions for Scaling on Multi- and Many-CPU Platforms

    NASA Astrophysics Data System (ADS)

    Leggett, C.; Binet, S.; Jackson, K.; Levinthal, D.; Tatarkhanov, M.; Yao, Y.

    2011-12-01

    Thermal limitations have forced CPU manufacturers to shift from simply increasing clock speeds to improve processor performance, to producing chip designs with multi- and many-core architectures. Further, the cores themselves can run multiple threads with zero-overhead context switches, allowing low-level resource sharing (Intel Hyperthreading). To maximize bandwidth and minimize memory latency, memory access has become non-uniform (NUMA). As manufacturers add more cores to each chip, a careful understanding of the underlying architecture is required in order to fully utilize the available resources. We present AthenaMP and the ATLAS event loop manager, the driver of the simulation and reconstruction engines, which have been rewritten to make use of multiple cores by means of event-based parallelism and final-stage I/O synchronization. However, initial studies on 8 and 16 core Intel architectures have shown marked non-linearities as parallel process counts increase, with as much as 30% reductions in event throughput in some scenarios. Since the Intel Nehalem architecture (both Gainestown and Westmere) will be the most common choice for the next round of hardware procurements, an understanding of these scaling issues is essential. Using hardware-based event counters and Intel's Performance Tuning Utility, we have studied the performance bottlenecks at the hardware level and discovered optimization schemes to maximize processor throughput. We have also produced optimization mechanisms, common to all large experiments, that address the extreme nature of today's HEP code, which, due to its size, places huge burdens on the memory infrastructure of today's processors.

  17. UNIX: A Tool for Information Management.

    ERIC Educational Resources Information Center

    Frey, Dean

    1989-01-01

    Describes UNIX, a computer operating system that supports multi-task and multi-user operations. Characteristics that make it especially suitable for library applications are discussed, including a hierarchical file structure and utilities for text processing, database activities, and bibliographic work. Sources of information on hardware…

  18. A general natural-language text processor for clinical radiology.

    PubMed Central

    Friedman, C; Alderson, P O; Austin, J H; Cimino, J J; Johnson, S B

    1994-01-01

    OBJECTIVE: Development of a general natural-language processor that identifies clinical information in narrative reports and maps that information into a structured representation containing clinical terms. DESIGN: The natural-language processor provides three phases of processing, all of which are driven by different knowledge sources. The first phase performs the parsing. It identifies the structure of the text through use of a grammar that defines semantic patterns and a target form. The second phase, regularization, standardizes the terms in the initial target structure via a compositional mapping of multi-word phrases. The third phase, encoding, maps the terms to a controlled vocabulary. Radiology is the test domain for the processor and the target structure is a formal model for representing clinical information in that domain. MEASUREMENTS: The impression sections of 230 radiology reports were encoded by the processor. Results of an automated query of the resultant database for the occurrences of four diseases were compared with the analysis of a panel of three physicians to determine recall and precision. RESULTS: Without training specific to the four diseases, recall and precision of the system (combined effect of the processor and query generator) were 70% and 87%. Training of the query component increased recall to 85% without changing precision. PMID:7719797

  19. Assessing Multi-Person and Person-Machine Distributed Decision Making Using an Extended Psychological Distancing Model

    DTIC Science & Technology

    1990-02-01

    human-to-human communication patterns during situation assessment and cooperative problem solving tasks. The research proposed for the second URRP year...Hardware development. In order to create an environment within which to study multi-channeled human-to-human communication, a multi-media observation...that machine-to-human communication can be used to increase cohesion between humans and intelligent machines and to promote human-machine team

  20. Multi-Lingual Deep Neural Networks for Language Recognition

    DTIC Science & Technology

    2016-08-08

    training configurations for the NIST 2011 and 2015 language recognition evaluations (LRE11 and LRE15). The best performing multi-lingual BN-DNN...very effective approach in the NIST 2015 language recognition evaluation (LRE15) open training condition [4, 5]. In this work we evaluate the impact...language are summarized in Table 2. Two language recognition tasks are used for evaluating the multi-lingual bottleneck systems. The first is the NIST

  1. Multi-segmental movement patterns reflect juggling complexity and skill level.

    PubMed

    Zago, Matteo; Pacifici, Ilaria; Lovecchio, Nicola; Galli, Manuela; Federolf, Peter Andreas; Sforza, Chiarella

    2017-08-01

    The juggling action of six expert and six intermediate jugglers was recorded with a motion capture system and decomposed into its fundamental components through Principal Component Analysis. The aim was to quantify trends in movement dimensionality, multi-segmental patterns and rhythmicity as a function of proficiency level and task complexity. Dimensionality was quantified in terms of Residual Variance, while the Relative Amplitude was introduced to account for individual differences in movement components. We observed that experience-related modifications in multi-segmental actions exist, such as the progressive reduction of error-correction movements, especially in the complex task condition. The systematic identification of motor patterns sensitive to the acquisition of specific experience could accelerate the learning process. Copyright © 2017 Elsevier B.V. All rights reserved.
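
    The dimensionality measure has a direct expression: run PCA on the concatenated marker trajectories and take the variance left unexplained by the first k movement components. The sketch below assumes a frames-by-coordinates posture matrix; it illustrates Residual Variance, not the authors' full processing pipeline.

        import numpy as np
        from sklearn.decomposition import PCA

        def residual_variance(posture_matrix, k):
            # posture_matrix: (n_frames, n_marker_coordinates) motion data
            pca = PCA().fit(posture_matrix)
            explained = pca.explained_variance_ratio_[:k].sum()
            return 1.0 - explained     # variance not captured by k components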

  2. Architecture design of the multi-functional wavelet-based ECG microprocessor for realtime detection of abnormal cardiac events.

    PubMed

    Cheng, Li-Fang; Chen, Tung-Chien; Chen, Liang-Gee

    2012-01-01

    Most abnormal cardiac events, such as myocardial ischemia, acute myocardial infarction (AMI) and fatal arrhythmia, can be diagnosed through continuous electrocardiogram (ECG) analysis. According to recent clinical research, early detection and alarming of such cardiac events can reduce the time delay to the hospital, and the clinical outcomes of these individuals can be greatly improved. It would therefore be helpful to have a long-term ECG monitoring system with the ability to identify abnormal cardiac events and provide real-time warnings to the user. The combination of a wireless body area sensor network (BASN) and an on-sensor ECG processor is a possible solution for this application. In this paper, we aim to design and implement a digital signal processor suitable for continuous ECG monitoring and alarming based on the continuous wavelet transform (CWT), using both a programmable RISC processor and application-specific integrated circuits (ASICs) for performance optimization. According to the implementation results, the power consumption of the proposed processor integrated with an ASIC for CWT computation is only 79.4 mW. Compared with the single-RISC processor, a power reduction of about 91.6% is achieved.
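
    To make the CWT stage concrete, the hedged sketch below computes a continuous wavelet transform of an ECG trace and thresholds a band of QRS-scale coefficients to find beat candidates. The wavelet choice (PyWavelets' Mexican hat), the scale band, and the threshold are all stand-ins, not the chip's actual algorithm.

        import numpy as np
        import pywt
        from scipy.signal import find_peaks

        FS = 360                               # sampling rate in Hz (assumed)
        SCALES = np.arange(1, 31)              # scales spanning QRS widths

        def detect_beats(ecg):
            coeffs, _ = pywt.cwt(ecg, SCALES, "mexh")     # Mexican-hat CWT
            qrs_band = np.abs(coeffs[5:15]).sum(axis=0)   # mid scales ~ QRS energy
            peaks, _ = find_peaks(qrs_band,
                                  height=3.0 * np.median(qrs_band),
                                  distance=int(0.25 * FS))  # refractory period
            return peaks                       # candidate beat locations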

  3. Artificial emotion triggered stochastic behavior transitions with motivational gain effects for multi-objective robot tasks

    NASA Astrophysics Data System (ADS)

    Dağlarli, Evren; Temeltaş, Hakan

    2007-04-01

    This paper presents an autonomous robot control architecture based on an artificial emotional system. A hidden Markov model is developed as the mathematical background for stochastic emotional and behavioral transitions. The motivation module of the architecture is treated as a behavioral gain-effect generator for achieving multi-objective robot tasks. According to the emotional and behavioral state transition probabilities, artificial emotions determine sequences of behaviors. The motivational gain effects of the proposed architecture can also be observed on the executing behaviors during simulation.
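
    The transition mechanism can be sketched in a few lines: emotional states evolve as a Markov chain, and the current emotional state draws the next behavior from its own distribution. The state sets and probability matrices below are illustrative placeholders, not the paper's trained model.

        import numpy as np

        rng = np.random.default_rng()
        emotions = ["calm", "curious", "afraid"]
        behaviors = ["explore", "approach", "avoid"]
        A = np.array([[0.8, 0.15, 0.05],   # emotion-to-emotion transitions
                      [0.2, 0.60, 0.20],
                      [0.1, 0.30, 0.60]])
        Bm = np.array([[0.5, 0.4, 0.1],    # behavior distribution per emotion
                       [0.3, 0.6, 0.1],
                       [0.1, 0.1, 0.8]])

        def run(steps, e=0):
            for _ in range(steps):
                b = rng.choice(3, p=Bm[e])     # current emotion selects a behavior
                yield emotions[e], behaviors[b]
                e = rng.choice(3, p=A[e])      # stochastic emotional transition

        for emotion, behavior in run(5):
            print(emotion, "->", behavior)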

  4. Metric Identification and Protocol Development for Characterizing DNAPL Source Zone Architecture and Associated Plume Response

    DTIC Science & Technology

    2013-09-01

    Figure M.4.1. Two-dimensional domains cropped out of three-dimensional numerically generated realizations; (a) 3D PCE-NAPL realizations generated by UTCHEM...Figure R.3.2. The absolute error vs. relative error scatter plots of pM and gM from the SGS data set-4 using multi-task manifold...error scatter plots of pM and gM from the TP/MC data set using multi-task manifold regression

  5. Emergency response nurse scheduling with medical support robot by multi-agent and fuzzy technique.

    PubMed

    Kono, Shinya; Kitamura, Akira

    2015-08-01

    In this paper, a new co-operative re-scheduling method is described for medical support tasks whose time of occurrence cannot be predicted, assuming a robot can co-operate with the nurse on medical activities. A Multi-Agent System (MAS) is used for the co-operative re-scheduling, in which a Fuzzy Contract Net (FCN) is applied to the robot's task assignment for the emergency tasks. Simulation results confirm that re-scheduling with the proposed method can maintain patient satisfaction and decrease the workload of the nurse.

  6. Noise Suppression Based on Multi-Model Compositions Using Multi-Pass Search with Multi-Label N-gram Models

    NASA Astrophysics Data System (ADS)

    Jitsuhiro, Takatoshi; Toriyama, Tomoji; Kogure, Kiyoshi

    We propose a noise suppression method based on multi-model compositions and multi-pass search. In real environments, input speech for speech recognition includes many kinds of noise signals. To obtain good recognized candidates, suppressing many kinds of noise signals at once and finding target speech is important. Before noise suppression, to find speech and noise label sequences, we introduce multi-pass search with acoustic models including many kinds of noise models and their compositions, their n-gram models, and their lexicon. Noise suppression is frame-synchronously performed using the multiple models selected by recognized label sequences with time alignments. We evaluated this method using the E-Nightingale task, which contains voice memoranda spoken by nurses during actual work at hospitals. The proposed method obtained higher performance than the conventional method.

  7. Algorithms for parallel flow solvers on message passing architectures

    NASA Technical Reports Server (NTRS)

    Vanderwijngaart, Rob F.

    1995-01-01

    The purpose of this project has been to identify and test suitable technologies for implementation of fluid flow solvers -- possibly coupled with structures and heat equation solvers -- on MIMD parallel computers. In the course of this investigation much attention has been paid to efficient domain decomposition strategies for ADI-type algorithms. Multi-partitioning derives its efficiency from the assignment of several blocks of grid points to each processor in the parallel computer. A coarse-grain parallelism is obtained, and a near-perfect load balance results. In uni-partitioning every processor receives responsibility for exactly one block of grid points instead of several. This necessitates fine-grain pipelined program execution in order to obtain a reasonable load balance. Although fine-grain parallelism is less desirable on many systems, especially high-latency networks of workstations, uni-partition methods are still in wide use in production codes for flow problems. Consequently, it remains important to achieve good efficiency with this technique, even though it has essentially been superseded by multi-partitioning for parallel ADI-type algorithms. Another reason for concentrating on improving the performance of pipeline methods is their applicability in other types of flow solver kernels with stronger implied data dependence. Analytical expressions can be derived for the size of the dynamic load imbalance incurred in traditional pipelines. From these, one can determine the optimal first-processor retardation that leads to the shortest total completion time for the pipeline process. Theoretical predictions of pipeline performance with and without optimization match experimental observations on the iPSC/860 very well. Analysis of pipeline performance also highlights the effect of careless grid partitioning in flow solvers that employ pipeline algorithms. If grid blocks at boundaries are not at least as large in the wall-normal direction as those immediately adjacent to them, then the first processor in the pipeline will receive a computational load that is less than that of subsequent processors, magnifying the pipeline slowdown effect. Extra compensation is needed for grid boundary effects, even if all grid blocks are equally sized.

  8. Portable parallel stochastic optimization for the design of aeropropulsion components

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Rhodes, G. S.

    1994-01-01

    This report presents the results of Phase 1 research to develop a methodology for performing large-scale Multi-disciplinary Stochastic Optimization (MSO) for the design of aerospace systems ranging from aeropropulsion components to complete aircraft configurations. The research recognizes that such design optimization problems are computationally expensive and require the use of either massively parallel or multiple-processor computers. The methodology also recognizes that many operational and performance parameters are uncertain, and that uncertainty must be considered explicitly to achieve optimum performance and cost. The objective of this Phase 1 research was to initiate the development of an MSO methodology that is portable to a wide variety of hardware platforms while achieving efficient, large-scale parallelism when multiple processors are available. The first effort in the project was a literature review of available computer hardware, as well as a review of portable, parallel programming environments. The second effort was to implement the MSO methodology for an example problem using the portable parallel programming language, Parallel Virtual Machine (PVM). The third and final effort was to demonstrate the example on a variety of computers, including a distributed-memory multiprocessor, a distributed-memory network of workstations, and a single-processor workstation. Results indicate that the MSO methodology can be applied effectively to large-scale aerospace design problems. Nearly perfect linear speedup was demonstrated for computation of optimization sensitivity coefficients on both a 128-node distributed-memory multiprocessor (the Intel iPSC/860) and a network of workstations (speedups of almost 19 times achieved for 20 workstations). Very high parallel efficiencies (75 percent for 31 processors and 60 percent for 50 processors) were also achieved for computation of aerodynamic influence coefficients on the Intel. Finally, the multi-level parallelization strategy that will be needed for large-scale MSO problems was demonstrated to be highly efficient. The same parallel code instructions were used on both platforms, demonstrating portability. There are many applications for which MSO can be applied, including NASA's High-Speed Civil Transport and advanced propulsion systems. The use of MSO will reduce design and development time and testing costs dramatically.

  9. PPM-based System for Guided Waves Communication Through Corrosion Resistant Multi-wire Cables

    NASA Astrophysics Data System (ADS)

    Trane, G.; Mijarez, R.; Guevara, R.; Pascacio, D.

    Novel wireless communication channels are a necessity in applications surrounded by harsh environments, for instance down-hole oil reservoirs. Traditional radio frequency (RF) communication schemes are not capable of transmitting signals through metal enclosures surrounded by corrosive gases and liquids. As an alternative to RF, a pulse position modulation (PPM) guided waves communication system has been developed and evaluated using a corrosion-resistant 4H18 multi-wire cable, commonly used to lower electronic gauges in down-hole oil applications, as the communication medium. The system consists of a transmitter and a receiver, each utilizing a PZT crystal for electrical/mechanical coupling attached to one end of the multi-wire cable. The modulator is based on a microcontroller, which transmits 60 kHz guided wave pulses, and the demodulator is based on a commercial digital signal processor (DSP) module that performs real-time DSP algorithms. Experimental results are presented, obtained using a 1 m corrosion-resistant 4H18 multi-wire cable of the kind commonly used with down-hole electronic gauges in the oil sector. Although there was significant dispersion and multiple mode excitation of the transmitted guided wave energy pulses, the results show that data rates on the order of 500 bits per second are readily achievable employing PPM and simple communications techniques.
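
    The modulation itself is simple enough to sketch: each symbol becomes a fixed-width burst of the 60 kHz carrier whose position within its frame encodes the value. The frame length, slot count, and sample rate below are illustrative assumptions, not the system's actual parameters.

        import numpy as np

        FS = 1_000_000         # sample rate in Hz (assumed)
        CARRIER = 60_000       # guided-wave pulse frequency from the abstract
        SLOTS = 4              # 4 positions -> 2 bits per frame (assumed)
        SLOT_SAMPS = 500       # samples per slot (assumed)

        def ppm_frame(symbol):
            frame = np.zeros(SLOTS * SLOT_SAMPS)
            t = np.arange(SLOT_SAMPS) / FS
            burst = np.sin(2 * np.pi * CARRIER * t)   # 60 kHz tone burst
            start = symbol * SLOT_SAMPS               # pulse position encodes symbol
            frame[start:start + SLOT_SAMPS] = burst
            return frame

        signal = np.concatenate([ppm_frame(s) for s in (0, 3, 1, 2)])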

  10. Multi-Kepler GPU vs. multi-Intel MIC for spin systems simulations

    NASA Astrophysics Data System (ADS)

    Bernaschi, M.; Bisson, M.; Salvadore, F.

    2014-10-01

    We present and compare the performance of two many-core architectures, the Nvidia Kepler and the Intel MIC, both in a single system and in cluster configuration, for the simulation of spin systems. As a benchmark we consider the time required to update a single spin of the 3D Heisenberg spin glass model using the over-relaxation algorithm. We also present data for a traditional high-end multi-core architecture, the Intel Sandy Bridge. The results show that although basically the same code can be used on the two Intel architectures, the performance of the Intel MIC changes dramatically depending on (apparently) minor details. Another issue is that to obtain reasonable scalability with the Intel Phi coprocessor (Phi is the coprocessor that implements the MIC architecture) in a cluster configuration, it is necessary to use the so-called offload mode, which reduces the performance of the single system. As to the GPU, the Kepler architecture offers a clear advantage with respect to the previous Fermi architecture while maintaining exactly the same source code. Scalability of the multi-GPU implementation remains very good when using the CPU as a communication co-processor for the GPU. All source codes are provided for inspection and for double-checking the results.

  11. Binaural release from masking with single- and multi-electrode stimulation in children with cochlear implants

    PubMed Central

    Todd, Ann E.; Goupell, Matthew J.; Litovsky, Ruth Y.

    2016-01-01

    Cochlear implants (CIs) provide children with access to speech information from a young age. Despite bilateral cochlear implantation becoming common, use of spatial cues in free field is smaller than in normal-hearing children. Clinically fit CIs are not synchronized across the ears; thus binaural experiments must utilize research processors that can control binaural cues with precision. Research to date has used single pairs of electrodes, which is insufficient for representing speech. Little is known about how children with bilateral CIs process binaural information with multi-electrode stimulation. Toward the goal of improving binaural unmasking of speech, this study evaluated binaural unmasking with multi- and single-electrode stimulation. Results showed that performance with multi-electrode stimulation was similar to the best performance with single-electrode stimulation. This was similar to the pattern of performance shown by normal-hearing adults when presented an acoustic CI simulation. Diotic and dichotic signal detection thresholds of the children with CIs were similar to those of normal-hearing children listening to a CI simulation. The magnitude of binaural unmasking was not related to whether the children with CIs had good interaural time difference sensitivity. Results support the potential for benefits from binaural hearing and speech unmasking in children with bilateral CIs. PMID:27475132

  12. Binaural release from masking with single- and multi-electrode stimulation in children with cochlear implants.

    PubMed

    Todd, Ann E; Goupell, Matthew J; Litovsky, Ruth Y

    2016-07-01

    Cochlear implants (CIs) provide children with access to speech information from a young age. Despite bilateral cochlear implantation becoming common, use of spatial cues in free field is smaller than in normal-hearing children. Clinically fit CIs are not synchronized across the ears; thus binaural experiments must utilize research processors that can control binaural cues with precision. Research to date has used single pairs of electrodes, which is insufficient for representing speech. Little is known about how children with bilateral CIs process binaural information with multi-electrode stimulation. Toward the goal of improving binaural unmasking of speech, this study evaluated binaural unmasking with multi- and single-electrode stimulation. Results showed that performance with multi-electrode stimulation was similar to the best performance with single-electrode stimulation. This was similar to the pattern of performance shown by normal-hearing adults when presented an acoustic CI simulation. Diotic and dichotic signal detection thresholds of the children with CIs were similar to those of normal-hearing children listening to a CI simulation. The magnitude of binaural unmasking was not related to whether the children with CIs had good interaural time difference sensitivity. Results support the potential for benefits from binaural hearing and speech unmasking in children with bilateral CIs.

  13. A master-slave parallel hybrid multi-objective evolutionary algorithm for groundwater remediation design under general hydrogeological conditions

    NASA Astrophysics Data System (ADS)

    Wu, J.; Yang, Y.; Luo, Q.; Wu, J.

    2012-12-01

    This study presents a new hybrid multi-objective evolutionary algorithm, the niched Pareto tabu search combined with a genetic algorithm (NPTSGA), whereby the global search ability of the niched Pareto tabu search (NPTS) is improved by the diversification of candidate solutions arising from the evolving nondominated sorting genetic algorithm II (NSGA-II) population. The NPTSGA, coupled with the commonly used groundwater flow and transport codes MODFLOW and MT3DMS, is developed for multi-objective optimal design of groundwater remediation systems. The proposed methodology is then applied to a large-scale field groundwater remediation system for cleanup of a large trichloroethylene (TCE) plume at the Massachusetts Military Reservation (MMR) in Cape Cod, Massachusetts. Furthermore, a master-slave (MS) parallelization scheme based on the Message Passing Interface (MPI) is incorporated into the NPTSGA to implement objective function evaluations in a distributed processor environment, which can greatly improve the efficiency of the NPTSGA in finding Pareto-optimal solutions for the real-world application. This study shows that the MS parallel NPTSGA, in comparison with the original NPTS and NSGA-II, can balance the tradeoff between diversity and optimality of solutions during the search process and is an efficient and effective tool for optimizing the multi-objective design of groundwater remediation systems under complicated hydrogeologic conditions.
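
    A hedged mpi4py sketch of the master-slave evaluation pattern described above: rank 0 distributes candidate remediation designs, every rank evaluates its design, and the objective values are gathered back. The real system invokes MODFLOW and MT3DMS for each evaluation; here a trivial function stands in.

        from mpi4py import MPI
        import numpy as np

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        def evaluate(design):
            # Stand-in for a MODFLOW/MT3DMS flow-and-transport simulation.
            return float(np.sum(design ** 2))

        if rank == 0:                           # master: generate candidate designs
            designs = [np.random.rand(5) for _ in range(size)]
        else:
            designs = None
        design = comm.scatter(designs, root=0)  # farm designs out to all ranks
        cost = evaluate(design)                 # workers evaluate in parallel
        costs = comm.gather(cost, root=0)       # collect objective values
        if rank == 0:
            print("objective values:", costs)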

  14. Parallel processing architecture for H.264 deblocking filter on multi-core platforms

    NASA Astrophysics Data System (ADS)

    Prasad, Durga P.; Sonachalam, Sekar; Kunchamwar, Mangesh K.; Gunupudi, Nageswara Rao

    2012-03-01

    Massively parallel computing (multi-core) chips offer outstanding new solutions that satisfy the increasing demand for high-resolution, high-quality video compression technologies such as H.264. Such solutions provide not only exceptional quality but also efficiency, low power, and low latency, previously unattainable in software-based designs. While custom hardware and Application Specific Integrated Circuit (ASIC) technologies may achieve low latency, low power, and real-time performance in some consumer devices, many applications require a flexible and scalable software-defined solution. The deblocking filter in an H.264 encoder/decoder poses difficult implementation challenges because of heavy data dependencies and the conditional nature of the computations. Deblocking filter implementations tend to be fixed and difficult to reconfigure for different needs. The ability to scale up for higher quality requirements, such as 10-bit pixel depth or a 4:2:2 chroma format, often reduces the throughput of a parallel architecture designed for a lower feature set. A scalable architecture for deblocking filtering, created with a massively parallel processor based solution, means that the same encoder or decoder can be deployed in a variety of applications, at different video resolutions, for different power requirements, and at higher bit depths and richer chroma sub-sampling patterns such as YUV 4:2:2 or 4:4:4. Low-power, software-defined encoders/decoders may be implemented using a massively parallel processor array, like that found in HyperX technology, with 100 or more cores and distributed memory. The large number of processor elements allows the silicon device to operate more efficiently than conventional DSP or CPU technology. This software programming model for massively parallel processors offers a flexible implementation and a power efficiency close to that of ASIC solutions. This work describes a scalable parallel architecture for an H.264-compliant deblocking filter for multi-core platforms such as HyperX technology. Parallel techniques such as parallel processing at the level of independent macroblocks, sub-blocks, and pixel rows are examined in this work. The deblocking architecture consists of a basic cell called the deblocking filter unit (DFU) and a dependent data buffer manager (DFM). Several DFU instances can be used, catering to different performance needs; the DFM serves the data required by the DFUs and also manages all the neighboring data required for their future processing. This approach achieves the scalability, flexibility, and performance excellence required in deblocking filters.

  15. Terascale Visualization: Multi-resolution Aspirin for Big-Data Headaches

    NASA Astrophysics Data System (ADS)

    Duchaineau, Mark

    2001-06-01

    Recent experience on the Accelerated Strategic Computing Initiative (ASCI) computers shows that computational physicists are successfully producing a prodigious collection of numbers on several thousand processors. But with this wealth of numbers comes an unprecedented difficulty in processing and moving them to provide useful insight and analysis. In this talk, a few simulations are highlighted where recent advancements in multiple-resolution mathematical representations and algorithms have provided some hope of seeing most of the physics of interest while keeping within the practical limits of the post-simulation storage and interactive data-exploration resources. A whole host of visualization research activities was spawned by the 1999 Gordon Bell Prize-winning computation of a shock-tube experiment showing Richtmyer-Meshkov turbulent instabilities. This includes efforts for the entire data pipeline from running simulation to interactive display: wavelet compression of field data, multi-resolution volume rendering and slice planes, out-of-core extraction and simplification of mixing-interface surfaces, shrink-wrapping to semi-regularize the surfaces, semi-structured surface wavelet compression, and view-dependent display-mesh optimization. More recently on the 12 TeraOps ASCI platform, initial results from a 5120-processor, billion-atom molecular dynamics simulation showed that 30-to-1 reductions in storage size can be achieved with no human-observable errors for the analysis required in simulations of supersonic crack propagation. This made it possible to store the 25 trillion bytes worth of simulation numbers in the available storage, which was under 1 trillion bytes. While multi-resolution methods and related systems are still in their infancy, for the largest-scale simulations there is often no other choice should the science require detailed exploration of the results.

  16. On-Line Temperature Estimation for Noisy Thermal Sensors Using a Smoothing Filter-Based Kalman Predictor

    PubMed Central

    Li, Zhi; Wei, Henglu; Zhou, Wei; Duan, Zhemin

    2018-01-01

    Dynamic thermal management (DTM) mechanisms utilize embedded thermal sensors to collect fine-grained temperature information for monitoring the real-time thermal behavior of multi-core processors. However, embedded thermal sensors are very susceptible to a variety of sources of noise, including environmental uncertainty and process variation. This causes discrepancies between actual temperatures and those observed by on-chip thermal sensors, which seriously affect the efficiency of DTM. In this paper, a smoothing filter-based Kalman prediction technique is proposed to accurately estimate the temperatures from noisy sensor readings. For the multi-sensor estimation scenario, the spatial correlations among different sensor locations are exploited. On this basis, a multi-sensor synergistic calibration algorithm (known as MSSCA) is proposed to improve the simultaneous prediction accuracy of multiple sensors. Moreover, an infrared imaging-based temperature measurement technique is also proposed to capture the thermal traces of an Advanced Micro Devices (AMD) quad-core processor in real time. The acquired real temperature data are used to evaluate our prediction performance. Simulation shows that the proposed synergistic calibration scheme can reduce the root-mean-square error (RMSE) by 1.2 °C and increase the signal-to-noise ratio (SNR) by 15.8 dB (with a very small average runtime overhead) compared with assuming the thermal sensor readings to be ideal. Additionally, the average false alarm rate (FAR) of the corrected sensor temperature readings can be reduced by 28.6%. These results clearly demonstrate that if our approach is used to perform temperature estimation, the response mechanisms of DTM can be triggered to adjust the voltages, frequencies, and cooling fan speeds at more appropriate times. PMID:29393862

  17. On-Line Temperature Estimation for Noisy Thermal Sensors Using a Smoothing Filter-Based Kalman Predictor.

    PubMed

    Li, Xin; Ou, Xingtao; Li, Zhi; Wei, Henglu; Zhou, Wei; Duan, Zhemin

    2018-02-02

    Dynamic thermal management (DTM) mechanisms utilize embedded thermal sensors to collect fine-grained temperature information for monitoring the real-time thermal behavior of multi-core processors. However, embedded thermal sensors are very susceptible to a variety of sources of noise, including environmental uncertainty and process variation. This causes discrepancies between actual temperatures and those observed by on-chip thermal sensors, which seriously affect the efficiency of DTM. In this paper, a smoothing filter-based Kalman prediction technique is proposed to accurately estimate the temperatures from noisy sensor readings. For the multi-sensor estimation scenario, the spatial correlations among different sensor locations are exploited. On this basis, a multi-sensor synergistic calibration algorithm (known as MSSCA) is proposed to improve the simultaneous prediction accuracy of multiple sensors. Moreover, an infrared imaging-based temperature measurement technique is also proposed to capture the thermal traces of an Advanced Micro Devices (AMD) quad-core processor in real time. The acquired real temperature data are used to evaluate our prediction performance. Simulation shows that the proposed synergistic calibration scheme can reduce the root-mean-square error (RMSE) by 1.2 °C and increase the signal-to-noise ratio (SNR) by 15.8 dB (with a very small average runtime overhead) compared with assuming the thermal sensor readings to be ideal. Additionally, the average false alarm rate (FAR) of the corrected sensor temperature readings can be reduced by 28.6%. These results clearly demonstrate that if our approach is used to perform temperature estimation, the response mechanisms of DTM can be triggered to adjust the voltages, frequencies, and cooling fan speeds at more appropriate times.
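
    As a toy illustration of the combination named in the title (not the paper's MSSCA algorithm), the sketch below pre-smooths noisy sensor readings with a short moving average and then tracks the temperature with a scalar Kalman filter under a random-walk model; the noise parameters and window length are assumptions.

        import numpy as np

        def smooth(readings, window=5):
            kernel = np.ones(window) / window
            return np.convolve(readings, kernel, mode="same")

        def kalman_track(readings, q=0.01, r=0.5):
            x, p = readings[0], 1.0          # state: estimated temperature
            estimates = []
            for z in smooth(readings):
                p = p + q                    # predict (random-walk model)
                k = p / (p + r)              # Kalman gain
                x = x + k * (z - x)          # correct with smoothed reading
                p = (1 - k) * p
                estimates.append(x)
            return np.array(estimates)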

  18. Scalable non-negative matrix tri-factorization.

    PubMed

    Čopar, Andrej; Žitnik, Marinka; Zupan, Blaž

    2017-01-01

    Matrix factorization is a well-established pattern discovery tool that has seen numerous applications in biomedical data analytics, such as gene expression co-clustering, patient stratification, and gene-disease association mining. Matrix factorization learns a latent data model that takes a data matrix and transforms it into a latent feature space enabling generalization, noise removal and feature discovery. However, factorization algorithms are numerically intensive, and hence there is a pressing challenge to scale current algorithms to work with large datasets. Our focus in this paper is matrix tri-factorization, a popular method that is not limited by the assumption of standard matrix factorization about data residing in one latent space. Matrix tri-factorization solves this by inferring a separate latent space for each dimension in a data matrix, and a latent mapping of interactions between the inferred spaces, making the approach particularly suitable for biomedical data mining. We developed a block-wise approach for latent factor learning in matrix tri-factorization. The approach partitions a data matrix into disjoint submatrices that are treated independently and fed into a parallel factorization system. An appealing property of the proposed approach is its mathematical equivalence with serial matrix tri-factorization. In a study on large biomedical datasets we show that our approach scales well on multi-processor and multi-GPU architectures. On a four-GPU system we demonstrate that our approach can be more than 100 times faster than its single-processor counterpart. A general approach for scaling non-negative matrix tri-factorization is proposed. The approach is especially useful for parallel matrix factorization implemented in a multi-GPU environment. We expect the new approach will be useful in emerging procedures for latent factor analysis, notably for data integration, where many large data matrices need to be collectively factorized.
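
    For reference, a serial NumPy sketch of non-negative tri-factorization X ≈ U S Vᵀ with standard multiplicative updates; the paper's contribution is a block-wise parallel scheme mathematically equivalent to a loop of this kind, which the sketch does not attempt to reproduce.

        import numpy as np

        def nmtf(X, k1, k2, n_iters=200, eps=1e-9):
            rng = np.random.default_rng(0)
            n, m = X.shape
            U = rng.random((n, k1))
            S = rng.random((k1, k2))
            V = rng.random((m, k2))
            for _ in range(n_iters):
                # Multiplicative updates for the Frobenius objective
                U *= (X @ V @ S.T) / (U @ S @ (V.T @ V) @ S.T + eps)
                V *= (X.T @ U @ S) / (V @ S.T @ (U.T @ U) @ S + eps)
                S *= (U.T @ X @ V) / ((U.T @ U) @ S @ (V.T @ V) + eps)
            return U, S, V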

  19. Video image processor on the Spacelab 2 Solar Optical Universal Polarimeter /SL2 SOUP/

    NASA Technical Reports Server (NTRS)

    Lindgren, R. W.; Tarbell, T. D.

    1981-01-01

    The SOUP instrument is designed to obtain diffraction-limited digital images of the sun with high photometric accuracy. The Video Processor originated from the requirement to provide onboard real-time image processing, both to reduce the telemetry rate and to provide meaningful video displays of scientific data to the payload crew. This original concept has evolved into a versatile digital processing system with a multitude of other uses in the SOUP program. The central element in the Video Processor design is a 16-bit central processing unit based on 2900 family bipolar bit-slice devices. All arithmetic, logical and I/O operations are under control of microprograms, stored in programmable read-only memory and initiated by commands from the LSI-11. Several functions of the Video Processor are described, including interface to the High Rate Multiplexer downlink, cosmetic and scientific data processing, scan conversion for crew displays, focus and exposure testing, and use as ground support equipment.

  20. Handheld microwave bomb-detecting imaging system

    NASA Astrophysics Data System (ADS)

    Gorwara, Ashok; Molchanov, Pavlo

    2017-05-01

    The proposed novel imaging technique will provide all-weather, high-resolution imaging and recognition capability for RF/microwave signals with good penetration through highly scattering media: fog, snow, dust, smoke, even foliage, camouflage, walls and ground. Image resolution in the proposed imaging system is not limited by diffraction and is determined by the processor and sampling frequency. The proposed imaging system can simultaneously cover a wide field of view, detect multiple targets, and can be multi-frequency and multi-function. The directional antennas in the imaging system can be positioned close together and installed in a cell-phone-sized handheld device, on a small aircraft, or distributed around a protected border or object. The non-scanning monopulse system allows a dramatic decrease in transmitted power and at the same time provides increased imaging range by integrating 2-3 orders of magnitude more signals than regular scanning imaging systems.

  1. Current Saturation Avoidance with Real-Time Control using DPCS

    NASA Astrophysics Data System (ADS)

    Ferrara, M.; Hutchinson, I.; Wolfe, S.; Stillerman, J.; Fredian, T.

    2008-11-01

    Tokamak ohmic-transformer and equilibrium-field coils need to be able to operate near their maximum current capabilities. However, if they reach their upper limit during high-performance discharges or in the presence of a strong off-normal event, shape control is compromised, and instability or even plasma disruption can result. On Alcator C-Mod we designed and tested an anti-saturation routine which detects the impending saturation of OH and EF currents and interpolates to a neighboring safe equilibrium in real time. The routine was implemented with a multi-processor, multi-time-scale control scheme based on a master process and multiple asynchronous slave processes. The scheme is general and can be used for any computationally intensive algorithm. USDoE award DE-FC02-99ER545512.

  2. Best kept secrets ... Source Data Systems, Inc. (SDS).

    PubMed

    Andrew, W F

    1991-03-01

    The SDS/MEDNET system is a cost-effective option for small- to medium-size hospitals (up to 400 beds). The parameter-driven system lets users control operations with only occasional SDS assistance. A full application set, available for modular selection to reduce upfront costs while facilitating steady growth and protecting client investment, is adaptable to multi-facility environments. The industry-standard, Intel-based multi-user processors, network communications and protocols assure high efficiency, low-cost solutions independent of any one hardware vendor. Sustained growth in both client base and product offerings point to a high level of responsiveness and healthcare industry commitment. Corporate emphasis on user involvement and open systems integration assures clients of leading-edge capabilities. SDS/MEDNET will be a strong contender in selected marketing environments.

  3. FAST: A multi-processed environment for visualization of computational fluid dynamics

    NASA Technical Reports Server (NTRS)

    Bancroft, Gordon V.; Merritt, Fergus J.; Plessel, Todd C.; Kelaita, Paul G.; Mccabe, R. Kevin

    1991-01-01

    Three-dimensional, unsteady, multi-zoned fluid dynamics simulations over full scale aircraft are typical of the problems being investigated at NASA Ames' Numerical Aerodynamic Simulation (NAS) facility on CRAY2 and CRAY-YMP supercomputers. With multiple processor workstations available in the 10-30 Mflop range, we feel that these new developments in scientific computing warrant a new approach to the design and implementation of analysis tools. These larger, more complex problems create a need for new visualization techniques not possible with the existing software or systems available as of this writing. The visualization techniques will change as the supercomputing environment, and hence the scientific methods employed, evolves even further. The Flow Analysis Software Toolkit (FAST), an implementation of a software system for fluid mechanics analysis, is discussed.

  4. Servicing a globally broadcast interrupt signal in a multi-threaded computer

    DOEpatents

    Attinella, John E.; Davis, Kristan D.; Musselman, Roy G.; Satterfield, David L.

    2015-12-29

    Methods, apparatuses, and computer program products for servicing a globally broadcast interrupt signal in a multi-threaded computer comprising a plurality of processor threads. Embodiments include an interrupt controller indicating in a plurality of local interrupt status locations that a globally broadcast interrupt signal has been received by the interrupt controller. Embodiments also include a thread determining that a local interrupt status location corresponding to the thread indicates that the globally broadcast interrupt signal has been received by the interrupt controller. Embodiments also include the thread processing one or more entries in a global interrupt status bit queue based on whether global interrupt status bits associated with the globally broadcast interrupt signal are locked. Each entry in the global interrupt status bit queue corresponds to a queued global interrupt.

  5. High Performance Power Module for Hall Effect Thrusters

    NASA Technical Reports Server (NTRS)

    Pinero, Luis R.; Peterson, Peter Y.; Bowers, Glen E.

    2002-01-01

    Previous efforts to develop power electronics for Hall thruster systems have targeted the 1 to 5 kW power range and an output voltage of approximately 300 V. New Hall thrusters are being developed for higher power, higher specific impulse, and multi-mode operation. These thrusters require up to 50 kW of power and a discharge voltage in excess of 600 V. Modular power supplies can process more power with higher efficiency at the expense of complexity. A 1 kW discharge power module was designed, built and integrated with a Hall thruster. The breadboard module has a power conversion efficiency in excess of 96 percent and weighs only 0.765 kg. This module will be used to develop kW-class, multi-kW, and high-voltage power processors.

  6. Collaborative filtering on a family of biological targets.

    PubMed

    Erhan, Dumitru; L'heureux, Pierre-Jean; Yue, Shi Yi; Bengio, Yoshua

    2006-01-01

    Building a QSAR model of a new biological target for which few screening data are available is a statistical challenge. However, the new target may be part of a bigger family, for which we have more screening data. Collaborative filtering or, more generally, multi-task learning, is a machine learning approach that improves the generalization performance of an algorithm by using information from related tasks as an inductive bias. We use collaborative filtering techniques for building predictive models that link multiple targets to multiple examples. The more commonalities between the targets, the better the multi-target model that can be built. We show an example of a multi-target neural network that can use family information to produce a predictive model of an undersampled target. We also evaluate JRank, a kernel-based method designed for collaborative filtering. We show the performance of these methods on compound prioritization for an HTS campaign and examine the underlying shared representation between targets. JRank outperformed the neural network in both the single- and multi-target models.

  7. SCORPIO: A Scalable Two-Phase Parallel I/O Library With Application To A Large Scale Subsurface Simulator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sreepathi, Sarat; Sripathi, Vamsi; Mills, Richard T

    2013-01-01

    Inefficient parallel I/O is known to be a major bottleneck among scientific applications employed on supercomputers as the number of processor cores grows into the thousands. Our prior experience indicated that parallel I/O libraries such as HDF5 that rely on MPI-IO do not scale well beyond 10K processor cores, especially on parallel file systems (like Lustre) with a single point of resource contention. Our previous optimization efforts for a massively parallel multi-phase and multi-component subsurface simulator (PFLOTRAN) led to a two-phase I/O approach at the application level, where a set of designated processes participates in the I/O process by splitting the I/O operation into a communication phase and a disk I/O phase. The designated I/O processes are created by splitting the MPI global communicator into multiple sub-communicators. The root process in each sub-communicator is responsible for performing the I/O operations for the entire group and then distributing the data to the rest of the group. This approach resulted in over 25X speedup in HDF I/O read performance and 3X speedup in write performance for PFLOTRAN at over 100K processor cores on the ORNL Jaguar supercomputer. This research describes the design and development of a general-purpose parallel I/O library, SCORPIO (SCalable block-ORiented Parallel I/O), that incorporates our optimized two-phase I/O approach. The library provides a simplified higher-level abstraction to the user, sitting atop existing parallel I/O libraries (such as HDF5), and implements optimized I/O access patterns that can scale to larger numbers of processors. Performance results with standard benchmark problems and PFLOTRAN indicate that our library is able to maintain the same speedups as before, with the added flexibility of being applicable to a wider range of I/O-intensive applications.
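
    The communicator-splitting pattern is easy to show in miniature. The hedged mpi4py sketch below forms I/O groups, gathers each group's data at its root (communication phase), and lets only the roots touch the disk (I/O phase); the group size and file layout are illustrative, and SCORPIO's real API differs.

        from mpi4py import MPI
        import numpy as np

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        GROUP_SIZE = 4                              # assumed I/O group size
        sub = comm.Split(color=rank // GROUP_SIZE, key=rank)

        local = np.full(8, rank, dtype=np.int32)    # this rank's data block
        blocks = sub.gather(local, root=0)          # phase 1: communication
        if sub.Get_rank() == 0:                     # phase 2: disk I/O by group root
            data = np.concatenate(blocks)
            np.save(f"chunk_{rank // GROUP_SIZE}.npy", data)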

  8. High speed real-time wavefront processing system for a solid-state laser system

    NASA Astrophysics Data System (ADS)

    Liu, Yuan; Yang, Ping; Chen, Shanqiu; Ma, Lifang; Xu, Bing

    2008-03-01

    A high speed real-time wavefront processing system for a solid-state laser beam cleanup system has been built. The system consists of a Core 2 Industrial PC (IPC) running the Linux and real-time Linux (RT-Linux) operating systems (OS), a PCI image grabber, and a D/A card. The phase aberrations of the output beam from solid-state lasers often vary rapidly with intracavity thermal effects and environmental influences. To compensate for these phase aberrations successfully, a high speed real-time wavefront processing system is presented. Compared to former systems, this system improves processing speed considerably. In the new system, the acquisition of image data, the output of control voltage data, and the implementation of the reconstructor control algorithm are treated as real-time tasks in kernel space, while the display of wavefront information and man-machine interaction are treated as non-real-time tasks in user space. Parallel processing of the real-time tasks in Symmetric Multi-Processor (SMP) mode is the main strategy for improving speed. In this paper, the performance and efficiency of the wavefront processing system are analyzed. Open-loop experimental results show that the sampling frequency of the system is up to 3300 Hz and that it can effectively correct phase aberrations from solid-state lasers.
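
    A minimal sketch of the reconstructor control step such a system iterates at a few kilohertz: measured wavefront slopes are multiplied by a precomputed reconstructor matrix and integrated onto the corrector voltages. The matrix, gain, and dimensions are assumptions for illustration; the real system runs this as an RT-Linux kernel task.

        import numpy as np

        rng = np.random.default_rng(1)
        n_slopes, n_actuators = 128, 37      # assumed sensor/corrector geometry
        R = 0.01 * rng.normal(size=(n_actuators, n_slopes))  # reconstructor
        gain = 0.3                           # integrator gain (assumed)
        v = np.zeros(n_actuators)            # D/A output voltages

        def control_step(slopes: np.ndarray) -> np.ndarray:
            """One closed-loop iteration: slopes in, updated voltages out."""
            global v
            v = v - gain * (R @ slopes)      # pure-integrator controller
            return v

        for frame in range(5):               # stand-in for the 3300 Hz loop
            slopes = rng.normal(size=n_slopes)  # would come from the grabber
            control_step(slopes)
        print("voltage vector norm:", np.linalg.norm(v))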

  9. Hierarchical Learning of Tree Classifiers for Large-Scale Plant Species Identification.

    PubMed

    Fan, Jianping; Zhou, Ning; Peng, Jinye; Gao, Ling

    2015-11-01

    In this paper, a hierarchical multi-task structural learning algorithm is developed to support large-scale plant species identification, where a visual tree is constructed to organize large numbers of plant species in a coarse-to-fine fashion and to determine the inter-related learning tasks automatically. Each parent node on the visual tree contains a set of sibling coarse-grained categories of plant species or sibling fine-grained plant species, and a multi-task structural learning algorithm trains their inter-related classifiers jointly to enhance their discrimination power. The inter-level relationship constraint, e.g., that a plant image must first be assigned correctly to a parent node (high-level non-leaf node) before it can be assigned to the most relevant child node (low-level non-leaf node or leaf node) on the visual tree, is formally defined and leveraged to learn more discriminative tree classifiers over the visual tree. Our experimental results demonstrate the effectiveness of our hierarchical multi-task structural learning algorithm in training more discriminative tree classifiers for large-scale plant species identification.
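
    A minimal sketch of the coarse-to-fine decision rule (not the authors' joint multi-task trainer): a root classifier assigns an image to a parent node first, and only that parent's child classifier is then consulted, which is exactly the inter-level constraint described above. The synthetic data and the choice of logistic-regression node classifiers are illustrative assumptions.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(2)
        groups = {0: [0, 1], 1: [2, 3]}        # parent node -> child species
        X = rng.normal(size=(400, 10))
        species = rng.integers(0, 4, size=400)
        X += 2.0 * (np.eye(4)[species] @ rng.normal(size=(4, 10)))  # separable
        group = species // 2                   # coarse label per sample

        root_clf = LogisticRegression(max_iter=1000).fit(X, group)
        leaf_clf = {g: LogisticRegression(max_iter=1000)
                       .fit(X[group == g], species[group == g])
                    for g in groups}

        def predict(x):
            g = root_clf.predict(x.reshape(1, -1))[0]        # parent first...
            return leaf_clf[g].predict(x.reshape(1, -1))[0]  # ...then child

        print("example prediction:", predict(X[0]), "true:", species[0])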

  10. Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms.

    PubMed

    Bromuri, Stefano; Zufferey, Damien; Hennebert, Jean; Schumacher, Michael

    2014-10-01

    This research is motivated by the issue of classifying illnesses of chronically ill patients for decision support in clinical settings. Our main objective is to propose multi-label classification of multivariate time series contained in medical records of chronically ill patients, by means of quantization methods, such as bag of words (BoW), and multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques to state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates for studying multi-label medical time series. We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers on two real-world datasets. The Portavita dataset contains 525 diabetes type 2 (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. The MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease and kidney disease. The algorithms are evaluated using multi-label evaluation metrics such as Hamming loss, one-error, coverage, ranking loss, and average precision. Non-linear dimensionality reduction approaches behave well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features has a positive impact on the performance of the algorithm with respect to pure binary relevance approaches. The evaluation highlights the feasibility of representing medical health records using the BoW for multi-label classification tasks. The study also highlights that dimensionality reduction algorithms based on kernel methods, locality preserving projections, or both are good candidates for multi-label classification tasks in medical time series with many missing values and high label density.
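
    A minimal sketch of the BoW-then-multi-label pipeline on synthetic data: fixed-length windows of each time series are quantized against a KMeans codebook, each record becomes a histogram of codewords, and a binary-relevance classifier is fitted per label. Window size, codebook size, and labels are assumptions for illustration.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.linear_model import LogisticRegression
        from sklearn.multiclass import OneVsRestClassifier

        rng = np.random.default_rng(3)
        n_patients, series_len, win, k = 100, 120, 10, 16
        series = rng.normal(size=(n_patients, series_len))
        Y = rng.integers(0, 2, size=(n_patients, 3))  # 3 co-morbidity labels

        windows = series.reshape(n_patients, -1, win)  # non-overlapping windows
        codebook = KMeans(n_clusters=k, n_init=10, random_state=0).fit(
            windows.reshape(-1, win))

        def bow(record_windows):
            words = codebook.predict(record_windows)   # quantize each window
            return np.bincount(words, minlength=k) / len(words)

        X = np.array([bow(w) for w in windows])        # one histogram per patient
        clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
        print("train subset accuracy:", clf.score(X, Y))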

  11. A low power biomedical signal processor ASIC based on hardware software codesign.

    PubMed

    Nie, Z D; Wang, L; Chen, W G; Zhang, T; Zhang, Y T

    2009-01-01

    A low power biomedical digital signal processor ASIC based on a hardware/software codesign methodology is presented in this paper. The codesign methodology was used to achieve higher system performance and design flexibility. The hardware implementation included a low power 32-bit RISC CPU (ARM7TDMI), a low power AHB-compatible bus, and a scalable digital co-processor optimized for low power Fast Fourier Transform (FFT) calculations. The co-processor can be scaled for 8-point, 16-point, and 32-point FFTs, taking approximately 50, 100, and 150 clock cycles, respectively. The complete design was intensively simulated using the ARM DSM model and emulated on the ARM Versatile platform before being committed to silicon. The multi-million-gate ASIC was fabricated using SMIC 0.18 microm mixed-signal CMOS 1P6M technology. The die measures 5,000 microm x 2,350 microm. The power consumption was approximately 3.6 mW at a 1.8 V supply and 1 MHz clock rate. The power consumption for FFT calculations was less than 1.5% of that of a conventional embedded software-based solution.
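
    For reference, a software model of the scalable radix-2 FFT datapath: the co-processor's 8-, 16-, and 32-point configurations are all instances of the same butterfly structure. This is an illustrative Python stand-in, not the ASIC's implementation.

        import numpy as np

        def fft_radix2(x: np.ndarray) -> np.ndarray:
            """Iterative radix-2 DIT FFT; len(x) must be a power of two."""
            n = len(x)
            levels = n.bit_length() - 1
            # Bit-reversal permutation orders inputs for in-place butterflies.
            idx = np.array([int(format(i, f"0{levels}b")[::-1], 2)
                            for i in range(n)])
            a = x[idx].astype(complex)
            size = 2
            while size <= n:
                w = np.exp(-2j * np.pi * np.arange(size // 2) / size)  # twiddles
                for start in range(0, n, size):
                    half = size // 2
                    even = a[start:start + half].copy()
                    odd = a[start + half:start + size] * w
                    a[start:start + half] = even + odd
                    a[start + half:start + size] = even - odd
                size *= 2
            return a

        for points in (8, 16, 32):      # the three hardware configurations
            x = np.random.default_rng(0).normal(size=points)
            assert np.allclose(fft_radix2(x), np.fft.fft(x))
        print("radix-2 FFT matches numpy for 8/16/32 points")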

  12. Landsat image registration for agricultural applications

    NASA Technical Reports Server (NTRS)

    Wolfe, R. H., Jr.; Juday, R. D.; Wacker, A. G.; Kaneko, T.

    1982-01-01

    An image registration system has been developed at the NASA Johnson Space Center (JSC) to spatially align multi-temporal Landsat acquisitions for use in agriculture and forestry research. Working in conjunction with the Master Data Processor (MDP) at the Goddard Space Flight Center, it functionally replaces the long-standing LACIE Registration Processor as JSC's data supplier. The system represents an expansion of the techniques developed for the MDP and LACIE Registration Processor, and it utilizes the experience gained in an IBM/JSC effort evaluating the performance of the latter. These techniques are discussed in detail. Several tests were developed to evaluate the registration performance of the system. The results indicate that 1/15-pixel accuracy (about 4 m for Landsat MSS) is achievable in ideal circumstances, that sub-pixel accuracy (often 0.2 pixel or better) was attained on a representative set of U.S. acquisitions, and that a success rate commensurate with the LACIE Registration Processor was realized. The system has been employed in a production mode on U.S. and foreign data, and performance similar to the earlier tests has been noted.
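
    As an illustration of image registration in general (the abstract does not specify the JSC system's algorithm, so phase correlation here is a stand-in technique), the following sketch recovers the translation between two acquisitions; sub-pixel refinement would interpolate around the correlation peak.

        import numpy as np

        rng = np.random.default_rng(4)
        ref = rng.normal(size=(64, 64))
        shift = (3, 5)                               # known test offset
        moved = np.roll(ref, shift, axis=(0, 1))

        F1, F2 = np.fft.fft2(ref), np.fft.fft2(moved)
        cross_power = F1 * np.conj(F2)
        cross_power /= np.abs(cross_power) + 1e-12   # keep phase only
        corr = np.fft.ifft2(cross_power).real

        peak = np.unravel_index(np.argmax(corr), corr.shape)
        # Peaks past the midpoint wrap around to negative shifts.
        est = tuple(-(p if p < s // 2 else p - s)
                    for p, s in zip(peak, corr.shape))
        print("estimated shift:", est, "true:", shift)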

  13. Orthorectification by Using Gpgpu Method

    NASA Astrophysics Data System (ADS)

    Sahin, H.; Kulur, S.

    2012-07-01

    Thanks to the nature of graphics processing, newly released products offer highly parallel processing units with high memory bandwidth and computational power exceeding a teraflop per second. Modern GPUs are not only powerful graphics engines but also highly parallel programmable processors, with far faster computation and memory bandwidth than central processing units (CPUs). Data-parallel computation can be described briefly as mapping data elements to parallel processing threads. The rapid development of GPU programmability and capability has attracted researchers dealing with complex problems that demand heavy computation, giving rise to the concepts of "General Purpose Computation on Graphics Processing Units (GPGPU)" and "stream processing". Graphics processors are powerful yet inexpensive hardware, and have therefore become an alternative to conventional processors. The main obstacle is that GPUs use programming models unlike established methods: efficient GPU programming requires re-coding an algorithm around the limitations and structure of the graphics hardware, and these many-core devices cannot be programmed with traditional event-driven techniques. GPUs are especially effective when the same computing steps are repeated over many data elements and high accuracy is needed, yielding faster and more accurate computation; a CPU, which follows flow control and performs one computation at a time, is slower for such workloads. This study covers how the general-purpose parallel programming and computational power of GPUs can be used in photogrammetric applications, especially direct georeferencing. The direct georeferencing algorithm was coded using the GPGPU method and the CUDA (Compute Unified Device Architecture) programming language, and the results were compared with a traditional CPU implementation. In a second application, projective rectification was coded using the GPGPU method and CUDA, and sample images of various sizes were evaluated against the program's results. The GPGPU method is particularly suited to repeating the same computation over highly dense data, and thus finds the solution quickly.
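
    A minimal sketch of projective rectification as a data-parallel kernel, one thread per output pixel, written with numba.cuda as a Python stand-in for the paper's CUDA implementation (an assumption; running it requires the numba package and a CUDA-capable GPU). The homography values are made up for illustration.

        import numpy as np
        from numba import cuda

        @cuda.jit
        def rectify(src, dst, Hinv):
            y, x = cuda.grid(2)                   # one thread per output pixel
            if y < dst.shape[0] and x < dst.shape[1]:
                # Inverse homography maps output coords back into the source.
                w = Hinv[2, 0] * x + Hinv[2, 1] * y + Hinv[2, 2]
                u = (Hinv[0, 0] * x + Hinv[0, 1] * y + Hinv[0, 2]) / w
                v = (Hinv[1, 0] * x + Hinv[1, 1] * y + Hinv[1, 2]) / w
                ui, vi = int(u + 0.5), int(v + 0.5)   # nearest neighbour
                if 0 <= vi < src.shape[0] and 0 <= ui < src.shape[1]:
                    dst[y, x] = src[vi, ui]

        src = np.random.default_rng(5).random((512, 512)).astype(np.float32)
        dst = np.zeros_like(src)
        Hinv = np.array([[1.0, 0.02, -5.0],       # assumed inverse homography
                         [0.01, 1.0, -3.0],
                         [1e-5, 1e-5, 1.0]], dtype=np.float32)

        threads = (16, 16)
        blocks = ((dst.shape[0] + 15) // 16, (dst.shape[1] + 15) // 16)
        rectify[blocks, threads](src, dst, Hinv)  # numba handles the copies
        print("rectified image mean:", dst.mean())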

  14. Balance training with multi-task exercises improves fall-related self-efficacy, gait, balance performance and physical function in older adults with osteoporosis: a randomized controlled trial.

    PubMed

    Halvarsson, Alexandra; Franzén, Erika; Ståhle, Agneta

    2015-04-01

    To evaluate the effects of a balance training program including dual- and multi-task exercises on fall-related self-efficacy, fear of falling, gait and balance performance, and physical function in older adults with osteoporosis at increased risk of falling, and to evaluate whether additional physical activity would further improve the effects. Randomized controlled trial with three groups: two intervention groups (Training, or Training+Physical activity) and one Control group, with a 12-week follow-up. Stockholm County, Sweden. Ninety-six older adults, aged 66-87, with verified osteoporosis. A specific and progressive balance training program including dual- and multi-task exercises, three times/week for 12 weeks, plus physical activity for 30 minutes, three times/week. Fall-related self-efficacy (Falls Efficacy Scale-International), fear of falling (single-item question: 'In general, are you afraid of falling?'), gait speed with and without a cognitive dual-task at preferred pace and fast walking (GAITRite®), balance performance tests (one-leg stance and modified figure of eight), and physical function (Late-Life Function and Disability Instrument). Both intervention groups significantly improved their fall-related self-efficacy compared to the controls (p ≤ 0.034, 4 points) and improved their balance performance. Significant differences over time and between groups in favour of the intervention groups were found for walking speed with a dual-task (p=0.003), at fast walking speed (p=0.008), and for advanced lower extremity physical function (p=0.034). This balance training program, including dual- and multi-task exercises, improves fall-related self-efficacy, gait speed, balance performance, and physical function in older adults with osteoporosis.

  15. Measurements of the LHCb software stack on the ARM architecture

    NASA Astrophysics Data System (ADS)

    Vijay Kartik, S.; Couturier, Ben; Clemencic, Marco; Neufeld, Niko

    2014-06-01

    The ARM architecture is a power-efficient design used in most processors in mobile devices around the world today, since it provides reasonable compute performance per watt. The current LHCb software stack is designed (and thus expected) to build and run on machines with the x86/x86_64 architecture. This paper outlines the process of measuring the performance of the LHCb software stack on the ARM architecture - specifically, the ARMv7 architecture on Cortex-A9 processors from NVIDIA and on full-fledged ARM servers with chipsets from Calxeda - and makes comparisons with the performance on x86_64 architectures on the Intel Xeon L5520/X5650 and AMD Opteron 6272. The paper emphasises per-core performance with respect to the power drawn by the compute nodes for the given performance - this ensures a fair real-world comparison with much more 'powerful' Intel/AMD processors. The comparisons of these real workloads in the context of LHCb are also complemented with the standard synthetic benchmarks HEPSPEC and Coremark. The pitfalls and solutions for the non-trivial task of porting the source code to build for the ARMv7 instruction set are presented. The specific changes in the build process needed for ARM-specific portions of the software stack are described, to serve as pointers for further attempts taken up by other groups in this direction. Cases where architecture-specific tweaks at the assembler level (both in ROOT and the LHCb software stack) were needed for a successful compile are detailed - these cases are good indicators of where and how the software stack as well as the build system can be made more portable and multi-arch friendly. The experience gained from the tasks described in this paper is intended to i) assist in making an informed choice about ARM-based server solutions as a feasible low-power alternative to the current compute nodes, and ii) help revisit the software design and build system for portability and generic improvements.

  16. Analysis and simulation tools for solar array power systems

    NASA Astrophysics Data System (ADS)

    Pongratananukul, Nattorn

    This dissertation presents simulation tools developed specifically for the design of solar array power systems. Contributions are made in several aspects of the system design phases, including solar source modeling, system simulation, and controller verification. A tool to automate the study of solar array configurations using general-purpose circuit simulators has been developed based on the modeling of individual solar cells. The hierarchical structure of solar cell elements, including semiconductor properties, allows simulation of electrical properties as well as evaluation of the impact of environmental conditions. A second tool provides a co-simulation platform with the capability to verify the performance of an actual digital controller implemented in programmable hardware such as a DSP, while the entire solar array, including the DC-DC power converter, is modeled in software running on a computer. This "virtual plant" allows developing and debugging code for the digital controller, as well as improving the control algorithm. One important task in solar arrays is to track the maximum power point of the array in order to maximize the delivered power. Digital controllers implemented with programmable processors are particularly attractive for this task because sophisticated tracking algorithms can be implemented and revised when needed to optimize their performance. The proposed co-simulation tools are thus very valuable in developing and optimizing the control algorithm before the system is built. Examples that demonstrate the effectiveness of the proposed methodologies are presented. The proposed simulation tools are also valuable in the design of multi-channel arrays. In the specific system that we designed and tested, the control algorithm is implemented on a single digital signal processor, and in each channel the maximum power point is tracked individually. In the prototype we built, off-the-shelf commercial DC-DC converters were utilized. Finally, the overall performance of the entire system was evaluated using solar array simulators capable of reproducing various I-V characteristics, as well as an electronic load. Experimental results are presented.
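
    A minimal sketch of the perturb-and-observe loop that is the common baseline for per-channel maximum power point tracking (the dissertation does not name its exact algorithm, so P&O here is an assumption), with a toy P-V curve standing in for the panel and converter.

        import numpy as np

        def pv_power(v):
            """Toy photovoltaic P-V curve with a single maximum near 17 V."""
            i = 5.0 * (1.0 - np.exp((v - 21.0) / 1.5))  # simplified I-V shape
            return max(v * i, 0.0)

        v, step = 12.0, 0.2        # operating voltage and perturbation size
        p_prev = pv_power(v)
        direction = +1.0

        for _ in range(200):       # each pass = one controller sampling period
            v += direction * step  # perturb
            p = pv_power(v)        # observe
            if p < p_prev:         # moved downhill: reverse the perturbation
                direction = -direction
            p_prev = p

        print(f"settled near v = {v:.2f} V, p = {p_prev:.2f} W")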

  17. Overview of NASA Multi-dimensional Stirling Convertor Code Development and Validation Effort

    NASA Technical Reports Server (NTRS)

    Tew, Roy C.; Cairelli, James E.; Ibrahim, Mounir B.; Simon, Terrence W.; Gedeon, David

    2002-01-01

    A NASA grant has been awarded to Cleveland State University (CSU) to develop a multi-dimensional (multi-D) Stirling computer code with the goals of improving loss predictions and identifying component areas for improvements. The University of Minnesota (UMN) and Gedeon Associates are teamed with CSU. Development of test rigs at UMN and CSU and validation of the code against test data are part of the effort. The one-dimensional (1-D) Stirling codes used for design and performance prediction do not rigorously model regions of the working space where abrupt changes in flow area occur (such as manifolds and other transitions between components). Certain hardware experiences have demonstrated large performance gains by varying manifold and heat exchanger designs to improve flow distributions in the heat exchangers; 1-D codes were not able to predict these performance gains. An accurate multi-D code should improve understanding of the effects of area changes along the main flow axis, of the sensitivity of performance to slight changes in internal geometry, and, in general, of the various internal thermodynamic losses. The commercial CFD-ACE code has been chosen for development of the multi-D code. This 2-D/3-D code has highly developed pre- and post-processors and moving boundary capability. Preliminary attempts at validating CFD-ACE models of the MIT gas spring and "two space" test rigs were encouraging. Also, CSU's simulations of the UMN oscillating-flow rig compare well with flow visualization results from UMN. A complementary Department of Energy (DOE) Regenerator Research effort is aiding the development of regenerator matrix models that will be used in the multi-D Stirling code. This paper reports on the progress and challenges of this effort.

  18. Deployable Fuel Cell Power Generator - Multi-Fuel Processor

    DTIC Science & Technology

    2009-02-01

    and the system operating pressure, while the separation efficiency depends on the evaporator design. Desulfurizer – A flow-through gas-solid or gas ...meeting the Executive Order (EO) 13423 and the Energy Policy Act of 2005 to improve energy efficiency and reduce greenhouse gas emissions 3 percent...use available fuel such as natural gas (methane) or propane. The ability to reform a multitude of fuels can accelerate the introduction of more

  19. Development of an Autonomous Navigation Technology Test Vehicle

    DTIC Science & Technology

    2004-08-01

    as an independent thread on processors using the Linux operating system. The computer hardware selected for the nodes that host the MRS threads...communications system design. Linux was chosen as the operating system for all of the single board computers used on the Mule. Linux was specifically...used for system analysis and development. The simple realization of multi-thread processing and inter-process communications in Linux made it a

  20. LIBS data analysis using a predictor-corrector based digital signal processor algorithm

    NASA Astrophysics Data System (ADS)

    Sanders, Alex; Griffin, Steven T.; Robinson, Aaron

    2012-06-01

    There are many accepted sensor technologies for generating spectra for material classification. Once the spectra are generated, communication bandwidth limitations favor local material classification, with its attendant reduction in data transfer rates and power consumption. Transferring sensor technologies such as Cavity Ring-Down Spectroscopy (CRDS) and Laser Induced Breakdown Spectroscopy (LIBS) to the field therefore requires effective onboard material classifiers. Recent efforts have emphasized Partial Least Squares Discriminant Analysis (PLS-DA) and Principal Component Analysis (PCA). Implementing these on general-purpose computers is difficult in small portable sensor configurations. This paper addresses the creation of a low mass, low power, robust hardware spectral classifier for a limited set of predetermined materials in an atmospheric matrix. Crucial to this is the incorporation of PCA or PLS-DA classifiers into a predictor-corrector style implementation, a system configuration that guarantees rapid convergence. Software running on multi-core Digital Signal Processors (DSPs) simulates a streamlined plasma-physics model estimator, reducing Analog-to-Digital Converter (ADC) power requirements. This paper presents the results of a predictor-corrector model implemented on a low power multi-core DSP to perform substance classification, emphasizing a hardware and software design that decreases the sample rate while simultaneously performing the classification.
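
    A minimal sketch of the PCA side of such a classifier with a predictor-corrector flavor: each material's training spectra define a principal subspace, each new shot is scored by its reconstruction residual, and a running class estimate is corrected shot by shot. The spectra, classes, and corrector gain are synthetic assumptions, not the paper's model.

        import numpy as np

        rng = np.random.default_rng(6)
        n_channels, n_train = 256, 50

        def make_class_basis():
            """PCA basis (top 3 PCs) of one substance's training spectra."""
            spectra = (rng.normal(size=(n_train, n_channels))
                       + rng.normal(size=n_channels))   # class-specific mean
            spectra -= spectra.mean(axis=0)
            _, _, Vt = np.linalg.svd(spectra, full_matrices=False)
            return Vt[:3]                  # rows span the class subspace

        bases = [make_class_basis() for _ in range(4)]  # 4 known materials

        def residual(x, V):
            """Reconstruction error of spectrum x in the subspace of V."""
            return np.linalg.norm(x - V.T @ (V @ x))

        scores = np.zeros(4)               # running class evidence
        alpha = 0.5                        # corrector gain (assumed)
        for shot in range(10):             # successive LIBS shots
            x = rng.normal(size=n_channels)  # would come from the spectrometer
            r = np.array([residual(x, V) for V in bases])
            evidence = -r                  # smaller residual = more likely
            scores = (1 - alpha) * scores + alpha * evidence  # predict+correct
        print("classified as material", int(np.argmax(scores)))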
