Science.gov

Sample records for parallel processing strategies

  1. Parallel strategies for SAR processing

    NASA Astrophysics Data System (ADS)

    Segoviano, Jesus A.

    2004-12-01

    This article proposes a series of strategies for speeding up the computer processing of Synthetic Aperture Radar (SAR) signals, following the three usual lines of action for accelerating any computer program. On the one hand, it studies the optimization of both the data structures and the application architecture; on the other hand, it considers hardware improvements. For the former, it examines the data structures usually employed in SAR processing, proposing parallel alternatives, and the way the algorithms used in the process are parallelized. The parallel application architecture classifies processes as fine or coarse grain; these are either assigned to individual processors or divided among several processors, each within its corresponding architecture. For the latter, it examines the hardware used for parallel SAR processing. The improvement here concerns the kinds of platforms on which SAR processing is implemented: shared-memory multiprocessors and distributed-memory multicomputers. A comparison between them yields guidelines for obtaining maximum throughput with minimum latency, and maximum effectiveness with minimum cost, all with limited complexity. The article concludes that running the algorithms in a GNU/Linux environment on a Beowulf cluster platform offers, under certain conditions, the best compromise between performance and cost, and is the most promising direction for compute-intensive SAR applications in the coming years.

  2. Parallel Processing Strategies of the Primate Visual System

    PubMed Central

    Nassi, Jonathan J.; Callaway, Edward M.

    2009-01-01

    Incoming sensory information is sent to the brain along modality-specific channels corresponding to the five senses. Each of these channels further parses the incoming signals into parallel streams to provide a compact, efficient input to the brain. Ultimately, these parallel input signals must be elaborated upon and integrated within the cortex to provide a unified and coherent percept. Recent studies in the primate visual cortex have greatly contributed to our understanding of how this goal is accomplished. Multiple strategies including retinal tiling, hierarchical and parallel processing and modularity, defined spatially and by cell type-specific connectivity, are all used by the visual system to recover the rich detail of our visual surroundings. PMID:19352403

  3. Lossy hyperspectral image compression on a graphics processing unit: parallelization strategy and performance evaluation

    NASA Astrophysics Data System (ADS)

    Santos, Lucana; Magli, Enrico; Vitulli, Raffaele; Núñez, Antonio; López, José F.; Sarmiento, Roberto

    2013-01-01

    There is a pressing need for new hardware architectures to implement hyperspectral image compression algorithms on board satellites. Graphics processing units (GPUs) represent a very attractive opportunity, offering the possibility of dramatically increasing computation speed in applications that are data and task parallel. An algorithm for the lossy compression of hyperspectral images is implemented on a GPU using the Nvidia compute unified device architecture (CUDA) parallel computing architecture. The parallelization strategy is explained, with emphasis on the entropy coding and bit packing phases, for which a more sophisticated strategy is necessary due to the existing data dependencies. Experimental results are obtained by comparing the performance of the GPU implementation with a single-threaded CPU implementation, showing speedups of up to 15.41. A profiling of the algorithm is provided, demonstrating the high performance of the designed parallel entropy coding phase. The accuracy of the GPU implementation is presented, as well as the effect of the configuration parameters on performance. The convenience of using GPUs for on-board processing is demonstrated, and solutions are proposed to the potential difficulties of accelerating hyperspectral compression algorithms, should space-qualified GPUs become a reality in the near future.

  4. Parallel Information Processing.

    ERIC Educational Resources Information Center

    Rasmussen, Edie M.

    1992-01-01

    Examines parallel computer architecture and the use of parallel processors for text. Topics discussed include parallel algorithms; performance evaluation; parallel information processing; parallel access methods for text; parallel and distributed information retrieval systems; parallel hardware for text; and network models for information…

  5. Special parallel processing workshop

    SciTech Connect

    1994-12-01

    This report contains viewgraphs from the Special Parallel Processing Workshop. These viewgraphs deal with topics such as parallel processing performance, message passing, queue structure, and other basic concepts dealing with parallel processing.

  6. Parallel-processing with surface plasmons, a new strategy for converting the broad solar spectrum

    NASA Technical Reports Server (NTRS)

    Anderson, L. M.

    1982-01-01

    A new strategy for efficient solar-energy conversion is based on parallel processing with surface plasmons: guided electromagnetic waves supported on thin films of common metals like aluminum or silver. The approach is unique in identifying a broadband carrier with suitable range for energy transport and an inelastic tunneling process which can be used to extract more energy from the more energetic carriers without requiring different materials for each frequency band. The aim is to overcome the fundamental 56-percent loss associated with mismatch between the broad solar spectrum and the monoenergetic conduction electrons used to transport energy in conventional silicon solar cells. This paper presents a qualitative discussion of the unknowns and barrier problems, including ideas for coupling surface plasmons into the tunnels, a step which has been the weak link in the efficiency chain.

  7. A new strategy for efficient solar energy conversion: Parallel-processing with surface plasmons

    NASA Technical Reports Server (NTRS)

    Anderson, L. M.

    1982-01-01

    This paper introduces an advanced concept for direct conversion of sunlight to electricity, which aims at high efficiency by tailoring the conversion process to separate energy bands within the broad solar spectrum. The objective is to obtain a high level of spectrum-splitting without sequential losses or unique materials for each frequency band. In this concept, sunlight excites a spectrum of surface plasma waves which are processed in parallel on the same metal film. The surface plasmons transport energy to an array of metal-barrier-semiconductor diodes, where energy is extracted by inelastic tunneling. Diodes are tuned to different frequency bands by selecting the operating voltage and geometry, but all diodes share the same materials.

  8. Graphics applications utilizing parallel processing

    NASA Technical Reports Server (NTRS)

    Rice, John R.

    1990-01-01

    The results are presented of research conducted to develop a parallel graphic application algorithm to depict the numerical solution of the 1-D wave equation, the vibrating string. The research was conducted on a Flexible Flex/32 multiprocessor and a Sequent Balance 21000 multiprocessor. The wave equation is implemented using the finite difference method. The synchronization issues that arose from the parallel implementation and the strategies used to alleviate the effects of the synchronization overhead are discussed.
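
    As a rough illustration of the numerical method named in this abstract, the sketch below advances the 1-D wave equation for a vibrating string with an explicit finite-difference scheme. It is not the code from the report: the function name `simulate_string`, the grid size, the Courant number, and the sine-shaped initial displacement are all assumptions chosen for the example. The interior update at each time step is independent per grid point, which is the part a parallel version could split among processors; every processor must finish a step before the next one begins, which is where synchronization overhead comes from.

```python
import math


def simulate_string(n_points=51, n_steps=100, courant=0.5):
    """Explicit finite-difference solution of u_tt = c^2 u_xx with fixed ends.

    courant is c*dt/dx; the scheme is stable for courant <= 1.
    All parameter values are illustrative assumptions.
    """
    r2 = courant ** 2

    # Initial displacement: half a sine wave, string initially at rest.
    prev = [math.sin(math.pi * i / (n_points - 1)) for i in range(n_points)]
    prev[0] = prev[-1] = 0.0

    # First time step from a Taylor expansion with zero initial velocity.
    curr = prev[:]
    for i in range(1, n_points - 1):
        curr[i] = prev[i] + 0.5 * r2 * (prev[i + 1] - 2 * prev[i] + prev[i - 1])

    # Remaining steps: each interior point depends only on the two previous
    # time levels, so the inner loop could be divided among processors.
    for _ in range(n_steps - 1):
        nxt = [0.0] * n_points  # end points stay fixed at zero
        for i in range(1, n_points - 1):
            nxt[i] = (2 * curr[i] - prev[i]
                      + r2 * (curr[i + 1] - 2 * curr[i] + curr[i - 1]))
        prev, curr = curr, nxt  # implicit barrier between time steps
    return curr
```

    Calling `simulate_string()` returns the displacement at every grid point after `n_steps` time steps.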

  9. Tightly integrated single- and multi-crystal data collection strategy calculation and parallelized data processing in JBluIce beamline control system

    DOE PAGESBeta

    Pothineni, Sudhir Babu; Venugopalan, Nagarajan; Ogata, Craig M.; Hilgart, Mark C.; Stepanov, Sergey; Sanishvili, Ruslan; Becker, Michael; Winter, Graeme; Sauter, Nicholas K.; Smith, Janet L.; et al

    2014-11-18

    The calculation of single- and multi-crystal data collection strategies and a data processing pipeline have been tightly integrated into the macromolecular crystallographic data acquisition and beamline control software JBluIce. Both tasks employ wrapper scripts around existing crystallographic software. JBluIce executes scripts through a distributed resource management system to make efficient use of all available computing resources through parallel processing. The JBluIce single-crystal data collection strategy feature uses a choice of strategy programs to help users rank sample crystals and collect data. The strategy results can be conveniently exported to a data collection run. The JBluIce multi-crystal strategy feature calculates a collection strategy to optimize coverage of reciprocal space in cases where incomplete data are available from previous samples. The JBluIce data processing runs simultaneously with data collection using a choice of data reduction wrappers for integration and scaling of newly collected data, with an option for merging with pre-existing data. Data are processed separately if collected from multiple sites on a crystal or from multiple crystals, then scaled and merged. Results from all strategy and processing calculations are displayed in relevant tabs of JBluIce.

  11. Tightly integrated single- and multi-crystal data collection strategy calculation and parallelized data processing in JBluIce beamline control system.

    PubMed

    Pothineni, Sudhir Babu; Venugopalan, Nagarajan; Ogata, Craig M; Hilgart, Mark C; Stepanov, Sergey; Sanishvili, Ruslan; Becker, Michael; Winter, Graeme; Sauter, Nicholas K; Smith, Janet L; Fischetti, Robert F

    2014-12-01

    The calculation of single- and multi-crystal data collection strategies and a data processing pipeline have been tightly integrated into the macromolecular crystallographic data acquisition and beamline control software JBluIce. Both tasks employ wrapper scripts around existing crystallographic software. JBluIce executes scripts through a distributed resource management system to make efficient use of all available computing resources through parallel processing. The JBluIce single-crystal data collection strategy feature uses a choice of strategy programs to help users rank sample crystals and collect data. The strategy results can be conveniently exported to a data collection run. The JBluIce multi-crystal strategy feature calculates a collection strategy to optimize coverage of reciprocal space in cases where incomplete data are available from previous samples. The JBluIce data processing runs simultaneously with data collection using a choice of data reduction wrappers for integration and scaling of newly collected data, with an option for merging with pre-existing data. Data are processed separately if collected from multiple sites on a crystal or from multiple crystals, then scaled and merged. Results from all strategy and processing calculations are displayed in relevant tabs of JBluIce. PMID:25484844

  12. Tightly integrated single- and multi-crystal data collection strategy calculation and parallelized data processing in JBluIce beamline control system

    SciTech Connect

    Pothineni, Sudhir Babu; Venugopalan, Nagarajan; Ogata, Craig M.; Hilgart, Mark C.; Stepanov, Sergey; Sanishvili, Ruslan; Becker, Michael; Winter, Graeme; Sauter, Nicholas K.; Smith, Janet L.; Fischetti, Robert F.

    2014-11-18

    The calculation of single- and multi-crystal data collection strategies and a data processing pipeline have been tightly integrated into the macromolecular crystallographic data acquisition and beamline control software JBluIce. Both tasks employ wrapper scripts around existing crystallographic software. JBluIce executes scripts through a distributed resource management system to make efficient use of all available computing resources through parallel processing. The JBluIce single-crystal data collection strategy feature uses a choice of strategy programs to help users rank sample crystals and collect data. The strategy results can be conveniently exported to a data collection run. The JBluIce multi-crystal strategy feature calculates a collection strategy to optimize coverage of reciprocal space in cases where incomplete data are available from previous samples. The JBluIce data processing runs simultaneously with data collection using a choice of data reduction wrappers for integration and scaling of newly collected data, with an option for merging with pre-existing data. Data are processed separately if collected from multiple sites on a crystal or from multiple crystals, then scaled and merged. Results from all strategy and processing calculations are displayed in relevant tabs of JBluIce.

  13. Tightly integrated single- and multi-crystal data collection strategy calculation and parallelized data processing in JBluIce beamline control system

    PubMed Central

    Pothineni, Sudhir Babu; Venugopalan, Nagarajan; Ogata, Craig M.; Hilgart, Mark C.; Stepanov, Sergey; Sanishvili, Ruslan; Becker, Michael; Winter, Graeme; Sauter, Nicholas K.; Smith, Janet L.; Fischetti, Robert F.

    2014-01-01

    The calculation of single- and multi-crystal data collection strategies and a data processing pipeline have been tightly integrated into the macromolecular crystallographic data acquisition and beamline control software JBluIce. Both tasks employ wrapper scripts around existing crystallographic software. JBluIce executes scripts through a distributed resource management system to make efficient use of all available computing resources through parallel processing. The JBluIce single-crystal data collection strategy feature uses a choice of strategy programs to help users rank sample crystals and collect data. The strategy results can be conveniently exported to a data collection run. The JBluIce multi-crystal strategy feature calculates a collection strategy to optimize coverage of reciprocal space in cases where incomplete data are available from previous samples. The JBluIce data processing runs simultaneously with data collection using a choice of data reduction wrappers for integration and scaling of newly collected data, with an option for merging with pre-existing data. Data are processed separately if collected from multiple sites on a crystal or from multiple crystals, then scaled and merged. Results from all strategy and processing calculations are displayed in relevant tabs of JBluIce. PMID:25484844

  14. Parallel processing and expert systems

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Lau, Sonie

    1991-01-01

    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 90's cannot enjoy an increased level of autonomy without the efficient use of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real time demands are met for large expert systems. Speed-up via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial labs in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems was surveyed. The survey is divided into three major sections: (1) multiprocessors for parallel expert systems; (2) parallel languages for symbolic computations; and (3) measurements of parallelism of expert systems. Results to date indicate that the parallelism achieved for these systems is small. In order to obtain greater speed-ups, data parallelism and application parallelism must be exploited.

  15. Experimental Parallel-Processing Computer

    NASA Technical Reports Server (NTRS)

    Mcgregor, J. W.; Salama, M. A.

    1986-01-01

    Master processor supervises slave processors, each with its own memory. Computer with parallel processing serves as inexpensive tool for experimentation with parallel mathematical algorithms. Speed enhancement obtained depends on both nature of problem and structure of algorithm used. In parallel-processing architecture, "bank select" and control signals determine which one, if any, of N slave processor memories accessible to master processor at any given moment. When so selected, slave memory operates as part of master computer memory. When not selected, slave memory operates independently of main memory. Slave processors communicate with each other via input/output bus.

  16. Parallel processing and expert systems

    NASA Technical Reports Server (NTRS)

    Lau, Sonie; Yan, Jerry C.

    1991-01-01

    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited.

  17. Parallel processing in finite element structural analysis

    NASA Technical Reports Server (NTRS)

    Noor, Ahmed K.

    1987-01-01

    A brief review is made of the fundamental concepts and basic issues of parallel processing. Discussion focuses on parallel numerical algorithms, performance evaluation of machines and algorithms, and parallelism in finite element computations. A computational strategy is proposed for maximizing the degree of parallelism at different levels of the finite element analysis process including: 1) formulation level (through the use of mixed finite element models); 2) analysis level (through additive decomposition of the different arrays in the governing equations into the contributions to a symmetrized response plus correction terms); 3) numerical algorithm level (through the use of operator splitting techniques and application of iterative processes); and 4) implementation level (through the effective combination of vectorization, multitasking and microtasking, whenever available).

  18. Parallel processing in immune networks

    NASA Astrophysics Data System (ADS)

    Agliari, Elena; Barra, Adriano; Bartolucci, Silvia; Galluzzi, Andrea; Guerra, Francesco; Moauro, Francesco

    2013-04-01

    In this work, we adopt a statistical-mechanics approach to investigate basic, systemic features exhibited by adaptive immune systems. The lymphocyte network made by B cells and T cells is modeled by a bipartite spin glass, where, following biological prescriptions, links connecting B cells and T cells are sparse. Interestingly, the dilution performed on links is shown to make the system able to orchestrate parallel strategies to fight several pathogens at the same time; this multitasking capability constitutes a remarkable, key property of immune systems as multiple antigens are always present within the host. We also define the stochastic process ruling the temporal evolution of lymphocyte activity and show its relaxation toward an equilibrium measure allowing statistical-mechanics investigations. Analytical results are compared with Monte Carlo simulations and signal-to-noise outcomes showing overall excellent agreement. Finally, within our model, a rationale for the experimentally well-evidenced correlation between lymphocytosis and autoimmunity is achieved; this sheds further light on the systemic features exhibited by immune networks.

  19. Parallel processing spacecraft communication system

    NASA Technical Reports Server (NTRS)

    Bolotin, Gary S. (Inventor); Donaldson, James A. (Inventor); Luong, Huy H. (Inventor); Wood, Steven H. (Inventor)

    1998-01-01

    An uplink controlling assembly speeds data processing using a special parallel codeblock technique. A correct start sequence initiates processing of a frame. Two possible start sequences can be used, and the one which is used determines whether data polarity is inverted or non-inverted. Processing continues until uncorrectable errors are found. The frame ends by intentionally sending a block with an uncorrectable error. Each of the codeblocks in the frame has a channel ID. Each channel ID can be separately processed in parallel. This obviates the problem of waiting for error correction processing. If that channel number is zero, however, it indicates that the frame of data represents a critical command only. That data is handled in a special way, independent of the software. Otherwise, the processed data is further handled using special double buffering techniques to avoid problems from overrun. When overrun does occur, the system takes action to lose only the oldest data.
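
    The overrun policy in the last sentence of this abstract (lose only the oldest data) can be pictured with a small software analogue. The sketch below is not the patented hardware design and does not model its double buffering; the class name `DropOldestBuffer` and its methods are hypothetical, and it assumes a fixed-capacity queue is an acceptable stand-in.

```python
from collections import deque


class DropOldestBuffer:
    """Fixed-capacity buffer that discards the oldest item on overrun."""

    def __init__(self, capacity):
        # A deque with maxlen drops items from the left when it is full.
        self._items = deque(maxlen=capacity)
        self.dropped = 0  # number of items lost to overrun

    def push(self, item):
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # the oldest item is about to be discarded
        self._items.append(item)

    def drain(self):
        """Return all buffered items, oldest first, and empty the buffer."""
        items = list(self._items)
        self._items.clear()
        return items
```

    Pushing five items into a buffer of capacity three keeps the three most recent ones and records two drops.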

  20. Parallel Computing Strategies for Irregular Algorithms

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
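
    The abstract does not say which load-balancing techniques were used, so the sketch below shows only a generic heuristic for the underlying problem: assigning work items of unequal cost to processors so that the heaviest load stays small. It uses the common longest-processing-time-first greedy rule; the function name `greedy_partition` and the sample weights are assumptions made for illustration.

```python
import heapq


def greedy_partition(weights, num_workers):
    """Assign item indices to workers, heaviest item first,
    always giving the next item to the least-loaded worker."""
    heap = [(0, worker) for worker in range(num_workers)]  # (load, worker id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_workers)]

    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for idx in order:
        load, worker = heapq.heappop(heap)  # least-loaded worker so far
        assignment[worker].append(idx)
        heapq.heappush(heap, (load + weights[idx], worker))
    return assignment
```

    With weights `[7, 5, 4, 3, 1]` and two workers, both workers end up with a total load of 10. Irregular applications complicate this picture because the weights change as the computation proceeds, so the partition has to be recomputed.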

  1. Massively parallel femtosecond laser processing.

    PubMed

    Hasegawa, Satoshi; Ito, Haruyasu; Toyoda, Haruyoshi; Hayasaki, Yoshio

    2016-08-01

    Massively parallel femtosecond laser processing with more than 1000 beams was demonstrated. Parallel beams were generated by a computer-generated hologram (CGH) displayed on a spatial light modulator (SLM). The key to this technique is to optimize the CGH in the laser processing system using a scheme called in-system optimization. It was analytically demonstrated that the number of beams is determined by the horizontal number of pixels in the SLM, N_SLM, that is imaged at the pupil plane of an objective lens and a distance parameter, p_d, obtained by dividing the distance between adjacent beams by the diffraction-limited beam diameter. A performance limitation of parallel laser processing in our system was estimated at N_SLM of 250 and p_d of 7.0. Based on these parameters, the maximum number of beams in a hexagonal close-packed structure was calculated to be 1189 by using an analytical equation. PMID:27505815

  2. Bitplane Image Coding With Parallel Coefficient Processing.

    PubMed

    Auli-Llinas, Francesc; Enfedaque, Pablo; Moure, Juan C; Sanchez, Victor

    2016-01-01

    Image coding systems have been traditionally tailored for multiple instruction, multiple data (MIMD) computing. In general, they partition the (transformed) image into codeblocks that can be coded in the cores of MIMD-based processors. Each core executes a sequential flow of instructions to process the coefficients in the codeblock, independently and asynchronously from the other cores. Bitplane coding is a common strategy to code such data. Most of its mechanisms require sequential processing of the coefficients. Recent years have seen the rise of processing accelerators with enhanced computational performance and power efficiency whose architecture is mainly based on the single instruction, multiple data (SIMD) principle. SIMD computing refers to the execution of the same instruction on multiple data in a lockstep, synchronous way. Unfortunately, current bitplane coding strategies cannot fully profit from such processors due to the inherently sequential coding task. This paper presents bitplane image coding with parallel coefficient (BPC-PaCo) processing, a coding method that can process many coefficients within a codeblock in parallel and synchronously. To this end, the scanning order, the context formation, the probability model, and the arithmetic coder of the coding engine have been re-formulated. The experimental results suggest that the penalization in coding performance of BPC-PaCo with respect to the traditional strategies is almost negligible.
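
    To make the term "bitplane" concrete, the sketch below splits the magnitudes of a few integer coefficients into bitplanes, from most to least significant. It shows only this decomposition step and is not BPC-PaCo or any standard coder: context formation, probability modelling, and arithmetic coding are left out, and the function name `bitplanes` and the sample values are assumptions. Within one plane the same shift-and-mask operation is applied to every coefficient, which is the kind of uniform work that suits SIMD execution.

```python
def bitplanes(coefficients, num_planes):
    """Return the magnitude bitplanes of integer coefficients,
    ordered from the most significant plane to the least."""
    magnitudes = [abs(c) for c in coefficients]
    planes = []
    for p in range(num_planes - 1, -1, -1):
        # Same operation on every coefficient: extract bit p.
        planes.append([(m >> p) & 1 for m in magnitudes])
    return planes
```

    For the coefficients `[5, -3, 0, 6]` and three planes, the result is `[[1, 0, 0, 1], [0, 1, 0, 1], [1, 1, 0, 0]]`.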

  3. Parallel Processing at the High School Level.

    ERIC Educational Resources Information Center

    Sheary, Kathryn Anne

    This study investigated the ability of high school students to cognitively understand and implement parallel processing. Data indicates that most parallel processing is being taught at the university level. Instructional modules on C, Linux, and the parallel processing language, P4, were designed to show that high school students are highly…

  4. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1991-01-01

    The main contribution of the effort in the last two years is the introduction of the MOPPS system, which was developed after an extensive literature search and is described next. MOPPS employs a new solution to the problem of managing programs which solve scientific and engineering applications in a distributed processing environment. Autonomous computers cooperate efficiently in solving large scientific problems with this solution. MOPPS has the advantage of not assuming the presence of any particular network topology or configuration, computer architecture, or operating system. It imposes little overhead on network and processor resources while efficiently managing programs concurrently. The core of MOPPS is an intelligent program manager that builds a knowledge base of the execution performance of the parallel programs it is managing under various conditions. The manager applies this knowledge to improve the performance of future runs. The program manager learns from experience.

  5. Dual compile strategy for parallel heterogeneous execution.

    SciTech Connect

    Smith, Tyler Barratt; Perry, James Thomas

    2012-06-01

    The purpose of the Dual Compile Strategy is to increase our trust in the Compute Engine during its execution of instructions. This is accomplished by introducing a heterogeneous Monitor Engine that checks the execution of the Compute Engine. This leads to the production of a second and custom set of instructions designed for monitoring the execution of the Compute Engine at runtime. This use of multiple engines differs from redundancy in that one engine is working on the application while the other engine is monitoring and checking in parallel instead of both applications (and engines) performing the same work at the same time.

  6. Parallel Activation in Bilingual Phonological Processing

    ERIC Educational Resources Information Center

    Lee, Su-Yeon

    2011-01-01

    In bilingual language processing, the parallel activation hypothesis suggests that bilinguals activate their two languages simultaneously during language processing. Support for the parallel activation mainly comes from studies of lexical (word-form) processing, with relatively less attention to phonological (sound) processing. According to…

  7. Software For Diagnosis Of Parallel Processing

    NASA Technical Reports Server (NTRS)

    Hontalas, Philip; Yan, Jerry; Fineman, Charles

    1995-01-01

    Ames Instrumentation System (AIMS) computer program package of software tools measuring and analyzing performances of parallel-processing application programs. Helps programmer to debug and refine, and to monitor and visualize execution of, parallel-processing application software for Intel iPSC/860 (or equivalent) multicomputer. Performance data collected displayed graphically on computer workstations supporting X-Windows.

  8. Parallel processing of numerical transport algorithms

    SciTech Connect

    Wienke, B.R.; Hiromoto, R.E.

    1984-01-01

    The multigroup, discrete ordinates representation for the linear transport equation enjoys widespread computational use and popularity. Serial solution schemes and numerical algorithms developed over the years provide a timely framework for parallel extension. On the Denelcor HEP, we investigate the parallel structure and extension of a number of standard S_n approaches. Concurrent inner sweeps, coupled acceleration techniques, synchronized inner-outer loops, and chaotic iteration are described, and results of computations are contrasted. The multigroup representation and serial iteration methods are also detailed. The basic iterative S_n method lends itself to parallel tasking, portably affording an effective medium for performing transport calculations on future architectures. This analysis represents a first attempt to extend serial S_n algorithms to parallel environments and provides good baseline estimates on ease of parallel implementation, relative algorithm efficiency, comparative speedup, and some future directions. We find basic inner-outer and chaotic iteration strategies both easily support comparably high degrees of parallelism. Both accommodate parallel rebalance and diffusion acceleration and appear as robust and viable parallel techniques for S_n production work.

  9. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1995-01-01

    The scope of this project dealt with the investigation of the requirements to support distributed computing of scientific computations over a cluster of cooperative workstations. Various experiments on computations for the solution of simultaneous linear equations were performed in the early phase of the project to gain experience in the general nature and requirements of scientific applications. A specification of a distributed integrated computing environment, DICE, based on a distributed shared memory communication paradigm has been developed and evaluated. The distributed shared memory model facilitates porting existing parallel algorithms that have been designed for shared memory multiprocessor systems to the new environment. The potential of this new environment is to provide supercomputing capability through the utilization of the aggregate power of workstations cooperating in a cluster interconnected via a local area network. Workstations, generally, do not have the computing power to tackle complex scientific applications, making them primarily useful for visualization, data reduction, and filtering as far as complex scientific applications are concerned. There is a tremendous amount of computing power that is left unused in a network of workstations. Very often a workstation is simply sitting idle on a desk. A set of tools can be developed to take advantage of this potential computing power to create a platform suitable for large scientific computations. The integration of several workstations into a logical cluster of distributed, cooperative, computing stations presents an alternative to shared memory multiprocessor systems. In this project we designed and evaluated such a system.

  10. Advanced parallel processing with supercomputer architectures

    SciTech Connect

    Hwang, K.

    1987-10-01

    This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping parallel algorithms, operating system functions, application library, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potentials of optical and neural technologies for developing future supercomputers.

  11. Parallel processing of a rotating shaft simulation

    NASA Technical Reports Server (NTRS)

    Arpasi, Dale J.

    1989-01-01

    A FORTRAN program describing the vibration modes of a rotor-bearing system is analyzed for parallelism in this simulation using a Pascal-like structured language. Potential vector operations are also identified. A critical path through the simulation is identified and used in conjunction with somewhat fictitious processor characteristics to determine the time to calculate the problem on a parallel processing system having those characteristics. A parallel processing overhead time is included as a parameter for proper evaluation of the gain over serial calculation. The serial calculation time is determined for the same fictitious system. An improvement of up to 640 percent is possible depending on the value of the overhead time. Based on the analysis, certain conclusions are drawn pertaining to the development needs of parallel processing technology, and to the specification of parallel processing systems to meet computational needs.
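
    The critical-path reasoning in this abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration: the task names, their times, and the choice to charge the overhead once per task are hypothetical and are not taken from the report. The sketch compares the serial time (sum of all task times) with the length of the longest dependency chain, with the overhead as a parameter.

```python
from functools import lru_cache

# Hypothetical task graph: task -> (serial time, prerequisite tasks).
# None of these names or times come from the original report.
TASKS = {
    "read_state": (2.0, []),
    "bearing_forces": (4.0, ["read_state"]),
    "rotor_modes": (6.0, ["read_state"]),
    "integrate": (3.0, ["bearing_forces", "rotor_modes"]),
}


def serial_time(tasks):
    """Time on one processor: every task runs one after another."""
    return sum(cost for cost, _ in tasks.values())


def critical_path_time(tasks, overhead=0.0):
    """Length of the longest dependency chain, charging `overhead` per task."""

    @lru_cache(maxsize=None)
    def finish(name):
        cost, deps = tasks[name]
        start = max((finish(d) for d in deps), default=0.0)
        return start + cost + overhead

    return max(finish(name) for name in tasks)


def speedup(tasks, overhead=0.0):
    """Serial time divided by the overhead-adjusted critical-path time."""
    return serial_time(tasks) / critical_path_time(tasks, overhead)
```

    With these made-up numbers the serial time is 15 and the critical path is 11, so the ideal gain is modest, and any nonzero overhead reduces it further. This is the dependence on overhead time that the abstract describes.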

  12. Applications of Parallel Processing in Configuration Analyses

    NASA Technical Reports Server (NTRS)

    Sundaram, Ppchuraman; Hager, James O.; Biedron, Robert T.

    1999-01-01

    The paper presents the recent progress made towards developing an efficient and user-friendly parallel environment for routine analysis of large CFD problems. The coarse-grain parallel version of the CFL3D Euler/Navier-Stokes analysis code, CFL3Dhp, has been ported onto most available parallel platforms. The CFL3Dhp solution accuracy on these parallel platforms has been verified with the CFL3D sequential analyses. User-friendly pre- and post-processing tools that enable a seamless transfer from sequential to parallel processing have been written. A static load balancing tool for CFL3Dhp analysis has also been implemented for achieving good parallel efficiency. For large problems, load balancing efficiency as high as 95% can be achieved even when a large number of processors is used. Linear scalability of the CFL3Dhp code with increasing number of processors has also been shown using a large installed transonic nozzle boattail analysis. To highlight the fast turn-around time of parallel processing, the TCA full configuration in sideslip Navier-Stokes drag polar at supersonic cruise has been obtained in a day. CFL3Dhp is currently being used as a production analysis tool.
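
    The abstract does not say how the CFL3Dhp load balancing tool works. As a hedged illustration of static load balancing in general, the sketch below assigns grid blocks to processors with a greedy longest-processing-time rule and reports efficiency as mean load divided by maximum load. The rule, the efficiency definition, and the block names and costs are all assumptions of this sketch.

```python
import heapq


def balance(block_costs, n_procs):
    """Greedy assignment: give the next-largest block to the least-loaded processor."""
    heap = [(0.0, p) for p in range(n_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_procs)}
    for block, cost in sorted(block_costs.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)
        assignment[p].append(block)
        heapq.heappush(heap, (load + cost, p))
    loads = [sum(block_costs[b] for b in blocks) for blocks in assignment.values()]
    return assignment, loads


def efficiency(loads):
    """Mean load over maximum load; 1.0 means perfectly balanced."""
    return (sum(loads) / len(loads)) / max(loads)


# Hypothetical block costs (e.g. grid points per block), not from the paper.
blocks = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 4}
assignment, loads = balance(blocks, 2)
```

    Because the assignment is computed once before the run, it costs nothing at run time, but it cannot react if block costs change during the computation.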

  13. Knowledge representation into Ada parallel processing

    NASA Technical Reports Server (NTRS)

    Masotto, Tom; Babikyan, Carol; Harper, Richard

    1990-01-01

    The Knowledge Representation into Ada Parallel Processing project is a joint NASA and Air Force funded project to demonstrate the execution of intelligent systems in Ada on the Charles Stark Draper Laboratory fault-tolerant parallel processor (FTPP). Two applications were demonstrated - a portion of the adaptive tactical navigator and a real time controller. Both systems are implemented as Activation Framework Objects on the Activation Framework intelligent scheduling mechanism developed by Worcester Polytechnic Institute. The implementations, results of performance analyses showing speedup due to parallelism and initial efficiency improvements are detailed and further areas for performance improvements are suggested.

  14. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-08-12

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  15. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J; Blocksome, Michael E; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  16. Extensions of ADA for SIMD parallel processing

    SciTech Connect

    Cline, C.; Siegel, H.J.

    1983-01-01

    In order to program SIMD (single instruction stream-multiple data stream) parallel machines used for tasks such as speech and image processing, a language with explicit parallel constructs is often desirable. The language ADA, developed by the Department of Defense, is used as a basis for such a language. Extensions of ADA which allow the user to specify such things as interprocessor communications and activation of processors are proposed. 25 references.

  17. Parallel algorithms for high-speed SAR processing

    NASA Astrophysics Data System (ADS)

    Mallorqui, Jordi J.; Bara, Marc; Broquetas, Antoni; Wis, Mariano; Martinez, Antonio; Nogueira, Leonardo; Moreno, Victoriano

    1998-11-01

    The mass production of SAR products and its usage on monitoring emergency situations (oil spill detection, floods, etc.) requires high-speed SAR processors. Two different parallel strategies for near real time SAR processing based on a multiblock version of the Chirp Scaling Algorithm (CSA) have been studied. The first one is useful for small companies that would like to reduce computation times with no extra investment. It uses a cluster of heterogeneous UNIX workstations as a parallel computer. The second one is oriented to institutions, which have to process large amounts of data in short times and can afford the cost of large parallel computers. The parallel programming has reduced in both cases the computational times when compared with the sequential versions.

  18. Parallel algorithm strategies for circuit simulation.

    SciTech Connect

    Thornquist, Heidi K.; Schiek, Richard Louis; Keiter, Eric Richard

    2010-01-01

    Circuit simulation tools (e.g., SPICE) have become invaluable in the development and design of electronic circuits. However, they have been pushed to their performance limits in addressing circuit design challenges that come from the technology drivers of smaller feature scales and higher integration. Improving the performance of circuit simulation tools through exploiting new opportunities in widely-available multi-processor architectures is a logical next step. Unfortunately, not all traditional simulation applications are inherently parallel, and quickly adapting mature application codes (even codes designed to parallel applications) to new parallel paradigms can be prohibitively difficult. In general, performance is influenced by many choices: hardware platform, runtime environment, languages and compilers used, algorithm choice and implementation, and more. In this complicated environment, the use of mini-applications (small, self-contained proxies for real applications) is an excellent approach for rapidly exploring the parameter space of all these choices. In this report we present a multi-core performance study of Xyce, a transistor-level circuit simulation tool, and describe the future development of a mini-application for circuit simulation.

  19. FORTRAN Extensions for Modular Parallel Processing

    1996-01-12

    FORTRAN M is a small set of extensions to FORTRAN that supports a modular approach to the construction of sequential and parallel programs. FORTRAN M programs use channels to plug together processes which may be written in FORTRAN M or FORTRAN 77. Processes communicate by sending and receiving messages on channels. Channels and processes can be created dynamically, but programs remain deterministic unless specialized nondeterministic constructs are used.
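
    The channel model described here (processes plugged together by channels, communicating only by messages) can be mimicked in other languages. The sketch below is not FORTRAN M: it is a Python analogy that uses threads as stand-ins for processes and `queue.Queue` objects as stand-ins for channels. The `None` end-of-stream marker and all function names are assumptions of this sketch.

```python
import queue
import threading


def producer(out_channel, values):
    """Send each value on the channel, then an end-of-stream marker."""
    for v in values:
        out_channel.put(v)
    out_channel.put(None)  # end-of-stream marker (an assumption of this sketch)


def squarer(in_channel, out_channel):
    """Receive values, square them, and forward them on another channel."""
    while (v := in_channel.get()) is not None:
        out_channel.put(v * v)
    out_channel.put(None)


def run_pipeline(values):
    """Plug producer -> squarer together with two channels and collect the output."""
    a, b = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=producer, args=(a, values)),
        threading.Thread(target=squarer, args=(a, b)),
    ]
    for w in workers:
        w.start()
    results = []
    while (v := b.get()) is not None:
        results.append(v)
    for w in workers:
        w.join()
    return results
```

    Each channel here has exactly one sender and one receiver, so the output order does not depend on thread scheduling. That mirrors, in a loose way, the abstract's point that programs stay deterministic unless nondeterministic constructs are introduced.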

  20. Efficient multitasking: parallel versus serial processing of multiple tasks

    PubMed Central

    Fischer, Rico; Plessow, Franziska

    2015-01-01

    In the context of performance optimizations in multitasking, a central debate has unfolded in multitasking research around whether cognitive processes related to different tasks proceed only sequentially (one at a time), or can operate in parallel (simultaneously). This review features a discussion of theoretical considerations and empirical evidence regarding parallel versus serial task processing in multitasking. In addition, we highlight how methodological differences and theoretical conceptions determine the extent to which parallel processing in multitasking can be detected, to guide their employment in future research. Parallel and serial processing of multiple tasks are not mutually exclusive. Therefore, questions focusing exclusively on either task-processing mode are too simplified. We review empirical evidence and demonstrate that shifting between more parallel and more serial task processing critically depends on the conditions under which multiple tasks are performed. We conclude that efficient multitasking is reflected by the ability of individuals to adjust multitasking performance to environmental demands by flexibly shifting between different processing strategies of multiple task-component scheduling. PMID:26441742

  1. Efficient multitasking: parallel versus serial processing of multiple tasks.

    PubMed

    Fischer, Rico; Plessow, Franziska

    2015-01-01

    In the context of performance optimizations in multitasking, a central debate has unfolded in multitasking research around whether cognitive processes related to different tasks proceed only sequentially (one at a time), or can operate in parallel (simultaneously). This review features a discussion of theoretical considerations and empirical evidence regarding parallel versus serial task processing in multitasking. In addition, we highlight how methodological differences and theoretical conceptions determine the extent to which parallel processing in multitasking can be detected, to guide their employment in future research. Parallel and serial processing of multiple tasks are not mutually exclusive. Therefore, questions focusing exclusively on either task-processing mode are too simplified. We review empirical evidence and demonstrate that shifting between more parallel and more serial task processing critically depends on the conditions under which multiple tasks are performed. We conclude that efficient multitasking is reflected by the ability of individuals to adjust multitasking performance to environmental demands by flexibly shifting between different processing strategies of multiple task-component scheduling. PMID:26441742

  2. Efficient multitasking: parallel versus serial processing of multiple tasks.

    PubMed

    Fischer, Rico; Plessow, Franziska

    2015-01-01

    In the context of performance optimizations in multitasking, a central debate has unfolded in multitasking research around whether cognitive processes related to different tasks proceed only sequentially (one at a time), or can operate in parallel (simultaneously). This review features a discussion of theoretical considerations and empirical evidence regarding parallel versus serial task processing in multitasking. In addition, we highlight how methodological differences and theoretical conceptions determine the extent to which parallel processing in multitasking can be detected, to guide their employment in future research. Parallel and serial processing of multiple tasks are not mutually exclusive. Therefore, questions focusing exclusively on either task-processing mode are too simplified. We review empirical evidence and demonstrate that shifting between more parallel and more serial task processing critically depends on the conditions under which multiple tasks are performed. We conclude that efficient multitasking is reflected by the ability of individuals to adjust multitasking performance to environmental demands by flexibly shifting between different processing strategies of multiple task-component scheduling.

  3. Parallel and Serial Processes in Visual Search

    ERIC Educational Resources Information Center

    Thornton, Thomas L.; Gilden, David L.

    2007-01-01

    A long-standing issue in the study of how people acquire visual information centers around the scheduling and deployment of attentional resources: Is the process serial, or is it parallel? A substantial empirical effort has been dedicated to resolving this issue. However, the results remain largely inconclusive because the methodologies that have…

  4. Hypercluster parallel processing library user's manual

    NASA Technical Reports Server (NTRS)

    Quealy, Angela

    1990-01-01

    This User's Manual describes the Hypercluster Parallel Processing Library, composed of FORTRAN-callable subroutines which enable a FORTRAN programmer to manipulate and transfer information throughout the Hypercluster at NASA Lewis Research Center. Each subroutine and its parameters are described in detail. A simple heat flow application using Laplace's equation is included to demonstrate the use of some of the library's subroutines. The manual can be used initially as an introduction to the parallel features provided by the library. Thereafter it can be used as a reference when programming an application.

  5. Parallel Programming Strategies for Irregular Adaptive Applications

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2001-01-01

    Achieving scalable performance for dynamic irregular applications is eminently challenging. Traditional message-passing approaches have been making steady progress towards this goal; however, they suffer from complex implementation requirements. The use of a global address space greatly simplifies the programming task, but can degrade the performance for such computations. In this work, we examine two typical irregular adaptive applications, Dynamic Remeshing and N-Body, under competing programming methodologies and across various parallel architectures. The Dynamic Remeshing application simulates flow over an airfoil, and refines localized regions of the underlying unstructured mesh. The N-Body experiment models two neighboring Plummer galaxies that are about to undergo a merger. Both problems demonstrate dramatic changes in processor workloads and interprocessor communication with time; thus, dynamic load balancing is a required component.

  6. Parallel processing and medium-scale multiprocessors

    SciTech Connect

    Wouk, A.

    1989-01-01

    For some time, the community interested in large-scale scientific computing has been attempting to come to terms with parallel computation using a number of processors sufficient to make their concurrent utilization interesting, challenging, and, in the long run, beneficial. Unexpected consequences of parallelization have been discovered. It is possible to obtain reduced performance, both relative and absolute, from an increased number of processors, as a result of inappropriate use of resources in a multiprocessor environment. This exemplifies one of the paradoxes which result from our cultural bias towards sequential thought processes. As a consequence there is a bias for sequential styles of program development in a multiprocessor environment. The authors have learned that the problem of automatic optimization in compilation of parallel programs is computationally hard. Early hopes that automatic, optimal parallelization of sequentially conceived programs would be as achievable as earlier automatic vectorization had been, have been dashed. The authors lack the insights and folklore which are needed to develop useful methodologies and heuristics in the area of parallel computation. The authors are embarked on a voyage of exploration of this new territory, and the work described in this volume can provide helpful guidance. The authors have to explore fully the differences between distributed memory systems, shared memory systems, and combinations, as well as the relative applicability of SIMD and MIMD architectures. Based on the information obtained in such exploration, useful steps towards efficient utilization of many processors should become possible. This paper covers several areas: systems programming, parallel/language/programming systems, and applications programming.

  7. Parallel processing for nonlinear dynamics simulations of structures including rotating bladed-disk assemblies

    NASA Technical Reports Server (NTRS)

    Hsieh, Shang-Hsien

    1993-01-01

    The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.

  8. Parallel asynchronous systems and image processing algorithms

    NASA Technical Reports Server (NTRS)

    Coon, D. D.; Perera, A. G. U.

    1989-01-01

    A new hardware approach to implementation of image processing algorithms is described. The approach is based on silicon devices which would permit an independent analog processing channel to be dedicated to every pixel. A laminar architecture consisting of a stack of planar arrays of the device would form a two-dimensional array processor with a 2-D array of inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in neuronlike asynchronous pulse coded form through the laminar processor. Such systems would integrate image acquisition and image processing. Acquisition and processing would be performed concurrently as in natural vision systems. The research is aimed at implementation of algorithms, such as the intensity dependent summation algorithm and pyramid processing structures, which are motivated by the operation of natural vision systems. Implementation of natural vision algorithms would benefit from the use of neuronlike information coding and the laminar, 2-D parallel, vision system type architecture. Besides providing a neural network framework for implementation of natural vision algorithms, a 2-D parallel approach could eliminate the serial bottleneck of conventional processing systems. Conversion to serial format would occur only after raw intensity data has been substantially processed. An interesting challenge arises from the fact that the mathematical formulation of natural vision algorithms does not specify the means of implementation, so that hardware implementation poses intriguing questions involving vision science.

  9. A parallel Jacobson-Oksman optimization algorithm. [parallel processing (computers)

    NASA Technical Reports Server (NTRS)

    Straeter, T. A.; Markos, A. T.

    1975-01-01

    A gradient-dependent optimization technique which exploits the vector-streaming or parallel-computing capabilities of some modern computers is presented. The algorithm, derived by assuming that the function to be minimized is homogeneous, is a modification of the Jacobson-Oksman serial minimization method. In addition to describing the algorithm, conditions insuring the convergence of the iterates of the algorithm and the results of numerical experiments on a group of sample test functions are presented. The results of these experiments indicate that this algorithm will solve optimization problems in less computing time than conventional serial methods on machines having vector-streaming or parallel-computing capabilities.

  10. A parallel processing VLSI BAM engine.

    PubMed

    Hasan, S R; Siong, N K

    1997-01-01

    In this paper emerging parallel/distributed architectures are explored for the digital VLSI implementation of an adaptive bidirectional associative memory (BAM) neural network. A single instruction stream many data stream (SIMD)-based parallel processing architecture is developed for the adaptive BAM neural network, taking advantage of the inherent parallelism in BAM. This novel neural processor architecture is named the sliding feeder BAM array processor (SLiFBAM). The SLiFBAM processor can be viewed as a two-stroke neural processing engine. It has four operating modes: learn pattern, evaluate pattern, read weight, and write weight. Design of a SLiFBAM VLSI processor chip is also described. By using 2-μm scalable CMOS technology, a SLiFBAM processor chip with 4+4 neurons and eight modules of 256x5 bit local weight-storage SRAM was integrated on a 6.9x7.4 mm^2 prototype die. The system architecture is highly flexible and modular, enabling the construction of larger BAM networks of up to 252 neurons using multiple SLiFBAM chips.
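
    For readers unfamiliar with BAM, the sketch below shows a textbook bidirectional associative memory in software: Hebbian outer-product learning of bipolar pattern pairs, followed by alternating forward and backward recall passes. It is not the SLiFBAM design and says nothing about its adaptive learning rule or hardware; the function names and the tie-breaking rule in `sign` are assumptions of this sketch.

```python
def train(pairs):
    """Hebbian outer-product weights for bipolar (+1/-1) pattern pairs (x, y)."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    w = [[0] * m for _ in range(n)]
    for x, y in pairs:
        for i in range(n):
            for j in range(m):
                w[i][j] += x[i] * y[j]
    return w


def sign(value, previous):
    """Bipolar threshold; a zero net input keeps the previous state."""
    return 1 if value > 0 else -1 if value < 0 else previous


def forward(w, x, y_prev):
    """One pass from the x layer to the y layer."""
    return [
        sign(sum(x[i] * w[i][j] for i in range(len(x))), y_prev[j])
        for j in range(len(y_prev))
    ]


def backward(w, y, x_prev):
    """One pass from the y layer back to the x layer."""
    return [
        sign(sum(w[i][j] * y[j] for j in range(len(y))), x_prev[i])
        for i in range(len(x_prev))
    ]


def recall(w, x, max_steps=20):
    """Alternate forward and backward passes until both layers stop changing."""
    y = [1] * len(w[0])
    for _ in range(max_steps):
        new_y = forward(w, x, y)
        new_x = backward(w, new_y, x)
        if new_y == y and new_x == x:
            break
        x, y = new_x, new_y
    return x, y
```

    Every output neuron's weighted sum in `forward` and `backward` is independent of the others, which is the kind of inherent parallelism a SIMD array can exploit.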

  11. A multiarchitecture parallel-processing development environment

    NASA Technical Reports Server (NTRS)

    Townsend, Scott; Blech, Richard; Cole, Gary

    1993-01-01

    A description is given of the hardware and software of a multiprocessor test bed - the second generation Hypercluster system. The Hypercluster architecture consists of a standard hypercube distributed-memory topology, with multiprocessor shared-memory nodes. By using standard, off-the-shelf hardware, the system can be upgraded to use rapidly improving computer technology. The Hypercluster's multiarchitecture nature makes it suitable for researching parallel algorithms in computational field simulation applications (e.g., computational fluid dynamics). The dedicated test-bed environment of the Hypercluster and its custom-built software allows experiments with various parallel-processing concepts such as message passing algorithms, debugging tools, and computational 'steering'. Such research would be difficult, if not impossible, to achieve on shared, commercial systems.

  12. Oxytocin: parallel processing in the social brain?

    PubMed

    Dölen, Gül

    2015-06-01

    Early studies attempting to disentangle the network complexity of the brain exploited the accessibility of sensory receptive fields to reveal circuits made up of synapses connected both in series and in parallel. More recently, extension of this organisational principle beyond the sensory systems has been made possible by the advent of modern molecular, viral and optogenetic approaches. Here, evidence supporting parallel processing of social behaviours mediated by oxytocin is reviewed. Understanding oxytocinergic signalling from this perspective has significant implications for the design of oxytocin-based therapeutic interventions aimed at disorders such as autism, where disrupted social function is a core clinical feature. Moreover, identification of opportunities for novel technology development will require a better appreciation of the complexity of the circuit-level organisation of the social brain.

  13. Oxytocin: parallel processing in the social brain?

    PubMed

    Dölen, Gül

    2015-06-01

    Early studies attempting to disentangle the network complexity of the brain exploited the accessibility of sensory receptive fields to reveal circuits made up of synapses connected both in series and in parallel. More recently, extension of this organisational principle beyond the sensory systems has been made possible by the advent of modern molecular, viral and optogenetic approaches. Here, evidence supporting parallel processing of social behaviours mediated by oxytocin is reviewed. Understanding oxytocinergic signalling from this perspective has significant implications for the design of oxytocin-based therapeutic interventions aimed at disorders such as autism, where disrupted social function is a core clinical feature. Moreover, identification of opportunities for novel technology development will require a better appreciation of the complexity of the circuit-level organisation of the social brain. PMID:25912257

  14. Parallel processing for digital picture comparison

    NASA Technical Reports Server (NTRS)

    Cheng, H. D.; Kou, L. T.

    1987-01-01

    In picture processing an important problem is to identify two digital pictures of the same scene taken under different lighting conditions. This kind of problem can be found in remote sensing, satellite signal processing and the related areas. The identification can be done by transforming the gray levels so that the gray level histograms of the two pictures are closely matched. The transformation problem can be solved by using the packing method. Researchers propose a VLSI architecture consisting of m x n processing elements with extensive parallel and pipelining computation capabilities to speed up the transformation with the time complexity O(max(m,n)), where m and n are the numbers of the gray levels of the input picture and the reference picture respectively. If a uniprocessor and a dynamic programming algorithm are used, the time complexity will be O(m^3 x n). The algorithm partition problem, as an important issue in VLSI design, is discussed. Verification of the proposed architecture is also given.
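
    To make the idea of matching gray level histograms concrete, the sketch below uses the common cumulative-distribution approach: each input level is mapped to the first reference level whose cumulative frequency is at least as large. This is a swapped-in, simpler technique, not the packing method, the dynamic programming algorithm, or the VLSI architecture of the paper; the function names are assumptions of this sketch.

```python
def cdf(hist):
    """Cumulative relative frequency for each gray level of a histogram."""
    total = sum(hist)
    running, out = 0, []
    for count in hist:
        running += count
        out.append(running / total)
    return out


def match_levels(input_hist, reference_hist):
    """Map each of the m input levels to one of the n reference levels.

    Level i goes to the first reference level whose cumulative frequency
    reaches that of level i. Both CDFs are non-decreasing, so a single
    two-pointer sweep visits each level once.
    """
    src, ref = cdf(input_hist), cdf(reference_hist)
    mapping, j = [], 0
    for s in src:
        while j < len(ref) - 1 and ref[j] < s:
            j += 1
        mapping.append(j)
    return mapping
```

    For example, `match_levels([2, 2, 2, 2], [4, 4])` sends the two darker input levels to reference level 0 and the two brighter ones to level 1.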

  15. Parallelization strategy for large-scale vibronic coupling calculations.

    PubMed

    Rabidoux, Scott M; Eijkhout, Victor; Stanton, John F

    2014-12-26

    The vibronic coupling model of Köppel, Domcke, and Cederbaum is a powerful means to understand, predict, and analyze electronic spectra of molecules, especially those that exhibit phenomena that involve breakdown of the Born-Oppenheimer approximation. In this work, we describe a new parallel algorithm for carrying out such calculations. The algorithm is conceptually founded upon a "stencil" representation of the required computational steps, which motivates an efficient strategy for coarse-grained parallelization. The equations involved in the direct-CI type diagonalization of the model Hamiltonian are presented, the parallelization strategy is discussed in detail, and the method is illustrated by calculations involving direct-product basis sets with as many as 17 vibrational modes and 130 billion basis functions. PMID:25295469

  16. Cloud parallel processing of tandem mass spectrometry based proteomics data.

    PubMed

    Mohammed, Yassene; Mostovenko, Ekaterina; Henneman, Alex A; Marissen, Rob J; Deelder, André M; Palmblad, Magnus

    2012-10-01

    Data analysis in mass spectrometry based proteomics struggles to keep pace with the advances in instrumentation and the increasing rate of data acquisition. Analyzing this data involves multiple steps requiring diverse software, using different algorithms and data formats. Speed and performance of the mass spectral search engines are continuously improving, although not necessarily as needed to face the challenges of acquired big data. Improving and parallelizing the search algorithms is one possibility; data decomposition presents another, simpler strategy for introducing parallelism. We describe a general method for parallelizing identification of tandem mass spectra using data decomposition that keeps the search engine intact and wraps the parallelization around it. We introduce two algorithms for decomposing mzXML files and recomposing resulting pepXML files. This makes the approach applicable to different search engines, including those relying on sequence databases and those searching spectral libraries. We use cloud computing to deliver the computational power and scientific workflow engines to interface and automate the different processing steps. We show how to leverage these technologies to achieve faster data analysis in proteomics and present three scientific workflows for parallel database as well as spectral library search using our data decomposition programs, X!Tandem and SpectraST.
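
    The data decomposition strategy in this abstract (split the input, run an unmodified search engine on each part, merge the outputs) can be sketched as follows. This is a schematic under stated assumptions: spectra are represented by plain scan identifiers rather than mzXML content, `search_chunk` is a hypothetical stand-in for invoking a real search engine, and a local thread pool replaces the cloud and workflow engines used by the authors.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(spectra, n_chunks):
    """Split the list of spectra into n_chunks nearly equal contiguous parts."""
    size, extra = divmod(len(spectra), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(spectra[start:end])
        start = end
    return chunks


def search_chunk(chunk):
    """Hypothetical stand-in for one unmodified search-engine run on a chunk."""
    return [(scan_id, f"peptide-for-{scan_id}") for scan_id in chunk]


def recompose(partial_results):
    """Merge per-chunk results back into one list ordered by scan id."""
    merged = [hit for part in partial_results for hit in part]
    return sorted(merged)


def parallel_search(spectra, n_chunks=4):
    """Decompose, search every chunk concurrently, then recompose."""
    chunks = decompose(spectra, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        return recompose(pool.map(search_chunk, chunks))
```

    The search function never needs to know it is running in parallel, which is why the same wrapper can, in principle, be reused with different engines.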

  17. Prereaders' Story Processing Strategies.

    ERIC Educational Resources Information Center

    Harlin, Rebecca P.

    A study examined prereaders' story processing strategies by assessing their performance on tasks that tapped their ability to (1) use story grammar and role playing, (2) retell a wordless picture book, (3) read a predictable book, (4) retell an oral story, (5) sequence pictured story events, and (6) fingerpoint-read a nursery rhyme. Parent…

  18. Parallel digital signal processing architectures for image processing

    NASA Astrophysics Data System (ADS)

    Kshirsagar, Shirish P.; Hartley, David A.; Harvey, David M.; Hobson, Clifford A.

    1994-10-01

    This paper describes research into a high-speed image processing system that uses parallel digital signal processors to process electro-optic images. The objective of the system is to reduce the processing time of non-contact inspection problems, including industrial and medical applications. A single processor cannot deliver the processing power these applications require; hence, a MIMD system is designed and constructed to enable fast processing of electro-optic images. The Texas Instruments TMS320C40 digital signal processor is used because of its high-speed floating-point CPU and its support for parallel processing. A custom-designed VISION bus is provided to transfer images between processors. The system is being applied to solder joint inspection of high-technology printed circuit boards.

  19. A Parallel Processing Algorithm for Gravity Inversion

    NASA Astrophysics Data System (ADS)

    Frasheri, Neki; Bushati, Salvatore; Frasheri, Alfred

    2013-04-01

    The paper presents results of using MPI parallel processing for the 3D inversion of gravity anomalies. The work is done under the FP7 project HP-SEE (http://www.hp-see.eu/). The inversion of geophysical anomalies remains a challenge, and parallel processing can be a tool to achieve better results, "compensating" for the complexity of the ill-posed inversion problem with an increased volume of calculations. We considered gravity as the simplest case of physical fields and experimented with an algorithm based on the methodology known as CLEAN, developed by Högbom in 1974. The 3D geosection was discretized into finite cuboid elements and represented by a 3D array of nodes, while the ground surface where the anomaly is observed was represented as a 2D array of points. Starting from a geosection with mass density zero in all nodes, the algorithm iteratively selects the 3D node whose anomaly shape best approximates the observed anomaly in the least squares sense; the mass density in that node is modified by a prefixed density step and the related effect is subtracted from the observed anomaly; the process continues until some criterion is fulfilled. The theoretical complexity of the algorithm was evaluated on the basis of iterations and run-time for a geosection discretized at different scales. We considered the average number N of nodes along one edge of the 3D array. The number of iterations was evaluated as O(N^3), and the run-time as O(N^8). We used several different methods for identifying the 3D node whose effect offers the best least squares error in approximating the observed anomaly: unweighted least squares error for the whole 2D array of anomalous points; weighting the least squares error by the inverted value of the observed anomaly over each 3D node; and limiting the area of 2D anomalous points where least squares are calculated over shallow 3D nodes. By comparing results from the inversion of single body and two

  20. Enjoying Sad Music: Paradox or Parallel Processes?

    PubMed

    Schubert, Emery

    2016-01-01

    Enjoyment of negative emotions in music is seen by many as a paradox. This article argues that the paradox exists because it is difficult to view the process that generates enjoyment as being part of the same system that also generates the subjective negative feeling. Compensation theories explain the paradox as the compensation of a negative emotion by the concomitant presence of one or more positive emotions. But compensation brings us no closer to explaining the paradox because it does not explain how experiencing sadness itself is enjoyed. The solution proposed is that an emotion is determined by three critical processes, labeled motivational action tendency (MAT), subjective feeling (SF) and Appraisal. For many emotions the MAT and SF processes are coupled in valence. For example, happiness has positive MAT and positive SF, annoyance has negative MAT and negative SF. However, it is argued that in an aesthetic context, such as listening to music, emotion processes can become decoupled. The decoupling is controlled by the Appraisal process, which can assess if the context of the sadness is real-life (where coupling occurs) or aesthetic (where decoupling can occur). In an aesthetic context sadness retains its negative SF but the aversive, negative MAT is inhibited, leaving sadness to still be experienced as a negatively valenced emotion, while contributing to the overall positive MAT. Individual differences, mood and previous experiences mediate the degree to which the aversive aspects of MAT are inhibited according to this Parallel Processing Hypothesis (PPH). The reason for hesitancy in considering or testing PPH, as well as the preponderance of research on sadness at the exclusion of other negative emotions, are discussed. PMID:27445752

  2. Enjoying Sad Music: Paradox or Parallel Processes?

    PubMed Central

    Schubert, Emery

    2016-01-01

    Enjoyment of negative emotions in music is seen by many as a paradox. This article argues that the paradox exists because it is difficult to view the process that generates enjoyment as being part of the same system that also generates the subjective negative feeling. Compensation theories explain the paradox as the compensation of a negative emotion by the concomitant presence of one or more positive emotions. But compensation brings us no closer to explaining the paradox because it does not explain how experiencing sadness itself is enjoyed. The solution proposed is that an emotion is determined by three critical processes, labeled motivational action tendency (MAT), subjective feeling (SF) and Appraisal. For many emotions the MAT and SF processes are coupled in valence. For example, happiness has positive MAT and positive SF, annoyance has negative MAT and negative SF. However, it is argued that in an aesthetic context, such as listening to music, emotion processes can become decoupled. The decoupling is controlled by the Appraisal process, which can assess if the context of the sadness is real-life (where coupling occurs) or aesthetic (where decoupling can occur). In an aesthetic context sadness retains its negative SF but the aversive, negative MAT is inhibited, leaving sadness to still be experienced as a negatively valenced emotion, while contributing to the overall positive MAT. Individual differences, mood and previous experiences mediate the degree to which the aversive aspects of MAT are inhibited according to this Parallel Processing Hypothesis (PPH). The reason for hesitancy in considering or testing PPH, as well as the preponderance of research on sadness at the exclusion of other negative emotions, are discussed. PMID:27445752

  3. Serial Order: A Parallel Distributed Processing Approach.

    ERIC Educational Resources Information Center

    Jordan, Michael I.

    Human behavior shows a variety of serially ordered action sequences. This paper presents a theory of serial order which describes how sequences of actions might be learned and performed. In this theory, parallel interactions across time (coarticulation) and parallel interactions across space (dual-task interference) are viewed as two aspects of a…

  4. A data parallel strategy for aligning multiple biological sequences on multi-core computers.

    PubMed

    Zhu, Xiangyuan; Li, Kenli; Salah, Ahmad

    2013-05-01

    In this paper, we address the large-scale biological sequence alignment problem, which has an increasing demand in computational biology. We employ data parallelism paradigm that is suitable for handling large-scale processing on multi-core computers to achieve a high degree of parallelism. Using the data parallelism paradigm, we propose a general strategy which can be used to speed up any multiple sequence alignment method. We applied five different clustering algorithms in our strategy and implemented rigorous tests on an 8-core computer using four traditional benchmarks and artificially generated sequences. The results show that our multi-core-based implementations can achieve up to 151-fold improvements in execution time while losing 2.19% accuracy on average. The source code of the proposed strategy, together with the test sets used in our analysis, is available on request.

  5. An intelligent allocation algorithm for parallel processing

    NASA Technical Reports Server (NTRS)

    Carroll, Chester C.; Homaifar, Abdollah; Ananthram, Kishan G.

    1988-01-01

    The problem of allocating nodes of a program graph to processors in a parallel processing architecture is considered. The algorithm is based on critical path analysis, some allocation heuristics, and the execution granularity of nodes in a program graph. These factors, and the structure of the interprocessor communication network, influence the allocation. To achieve realistic estimates of the execution durations of allocations, the algorithm considers the fact that nodes in a program graph have to communicate through varying numbers of tokens. Coarse and fine granularities have been implemented, with interprocessor token-communication duration varying from zero up to values comparable to the execution durations of individual nodes. The effect of communication network structure on allocation is demonstrated by performing allocations for crossbar (non-blocking) and star (blocking) networks. The algorithm assumes the availability of as many processors as it needs for the optimal allocation of any program graph. Hence, the focus of allocation has been on varying token-communication durations rather than varying the number of processors. The algorithm always utilizes as many processors as necessary for the optimal allocation of any program graph, depending upon granularity and the characteristics of the interprocessor communication network.
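
    To illustrate why token-communication duration matters to a critical-path-based allocator, the sketch below computes the critical path length of a small program graph with and without a per-edge communication delay. It is an assumption-laden toy, not the algorithm of the report: the function name, the example graph, and the simplification that every edge pays the same delay (as if every node ran on its own processor) are all introduced here for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_length(durations, edges, comm_cost=0.0):
    """Longest finish time through a DAG of tasks.

    durations: dict mapping node -> execution duration
    edges:     iterable of (src, dst) dependencies
    comm_cost: delay charged on every edge (simplifying assumption)
    """
    preds = {node: set() for node in durations}
    for src, dst in edges:
        preds[dst].add(src)

    finish = {}
    # static_order() yields each node only after all of its predecessors.
    for node in TopologicalSorter(preds).static_order():
        start = max((finish[p] + comm_cost for p in preds[node]), default=0.0)
        finish[node] = start + durations[node]
    return max(finish.values())


if __name__ == "__main__":
    durations = {"a": 2, "b": 3, "c": 1, "d": 4}
    edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
    print(critical_path_length(durations, edges, comm_cost=0.0))
    print(critical_path_length(durations, edges, comm_cost=1.0))
```

    As the communication delay approaches the node durations, spreading nodes over more processors lengthens the critical path instead of shortening it, which is the trade-off between fine and coarse granularity that the abstract refers to.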

  6. Fault tolerant massively parallel processing architecture

    SciTech Connect

    Balasubramanian, V.; Banerjee, P.

    1987-08-01

    This paper presents two massively parallel processing architectures suitable for solving a wide variety of algorithms of divide-and-conquer type for problems such as the discrete Fourier transform, production systems, design automation, and others. The first architecture, called the Chain-structured Butterfly ARchitecture (CBAR), consists of a two-dimensional array of N = L · (log_2(L) + 1) processing elements (PEs) organized as L levels of log_2(L) + 1 stages, and which has the butterfly connection between PEs in consecutive stages with straight-through feedback between PEs in the last and first stages. This connection system has the desirable property of allowing thousands of PEs to be connected with O(N) connection cost, O(log_2(N / log_2 N)) communication paths, and a small number (= 4) of I/O ports per PE. However, this architecture is not fault tolerant. The authors, therefore, propose a second architecture, called the REconfigurable Chain-structured Butterfly ARchitecture (RECBAR), which is a modified version of the CBAR. The RECBAR possesses all the desirable features of the CBAR, with the number of I/O ports per PE increased to six, and uses O((log_2 N) / N) overhead in PEs and approximately 50% overhead in links to achieve single-level fault tolerance. Reliability improvements of the RECBAR over the CBAR are studied. This paper also presents a distributed diagnostic and structuring algorithm for the RECBAR that enables the architecture to detect faults and structure itself accordingly within 2 · log_2(L) + 1 time steps, thus making it a truly fault tolerant architecture.

  7. Dynamic Load Balancing Strategies for Parallel Reacting Flow Simulations

    NASA Astrophysics Data System (ADS)

    Pisciuneri, Patrick; Meneses, Esteban; Givi, Peyman

    2014-11-01

    Load balancing in parallel computing aims at distributing the work as evenly as possible among the processors. This is a critical issue in the performance of parallel, time-accurate flow simulators. The constraint of time accuracy requires that all processes finish their calculation for a given time step before any process can begin calculation of the next time step. Thus, an unevenly balanced compute load results in idle time for many processes in each iteration and therefore in increased walltimes. Two existing dynamic load balancing approaches are applied to the simplified case of a partially stirred reactor for methane combustion. The first is Zoltan, a parallel partitioning, load balancing, and data management library developed at the Sandia National Laboratories. The second is Charm++, a machine-independent parallel programming system developed at the University of Illinois at Urbana-Champaign. The performance of these two approaches is compared, and the prospects for their application to full 3D reacting flow solvers are assessed.
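
    The cost of imbalance under a per-time-step barrier can be made concrete with a small sketch. It does not use Zoltan or Charm++; `idle_fraction` and `greedy_balance` are hypothetical helpers, and the greedy "largest task to least-loaded processor" rule is a generic heuristic chosen here only for illustration.

```python
import heapq


def idle_fraction(loads):
    """Fraction of processor time spent waiting at the barrier in one time step.

    Every processor must wait for the slowest one, so the step takes max(loads).
    """
    wall = max(loads)
    return sum(wall - load for load in loads) / (wall * len(loads))


def greedy_balance(task_costs, n_procs):
    """Assign tasks, largest first, to whichever processor is currently least loaded."""
    heap = [(0.0, p) for p in range(n_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    loads = [0.0] * n_procs
    for cost in sorted(task_costs, reverse=True):
        load, p = heapq.heappop(heap)
        loads[p] = load + cost
        heapq.heappush(heap, (loads[p], p))
    return loads


if __name__ == "__main__":
    unbalanced = [30.0, 3.0, 3.0]
    print(idle_fraction(unbalanced))            # most processor time is spent waiting
    balanced = greedy_balance([9, 7, 6, 5, 4, 3, 2], n_procs=3)
    print(balanced, idle_fraction(balanced))
```

    A dynamic load balancer repeats this kind of reassignment as the per-cell cost changes during the run, trading the cost of migrating work against the idle time it removes.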

  8. Parallel Processing with Digital Signal Processing Hardware and Software

    NASA Technical Reports Server (NTRS)

    Swenson, Cory V.

    1995-01-01

    The assembling and testing of a parallel processing system is described which will allow a user to move a Digital Signal Processing (DSP) application from the design stage to the execution/analysis stage through the use of several software tools and hardware devices. The system will be used to demonstrate the feasibility of the Algorithm To Architecture Mapping Model (ATAMM) dataflow paradigm for static multiprocessor solutions of DSP applications. The individual components comprising the system are described followed by the installation procedure, research topics, and initial program development.

  9. Experience in highly parallel processing using DAP

    NASA Technical Reports Server (NTRS)

    Parkinson, D.

    1987-01-01

    Distributed Array Processors (DAP) have been in day to day use for ten years and a large amount of user experience has been gained. The profile of user applications is similar to that of the Massively Parallel Processor (MPP) working group. Experience has shown that contrary to expectations, highly parallel systems provide excellent performance on so-called dirty problems such as the physics part of meteorological codes. The reasons for this observation are discussed. The arguments against replacing bit processors with floating point processors are also discussed.

  10. Parallel Processing in Visual Search Asymmetry

    ERIC Educational Resources Information Center

    Dosher, Barbara Anne; Han, Songmei; Lu, Zhong-Lin

    2004-01-01

    The difficulty of visual search may depend on assignment of the same visual elements as targets and distractors (search asymmetry). Easy C-in-O searches and difficult O-in-C searches are often associated with parallel and serial search, respectively. Here, the time course of visual search was measured for both tasks with speed-accuracy methods. The…

  11. Parallel processing of ADS40 images on PC network

    NASA Astrophysics Data System (ADS)

    Qiu, Feng; Duan, Yansong; Zhang, Jianqing

    2009-10-01

    In this paper, we design a parallel processing system on economical hardware to optimize the photogrammetric processing of Leica ADS40 images, drawing on ideas and methods of parallel computing. We adopt the PCAM parallel computing principle to design and implement a test system for parallel processing of ADS40 images. The test system consists of common personal computers and a local gigabit network. It makes full use of network computing and storage resources at an economical and practical cost to process ADS40 images. Experiments show that it achieves a significant improvement in processing efficiency. Furthermore, the robustness and compatibility of this system are much higher than those of a stand-alone computer system because of the redundancy the network provides. In conclusion, a parallel processing system based on a PC network offers a much more efficient solution for ADS40 photogrammetric production.

  12. Parallel versus Sequential Processing of Pictures and Words

    ERIC Educational Resources Information Center

    Snodgrass, Joan Gay; Antone, George

    1974-01-01

    The purpose of this experiment was to test a proposal by Paivio (1971) that visual memory images are specialized for parallel or spatial processing, whereas verbal memory codes are specialized for sequential or temporal processing. (Author)

  13. Hypercluster - Parallel processing for computational mechanics

    NASA Technical Reports Server (NTRS)

    Blech, Richard A.

    1988-01-01

    An account is given of the development status, performance capabilities and implications for further development of NASA-Lewis' testbed 'hypercluster' parallel computer network, in which multiple processors communicate through a shared memory. Processors have local as well as shared memory; the hypercluster is expanded in the same manner as the hypercube, with processor clusters replacing the normal single processor node. The NASA-Lewis machine has three nodes with a vector personality and one node with a scalar personality. Each of the vector nodes uses four board-level vector processors, while the scalar node uses four general-purpose microcomputer boards.

  14. Bipartite memory network architectures for parallel processing

    SciTech Connect

    Smith, W.; Kale, L.V. (Dept. of Computer Science)

    1990-01-01

    Parallel architectures are broadly classified as either shared memory or distributed memory architectures. In this paper, the authors propose a third family of architectures, called bipartite memory network architectures. In this architecture, processors and memory modules constitute a bipartite graph, where each processor is allowed to access a small subset of the memory modules, and each memory module allows access from a small set of processors. The architecture is particularly suitable for computations requiring dynamic load balancing. The authors explore the properties of this architecture by examining the Perfect Difference set based topology for the graph. Extensions of this topology are also suggested.
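
    One way to picture a perfect-difference-set topology is sketched below. The wiring rule (processor p is connected to memory modules (p + d) mod n for each d in the set) is an assumption made here for illustration and may differ from the construction in the paper; the function names are likewise hypothetical. The set {0, 1, 3} modulo 7 is a standard example of a perfect difference set.

```python
from itertools import combinations


def is_perfect_difference_set(ds, n):
    """True if every non-zero residue mod n arises exactly once as a difference a - b."""
    diffs = sorted((a - b) % n for a in ds for b in ds if a != b)
    return diffs == list(range(1, n))


def bipartite_topology(ds, n):
    """Map each processor p to the set of memory modules it may access (assumed wiring)."""
    return {p: {(p + d) % n for d in ds} for p in range(n)}


if __name__ == "__main__":
    ds, n = (0, 1, 3), 7
    print(is_perfect_difference_set(ds, n))
    topology = bipartite_topology(ds, n)
    for p, modules in topology.items():
        print(p, sorted(modules))
    # Under this wiring, any two processors have exactly one module in common.
    print(all(len(topology[p] & topology[q]) == 1
              for p, q in combinations(range(n), 2)))
```

    Under this wiring every pair of processors shares exactly one memory module while each processor touches only three of the seven modules, so work can be handed from any processor to any other through a single shared module.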

  15. Parafrase restructuring of FORTRAN code for parallel processing

    NASA Technical Reports Server (NTRS)

    Wadhwa, Atul

    1988-01-01

    Parafrase transforms a FORTRAN code, subroutine by subroutine, into a parallel code for a vector and/or shared-memory multiprocessor system. Parafrase is not a compiler; it transforms a code and provides information for a vector or concurrent process. Parafrase uses a data dependency to reveal parallelism among instructions. The data dependency test distinguishes between recurrences and statements that can be directly vectorized or parallelized. A number of transformations are required to build a data dependency graph.

  16. Parallel firing strategy on Petri nets: A review

    NASA Astrophysics Data System (ADS)

    Mavlankulov, Gairatzhan; Turaev, Sherzod; Zhumabaeva, Laula; Zhukabayeva, Tamara

    2015-05-01

    In this paper we review recent results on Petri net controlled grammars and closely related topics. Though regulated grammars are one of the classic topics in formal language theory, Petri net controlled grammars remain an interesting subject of investigation for many reasons. This type of grammar can successfully be used in modeling new problems emerging in manufacturing systems, systems biology and other areas. Moreover, the ease of graphical illustration, the ability to represent both a grammar and its control in one structure, and the possibility of unifying different regulated rewritings make this formalism attractive for study. We also summarize the obtained results and propose a new concept, a parallel firing strategy on Petri nets.

  17. Strategy Process in Higher Education

    ERIC Educational Resources Information Center

    Kettunen, Juha

    2010-01-01

    Higher education institutions educate those who are the most talented and best able to secure the future for the next generation. This study examines an efficient strategy process in higher education and emphasises the importance of sufficient dialogue during the process. The study describes the strategy process of the Turku University of Applied…

  18. Parallel processing research in the former Soviet Union

    SciTech Connect

    Dongarra, J.J.; Snyder, L.; Wolcott, P.

    1992-03-01

    This technical assessment report examines strengths and weaknesses of parallel processing research and development in the Soviet Union from the 1980s to June 1991. The assessment was carried out by a panel of US scientists who are experts on parallel processing hardware, software, algorithms, and applications, and on Soviet computing. Soviet computer research and development organizations have pursued many of the major avenues of inquiry related to parallel processing that the West has chosen to explore. But the limited size and substantial breadth of their effort have limited the collective depth of Soviet activity. Even more serious limitations (and delays) of Soviet achievement in parallel processing research can be traced to shortcomings of the Soviet computer industry, which was unable to supply adequate, reliable computer components. Without the ability to build, demonstrate, and test embodiments of their ideas in actual high-performance parallel hardware, both the scope of activity and the success of Soviet parallel processing researchers were severely limited. The quality of the Soviet parallel processing research assessed varied from very sound and interesting to pedestrian, with most of the groups at the major hardware and software centers to which the work is largely confined doing good (or at least serious) research. In a few instances, interesting and competent parallel language development work was found at institutions not associated with hardware development efforts. Unlike Soviet mainframe and minicomputer developers, Soviet parallel processing researchers have not concentrated their efforts on reverse-engineering specific Western systems. No evidence was found of successful Soviet attempts to use breakthroughs in parallel processing technology to "leapfrog" impediments and limitations that Soviet industrial weakness in microelectronics and other computer manufacturing areas impose on the performance of high-end Soviet computers.

  20. [CMACPAR a modified parallel neuro-controller for control processes].

    PubMed

    Ramos, E; Surós, R

    1999-01-01

    CMACPAR is a parallel neurocontroller oriented to real-time systems such as control processes. Its main characteristics are a fast learning algorithm, a reduced number of calculations, great generalization capacity, local learning and intrinsic parallelism. This type of neurocontroller is used in real-time applications required by refineries, hydroelectric centers, factories, etc. In this work we present the analysis and the parallel implementation of a modified scheme of the Cerebellar Model CMAC for the n-dimensional space projection using a medium-granularity parallel neurocontroller. The proposed memory management allows for a significant reduction in training time and required memory size.

  1. Applying Parallel Processing Techniques to Tether Dynamics Simulation

    NASA Technical Reports Server (NTRS)

    Wells, B. Earl

    1996-01-01

    The focus of this research has been to determine the effectiveness of applying parallel processing techniques to a sizable real-world problem, the simulation of the dynamics associated with a tether which connects two objects in low earth orbit, and to explore the degree to which the parallelization process can be automated through the creation of new software tools. The goal has been to utilize this specific application problem as a base to develop more generally applicable techniques.

  2. Hybrid interconnection structures for real-time parallel processing

    NASA Technical Reports Server (NTRS)

    Kim, K. H.; Samson, John R., Jr.

    1989-01-01

    The use of hybrid interconnection structures that combine link connections and bus connections for real-time parallel processing is discussed. Idealistic parallel computation models for two real-time computing applications are described with attention given to a tightly coupled network model for object tracking and a network model for image processing. Consideration is given to the following different interconnection structures: the crossbar, the hypercube, the circular linked array, and the bus array.

  3. CRBLASTER: A Parallel-Processing Computational Framework for Embarrassingly Parallel Image-Analysis Algorithms

    NASA Astrophysics Data System (ADS)

    Mighell, Kenneth John

    2010-10-01

    The development of parallel-processing image-analysis codes is generally a challenging task that requires complicated choreography of interprocessor communications. If, however, the image-analysis algorithm is embarrassingly parallel, then the development of a parallel-processing implementation of that algorithm can be a much easier task to accomplish because, by definition, there is little need for communication between the compute processes. I describe the design, implementation, and performance of a parallel-processing image-analysis application, called crblaster, which does cosmic-ray rejection of CCD images using the embarrassingly parallel l.a.cosmic algorithm. crblaster is written in C using the high-performance computing industry standard Message Passing Interface (MPI) library. crblaster uses a two-dimensional image partitioning algorithm that partitions an input image into N rectangular subimages of nearly equal area; the subimages include sufficient additional pixels along common image partition edges such that the need for communication between computer processes is eliminated. The code has been designed to be used by research scientists who are familiar with C as a parallel-processing computational framework that enables the easy development of parallel-processing image-analysis programs based on embarrassingly parallel algorithms. The crblaster source code is freely available at the official application Web site at the National Optical Astronomy Observatory. Removing cosmic rays from a single 800 × 800 pixel Hubble Space Telescope WFPC2 image takes 44 s with the IRAF script lacos_im.cl running on a single core of an Apple Mac Pro computer with two 2.8 GHz quad-core Intel Xeon processors. crblaster is 7.4 times faster when processing the same image on a single core on the same machine. Processing the same image with crblaster simultaneously on all eight cores of the same machine takes 0.875 s—which is a speedup factor of 50.3 times faster than the
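
    The idea of padding each subimage with extra pixels so that no interprocess communication is needed can be sketched as follows. This is not crblaster code: it is a pure-Python toy that splits an image into horizontal strips rather than the two-dimensional rectangles described in the abstract, the filter is an arbitrary neighbourhood maximum rather than the cosmic-ray rejection algorithm, and all function names are hypothetical.

```python
def strip_bounds(n_rows, n_parts, halo):
    """Return (core_start, core_end, read_start, read_end) for each strip.

    The core rows are the ones a worker is responsible for; the read range
    adds `halo` extra rows on each side, clipped to the image.
    """
    size, extra = divmod(n_rows, n_parts)
    bounds, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        bounds.append((start, end, max(0, start - halo), min(n_rows, end + halo)))
        start = end
    return bounds


def local_max_filter(rows):
    """Toy filter: each pixel becomes the max of itself and its vertical neighbours."""
    n = len(rows)
    return [
        [max(rows[k][c] for k in range(max(0, r - 1), min(n, r + 2)))
         for c in range(len(rows[r]))]
        for r in range(n)
    ]


def filter_by_strips(image, n_parts, halo=1):
    """Filter each padded strip independently, then keep only its core rows."""
    out = []
    for core_start, core_end, read_start, read_end in strip_bounds(len(image), n_parts, halo):
        filtered = local_max_filter(image[read_start:read_end])
        out.extend(filtered[core_start - read_start:core_end - read_start])
    return out


if __name__ == "__main__":
    image = [[(7 * r + 3 * c) % 11 for c in range(4)] for r in range(9)]
    print(filter_by_strips(image, n_parts=3) == local_max_filter(image))
```

    Each strip can be handed to a separate process because the halo already contains every neighbouring pixel the filter reads; the halo width must be at least the filter radius for the stitched result to match the single-process result.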

  4. Repartitioning Strategies for Massively Parallel Simulation of Reacting Flow

    NASA Astrophysics Data System (ADS)

    Pisciuneri, Patrick; Zheng, Angen; Givi, Peyman; Labrinidis, Alexandros; Chrysanthis, Panos

    2015-11-01

    The majority of parallel CFD simulators partition the domain into equal regions and assign the calculations for a particular region to a unique processor. This type of domain decomposition is vital to the efficiency of the solver. However, as the simulation develops, the workload among the partitions often becomes uneven (e.g. through adaptive mesh refinement, or chemically reacting regions) and a new partition should be considered. The process of repartitioning adjusts the current partition to evenly distribute the load again. We compare two repartitioning tools: Zoltan, an architecture-agnostic graph repartitioner developed at the Sandia National Laboratories; and Paragon, an architecture-aware graph repartitioner developed at the University of Pittsburgh. The comparative assessment is conducted via simulation of the Taylor-Green vortex flow with chemical reaction.

  5. Parallel Signal Processing and System Simulation using aCe

    NASA Technical Reports Server (NTRS)

    Dorband, John E.; Aburdene, Maurice F.

    2003-01-01

    Recently, networked and cluster computation have become very popular for both signal processing and system simulation. A new language is ideally suited for parallel signal processing applications and system simulation since it allows the programmer to explicitly express the computations that can be performed concurrently. In addition, the new C based parallel language (aCe C) for architecture-adaptive programming allows programmers to implement algorithms and system simulation applications on parallel architectures by providing them with the assurance that future parallel architectures will be able to run their applications with a minimum of modification. In this paper, we will focus on some fundamental features of aCe C and present a signal processing application (FFT).

  6. Processing data communications events by awakening threads in parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2016-03-15

    Processing data communications events in a parallel active messaging interface ('PAMI') of a parallel computer that includes compute nodes that execute a parallel application, with the PAMI including data communications endpoints, and the endpoints are coupled for data communications through the PAMI and through other data communications resources, including determining by an advance function that there are no actionable data communications events pending for its context, placing by the advance function its thread of execution into a wait state, waiting for a subsequent data communications event for the context; responsive to occurrence of a subsequent data communications event for the context, awakening by the thread from the wait state; and processing by the advance function the subsequent data communications event now pending for the context.

  7. Parallel processing for pitch splitting decomposition

    NASA Astrophysics Data System (ADS)

    Barnes, Levi; Li, Yong; Wadkins, David; Biederman, Steve; Miloslavsky, Alex; Cork, Chris

    2009-10-01

    Decomposition of an input pattern in preparation for a double patterning process is an inherently global problem in which the influence of a local decomposition decision can be felt across an entire pattern. In spite of this, a large portion of the work can be massively distributed. Here, we discuss the advantages of geometric distribution for polygon operations with limited range of influence. Further, we have found that even the naturally global "coloring" step can, in large part, be handled in a geometrically local manner. In some practical cases, up to 70% of the work can be distributed geometrically. We also describe the methods for partitioning the problem into local pieces and present scaling data up to 100 CPUs. These techniques reduce DPT decomposition runtime by orders of magnitude.

  8. FPGA-Based Filterbank Implementation for Parallel Digital Signal Processing

    NASA Technical Reports Server (NTRS)

    Berner, Stephan; DeLeon, Phillip

    1999-01-01

    One approach to parallel digital signal processing decomposes a high bandwidth signal into multiple lower bandwidth (rate) signals by an analysis bank. After processing, the subband signals are recombined into a fullband output signal by a synthesis bank. This paper describes an implementation of the analysis and synthesis banks using Field Programmable Gate Arrays (FPGAs).
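
    The analysis/synthesis idea can be illustrated with a minimal software sketch. The two-channel Haar filter pair below is an assumption chosen for brevity, not the FPGA design the paper describes; the function names `analysis_bank` and `synthesis_bank` are hypothetical.

```python
# Hypothetical two-channel Haar filter bank (illustrative only).

def analysis_bank(signal):
    """Split an even-length signal into low- and high-band streams at half the rate."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def synthesis_bank(low, high):
    """Recombine the two half-rate subbands into the full-rate signal."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


x = [4.0, 2.0, 6.0, 8.0, 1.0, 3.0]
low, high = analysis_bank(x)      # each subband runs at half the input rate
# Independent subband processing would happen here, one stream per processing unit.
y = synthesis_bank(low, high)     # reconstructs x when the subbands are unmodified
```

    Each subband carries half as many samples as the input, so the two streams can be handled by slower, independent processing units before recombination.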

  9. High-speed parallel-processing networks for advanced architectures

    SciTech Connect

    Morgan, D.R.

    1988-06-01

    This paper describes various parallel-processing architecture networks that are candidates for eventual airborne use. An attempt at projecting which type of network is suitable or optimum for specific metafunction or stand-alone applications is made. However, specific algorithms will need to be developed and benchmarks executed before firm conclusions can be drawn. Also, a conceptual projection of how these processors can be built in small, flyable units through the use of wafer-scale integration is offered. The use of the PAVE PILLAR system architecture to provide system level support for these tightly coupled networks is described. The author concludes that: (1) extremely high processing speeds implemented in flyable hardware are possible through parallel-processing networks if development programs are pursued; (2) dramatic speed enhancements through parallel processing require an excellent match between the algorithm and computer-network architecture; (3) matching several high speed parallel oriented algorithms across the aircraft system to a limited set of hardware modules may be the most cost-effective approach to achieving speed enhancements; and (4) software-development tools and improved operating systems will need to be developed to support efficient parallel-processor use.

  10. Mapping Pixel Windows To Vectors For Parallel Processing

    NASA Technical Reports Server (NTRS)

    Duong, Tuan A.

    1996-01-01

    Mapping performed by matrices of transistor switches. Arrays of transistor switches devised for use in forming simultaneous connections from square subarray (window) of n x n pixels within electronic imaging device containing np x np array of pixels to linear array of n^2 input terminals of electronic neural network or other parallel-processing circuit. Method helps to realize potential for rapidity in parallel processing for such applications as enhancement of images and recognition of patterns. In providing simultaneous connections, overcomes timing bottleneck of older multiplexing, serial-switching, and sample-and-hold methods.
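
    The window-to-vector mapping itself is simple to state in software. The sketch below assumes a row-by-row ordering of the n^2 outputs, which the abstract does not specify, and the helper name `window_to_vector` is hypothetical. The hardware forms all n^2 connections simultaneously; this code only shows which pixel goes to which terminal.

```python
def window_to_vector(image, row, col, n):
    """Flatten the n x n window whose top-left pixel is (row, col), row by row."""
    if row < 0 or col < 0 or row + n > len(image) or col + n > len(image[0]):
        raise IndexError("window falls outside the image")
    return [image[r][c] for r in range(row, row + n) for c in range(col, col + n)]


# A 4 x 4 test image whose pixel value encodes its position as 10*row + col.
image = [[10 * r + c for c in range(4)] for r in range(4)]
vec = window_to_vector(image, 1, 1, 2)   # 2 x 2 window -> vector of n^2 = 4 values
```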

  11. Active Storage Processing in a Parallel File System

    SciTech Connect

    Felix, Evan J.; Fox, Kevin M.; Regimbal, Kevin M.; Nieplocha, Jarek

    2006-01-01

    By creating a processing system within a parallel file system, one can harness the unused processing power on servers that have very fast access to the disks they are serving. By inserting a module into the Lustre file system, the Active Storage Concept is able to perform processing with the file system architecture. Results of using this technology are presented as the results of the Supercomputing StorCloud Challenge Application are reviewed.

  12. Parallel processing of atmospheric chemistry calculations: Preliminary considerations

    SciTech Connect

    Elliott, S.; Jones, P.

    1995-01-01

    Global climate calculations are already saturating the class of modern vector supercomputers with only a few central processing units. Increased resolution and inclusion of routines to deal with biogeochemical portions of the terrestrial climate system will soon demand massively parallel approaches. The atmospheric photochemistry ensemble is intimately linked to climate through the trace greenhouse gases ozone and methane and modules for representing it are being attached to global three dimensional transport and GCM frameworks. Atmospheric kinetics involve dozens of highly interactive tracers and so will accentuate the need for parallel processing of earth system simulations. In the present text we lay some of the groundwork for addition of atmospheric kinetics packages to GCM and global scale atmospheric models on multiply parallel computers. The discussion is tailored for consumption by the photochemical modelling community. After a review of numerical atmospheric chemistry methods, we examine how kinetics can be implemented on a parallel computer. We concentrate especially on data layout and flexibility and how these can be implemented in various programming models. We conclude that chemistry can be implemented rather easily within existing frameworks of several parallel atmospheric models. However, memory limitations may preclude high resolution studies of global chemistry.

  13. Real-Time Reconfigurable Interconnections for Parallel Optical Processing

    NASA Astrophysics Data System (ADS)

    McArdle, Neil; Taghizadeh, Mohammad R.

    1995-06-01

    In this letter we describe the advantages of a dynamic optical interconnection system for parallel information processing applications. The system is based on a liquid crystal television which acts as a binary phase-only spatial light modulator. We describe example algorithms where reconfigurable interconnects would be useful and present results of several interconnection topologies which have been implemented.

  14. Using Motivational Interviewing Techniques to Address Parallel Process in Supervision

    ERIC Educational Resources Information Center

    Giordano, Amanda; Clarke, Philip; Borders, L. DiAnne

    2013-01-01

    Supervision offers a distinct opportunity to experience the interconnection of counselor-client and counselor-supervisor interactions. One product of this network of interactions is parallel process, a phenomenon by which counselors unconsciously identify with their clients and subsequently present to their supervisors in a similar fashion…

  15. An Image Database on a Parallel Processing Network.

    ERIC Educational Resources Information Center

    Philip, G.; And Others

    1991-01-01

    Describes the design and development of an image database for photographs in the Ulster Museum (Northern Ireland) that used parallelism from a transputer network. Topics addressed include image processing techniques; documentation needed for the photographs, including indexing, classifying, and cataloging; problems; hardware and software aspects;…

  16. Parallel Processing of Objects in a Naming Task

    ERIC Educational Resources Information Center

    Meyer, Antje S.; Ouellet, Marc; Hacker, Christine

    2008-01-01

    The authors investigated whether speakers who named several objects processed them sequentially or in parallel. Speakers named object triplets, arranged in a triangle, in the order left, right, and bottom object. The left object was easy or difficult to identify and name. During the saccade from the left to the right object, the right object shown…

  17. Parallel Alternate Curriculum--A Mainstreaming Implementation Program at the Secondary Level: Alternative Teaching Strategies Combined with Basic Skills.

    ERIC Educational Resources Information Center

    Smith, Gayle

    The Parallel Alternate Curriculum (PAC), a model providing regular content courses, in regular classes, to secondary students with learning problems, combines basic skill instruction with alternative teaching strategies. PAC, a mainstreaming implementation program, is designed to provide inservice training emphasizing the process of how students…

  18. Automating the parallel processing of fluid and structural dynamics calculations

    NASA Technical Reports Server (NTRS)

    Arpasi, Dale J.; Cole, Gary L.

    1987-01-01

    The NASA Lewis Research Center is actively involved in the development of expert system technology to assist users in applying parallel processing to computational fluid and structural dynamic analysis. The goal of this effort is to eliminate the necessity for the physical scientist to become a computer scientist in order to effectively use the computer as a research tool. Programming and operating software utilities have previously been developed to solve systems of ordinary nonlinear differential equations on parallel scalar processors. Current efforts are aimed at extending these capabilities to systems of partial differential equations, that describe the complex behavior of fluids and structures within aerospace propulsion systems. This paper presents some important considerations in the redesign, in particular, the need for algorithms and software utilities that can automatically identify data flow patterns in the application program and partition and allocate calculations to the parallel processors. A library-oriented multiprocessing concept for integrating the hardware and software functions is described.

  20. Parallelizing the Cellular Potts Model on graphics processing units

    NASA Astrophysics Data System (ADS)

    Tapia, José Juan; D'Souza, Roshan M.

    2011-04-01

    The Cellular Potts Model (CPM) is a lattice based modeling technique used for simulating cellular structures in computational biology. The computational complexity of the model means that current serial implementations restrict the size of simulation to a level well below biological relevance. Parallelization on computing clusters enables scaling the size of the simulation but marginally addresses computational speed due to the limited memory bandwidth between nodes. In this paper we present new data-parallel algorithms and data structures for simulating the Cellular Potts Model on graphics processing units. Our implementations handle most terms in the Hamiltonian, including cell-cell adhesion constraint, cell volume constraint, cell surface area constraint, and cell haptotaxis. We use fine level checkerboards with lock mechanisms using atomic operations to enable consistent updates while maintaining a high level of parallelism. A new data-parallel memory allocation algorithm has been developed to handle cell division. Tests show that our implementation enables simulations of >10 cells with lattice sizes of up to 256^3 on a single graphics card. Benchmarks show that our implementation runs ~80× faster than serial implementations, and ~5× faster than previous parallel implementations on computing clusters consisting of 25 nodes. The wide availability and economy of graphics cards mean that our techniques will enable simulation of realistically sized models at a fraction of the time and cost of previous implementations and are expected to greatly broaden the scope of CPM applications.
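
    The checkerboard idea can be shown with a small sketch. It assumes a 2D lattice and a four-neighbor (edge) neighborhood, which is simpler than the neighborhoods and lock mechanisms the paper uses; all function names are hypothetical. Sites of one color share no edge, so under this assumption they can be updated concurrently without touching each other's neighbors.

```python
def checkerboard_color(x, y):
    """Color 0 or 1 such that edge-adjacent sites always differ in color."""
    return (x + y) % 2


def sites_of_color(width, height, color):
    """All lattice sites with the given checkerboard color."""
    return [(x, y) for y in range(height) for x in range(width)
            if checkerboard_color(x, y) == color]


def edge_neighbors(x, y, width, height):
    """The up-to-four sites sharing an edge with (x, y), clipped to the lattice."""
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(i, j) for i, j in candidates if 0 <= i < width and 0 <= j < height]


black = set(sites_of_color(4, 4, 0))
# No site in the batch has a neighbor in the same batch, so the batch is conflict-free.
conflict_free = all(n not in black
                    for (x, y) in black
                    for n in edge_neighbors(x, y, 4, 4))
```

    A full sweep would process the color-0 batch, synchronize, and then process the color-1 batch.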

  1. Parallel-Processing Software for Correlating Stereo Images

    NASA Technical Reports Server (NTRS)

    Klimeck, Gerhard; Deen, Robert; Mcauley, Michael; DeJong, Eric

    2007-01-01

    A computer program implements parallel-processing algorithms for correlating images of terrain acquired by stereoscopic pairs of digital stereo cameras on an exploratory robotic vehicle (e.g., a Mars rover). Such correlations are used to create three-dimensional computational models of the terrain for navigation. In this program, the scene viewed by the cameras is segmented into subimages. Each subimage is assigned to one of a number of central processing units (CPUs) operating simultaneously.

  2. Parallel Note-Taking: A Strategy for Effective Use of Webnotes

    ERIC Educational Resources Information Center

    Pardini, Eleanor A.; Domizi, Denise P.; Forbes, Daniel A.; Pettis, Gretchen V.

    2005-01-01

    Many instructors supply online lecture notes but little attention has been given to how students can make the best use of this resource. Based on observations of student difficulties with these notes, a strategy called parallel note-taking was developed for using online notes. The strategy is a hybrid of research-proven strategies for effective…

  3. Parallel Processing of Broad-Band PPM Signals

    NASA Technical Reports Server (NTRS)

    Gray, Andrew; Kang, Edward; Lay, Norman; Vilnrotter, Victor; Srinivasan, Meera; Lee, Clement

    2010-01-01

    A parallel-processing algorithm and a hardware architecture to implement the algorithm have been devised for timeslot synchronization in the reception of pulse-position-modulated (PPM) optical or radio signals. As in the cases of some prior algorithms and architectures for parallel, discrete-time, digital processing of signals other than PPM, an incoming broadband signal is divided into multiple parallel narrower-band signals by means of sub-sampling and filtering. The number of parallel streams is chosen so that the frequency content of the narrower-band signals is low enough to enable processing by relatively-low speed complementary metal oxide semiconductor (CMOS) electronic circuitry. The algorithm and architecture are intended to satisfy requirements for time-varying time-slot synchronization and post-detection filtering, with correction of timing errors independent of estimation of timing errors. They are also intended to afford flexibility for dynamic reconfiguration and upgrading. The architecture is implemented in a reconfigurable CMOS processor in the form of a field-programmable gate array. The algorithm and its hardware implementation incorporate three separate time-varying filter banks for three distinct functions: correction of sub-sample timing errors, post-detection filtering, and post-detection estimation of timing errors. The design of the filter bank for correction of timing errors, the method of estimating timing errors, and the design of a feedback-loop filter are governed by a host of parameters, the most critical one, with regard to processing very broadband signals with CMOS hardware, being the number of parallel streams (equivalently, the rate-reduction parameter).

  4. Completion Probabilities and Parallel Restart Strategies under an Imposed Deadline

    PubMed Central

    Lorenz, Jan-Hendrik

    2016-01-01

    Let A be any fixed cut-off restart algorithm running in parallel on multiple processors. If the algorithm is only allowed to run for up to time D, then it is no longer guaranteed that a result can be found. In this case, the probability of finding a solution within the time D becomes a measure for the quality of the algorithm. In this paper we address this issue and provide upper and lower bounds for the probability of A finding a solution before a deadline passes under varying assumptions. We also show that the optimal restart times for a fixed cut-off algorithm running in parallel are identical to the optimal restart times for the algorithm running on a single processor. Finally, we conclude that the odds of finding a solution scale superlinearly in the number of processors. PMID:27732631
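
    The superlinear scaling of the odds can be checked numerically under a deliberately simple model. The sketch assumes that each processor independently finds a solution before the deadline with the same probability p; this independence model and the function names are assumptions for illustration, not the bounds derived in the paper.

```python
def success_probability(p_single, processors):
    """Probability that at least one of the independent processors succeeds."""
    return 1.0 - (1.0 - p_single) ** processors


def odds(probability):
    """Odds in favor of success: P / (1 - P)."""
    return probability / (1.0 - probability)


p = 0.1
for k in (1, 2, 4, 8):
    ratio = odds(success_probability(p, k)) / odds(p)
    # Under this model the ratio exceeds k for every k >= 2,
    # i.e. the odds grow faster than linearly.
    print(k, round(ratio, 3))
```

    In this model the odds with k processors equal (1 + o)^k - 1, where o is the single-processor odds, which exceeds k·o for k ≥ 2 by Bernoulli's inequality.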

  5. Highly scalable parallel processing of extracellular recordings of Multielectrode Arrays.

    PubMed

    Gehring, Tiago V; Vasilaki, Eleni; Giugliano, Michele

    2015-01-01

    Technological advances of Multielectrode Arrays (MEAs) used for multisite, parallel electrophysiological recordings lead to an ever-increasing amount of raw data being generated. Arrays with hundreds up to a few thousands of electrodes are slowly seeing widespread use and the expectation is that more sophisticated arrays will become available in the near future. In order to process the large data volumes resulting from MEA recordings there is a pressing need for new software tools able to process many data channels in parallel. Here we present a new tool for processing MEA data recordings that makes use of new programming paradigms and recent technology developments to unleash the power of modern highly parallel hardware, such as multi-core CPUs with vector instruction sets or GPGPUs. Our tool builds on and complements existing MEA data analysis packages. It shows high scalability and can be used to speed up some performance critical pre-processing steps such as data filtering and spike detection, helping to make the analysis of larger data sets tractable. PMID:26737215

  7. Semi-automatic process partitioning for parallel computation

    NASA Technical Reports Server (NTRS)

    Koelbel, Charles; Mehrotra, Piyush; Vanrosendale, John

    1988-01-01

    On current multiprocessor architectures one must carefully distribute data in memory in order to achieve high performance. Process partitioning is the operation of rewriting an algorithm as a collection of tasks, each operating primarily on its own portion of the data, to carry out the computation in parallel. A semi-automatic approach to process partitioning is considered in which the compiler, guided by advice from the user, automatically transforms programs into such an interacting task system. This approach is illustrated with a picture processing example written in BLAZE, which is transformed into a task system maximizing locality of memory reference.

  8. A dataflow analysis tool for parallel processing of algorithms

    NASA Technical Reports Server (NTRS)

    Jones, Robert L., III

    1993-01-01

    A graph-theoretic design process and software tool are presented for selecting a multiprocessing scheduling solution for a class of computational problems. The problems of interest are those that can be described using a dataflow graph and are intended to be executed repetitively on a set of identical parallel processors. Typical applications include signal processing and control law problems. Graph analysis techniques are introduced and shown to effectively determine performance bounds, scheduling constraints, and resource requirements. The software tool is shown to facilitate the application of the design process to a given problem.

  9. Digital intermediate frequency QAM modulator using parallel processing

    DOEpatents

    Pao, Hsueh-Yuan; Tran, Binh-Nien

    2008-05-27

    The digital Intermediate Frequency (IF) modulator applies to various modulation types and offers a simple and low cost method to implement a high-speed digital IF modulator using field programmable gate arrays (FPGAs). The architecture eliminates multipliers and sequential processing by storing the pre-computed modulated cosine and sine carriers in ROM look-up-tables (LUTs). The high-speed input data stream is parallel processed using the corresponding LUTs, which reduces the main processing speed, allowing the use of low cost FPGAs.
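
    The look-up-table approach can be sketched in software. The example assumes 4-QAM symbols with I and Q levels of ±1, eight samples per symbol, and one carrier cycle per symbol period; these parameters and the names `build_lut` and `modulate` are illustrative assumptions, not details taken from the patent.

```python
import math

SAMPLES_PER_SYMBOL = 8   # assumption: one carrier cycle spans 8 samples
LEVELS = (-1, 1)         # assumption: 4-QAM, I and Q each take the value -1 or +1


def build_lut():
    """Precompute I*cos - Q*sin for every symbol, as a ROM would store it."""
    lut = {}
    for i in LEVELS:
        for q in LEVELS:
            lut[(i, q)] = [
                i * math.cos(2 * math.pi * n / SAMPLES_PER_SYMBOL)
                - q * math.sin(2 * math.pi * n / SAMPLES_PER_SYMBOL)
                for n in range(SAMPLES_PER_SYMBOL)
            ]
    return lut


def modulate(symbols, lut):
    """Modulation reduces to table look-ups; no multiplications at run time."""
    out = []
    for symbol in symbols:
        out.extend(lut[symbol])
    return out


lut = build_lut()
waveform = modulate([(1, 1), (-1, 1), (1, -1)], lut)
```

    Because each symbol's look-up is independent of the others, several symbols can be fetched from separate tables at once, which is what lets the parallel branches run below the input data rate.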

  10. Transactional memories: A new abstraction for parallel processing

    SciTech Connect

    Fasel, J.H.; Lubeck, O.M.; Agrawal, D.; Bruno, J.L.; El Abbadi, A.

    1997-12-01

    This is the final report of a three-year, Laboratory Directed Research and Development (LDRD) project at Los Alamos National Laboratory (LANL). Current distributed memory multiprocessor computer systems make the development of parallel programs difficult. From a programmer's perspective, it would be most desirable if the underlying hardware and software could provide the programming abstraction commonly referred to as sequential consistency--a single address space and multiple threads; but enforcement of sequential consistency limits opportunities for architectural and operating system performance optimizations, leading to poor performance. Recently, Herlihy and Moss have introduced a new abstraction called transactional memories for parallel programming. The programming model is shared memory with multiple threads. However, data consistency is obtained through the use of transactions rather than mutual exclusion based on locking. The transaction approach permits the underlying system to exploit the potential parallelism in transaction processing. The authors explore the feasibility of designing parallel programs using the transaction paradigm for data consistency and a barrier type of thread synchronization.

  11. Parallel Visualization Co-Processing of Overnight CFD Propulsion Applications

    NASA Technical Reports Server (NTRS)

    Edwards, David E.; Haimes, Robert

    1999-01-01

    An interactive visualization system pV3 is being developed for the investigation of advanced computational methodologies employing visualization and parallel processing for the extraction of information contained in large-scale transient engineering simulations. Visual techniques for extracting information from the data in terms of cutting planes, iso-surfaces, particle tracing and vector fields are included in this system. This paper discusses improvements to the pV3 system developed under NASA's Affordable High Performance Computing project.

  12. A Multi-Core Parallelization Strategy for Statistical Significance Testing in Learning Classifier Systems.

    PubMed

    Rudd, James; Moore, Jason H; Urbanowicz, Ryan J

    2013-11-01

    Permutation-based statistics for evaluating the significance of class prediction, predictive attributes, and patterns of association have only appeared within the learning classifier system (LCS) literature since 2012. While still not widely utilized by the LCS research community, formal evaluations of test statistic confidence are imperative to large and complex real world applications such as genetic epidemiology where it is standard practice to quantify the likelihood that a seemingly meaningful statistic could have been obtained purely by chance. LCS algorithms are relatively computationally expensive on their own. The compounding requirements for generating permutation-based statistics may be a limiting factor for some researchers interested in applying LCS algorithms to real world problems. Technology has made LCS parallelization strategies more accessible and thus more popular in recent years. In the present study we examine the benefits of externally parallelizing a series of independent LCS runs such that permutation testing with cross validation becomes more feasible to complete on a single multi-core workstation. We test our python implementation of this strategy in the context of a simulated complex genetic epidemiological data mining problem. Our evaluations indicate that as long as the number of concurrent processes does not exceed the number of CPU cores, the speedup achieved is approximately linear. PMID:24358057
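
    External parallelization of independent runs can be sketched with the Python standard library. The toy accuracy statistic below stands in for a full LCS run, and the data, function names, and use of `ProcessPoolExecutor` are assumptions for illustration rather than the authors' implementation.

```python
import random
from concurrent.futures import ProcessPoolExecutor

# Toy data standing in for a classifier's output: 32 of 40 predictions are correct.
LABELS = [0, 1] * 20
PREDICTIONS = LABELS[:32] + [1 - v for v in LABELS[32:]]


def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    return sum(a == b for a, b in zip(labels, predictions)) / len(labels)


def permuted_accuracy(seed):
    """One independent permutation run: shuffle the labels, then re-score."""
    rng = random.Random(seed)
    shuffled = LABELS[:]
    rng.shuffle(shuffled)
    return accuracy(shuffled, PREDICTIONS)


def p_value(observed, null_scores):
    """Share of permuted scores at least as large as the observed score."""
    hits = sum(score >= observed for score in null_scores)
    return (hits + 1) / (len(null_scores) + 1)


def run_permutations(n_permutations, workers=None):
    """Farm the independent runs out to worker processes (default: one per core)."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(permuted_accuracy, range(n_permutations)))
```

    To use it, call `run_permutations(200)` from under an `if __name__ == "__main__":` guard and pass the result to `p_value(accuracy(LABELS, PREDICTIONS), ...)`. Keeping `workers` at or below the number of CPU cores matches the condition under which the abstract reports approximately linear speedup.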

  13. External sorting: I/O analysis and parallel processing techniques

    SciTech Connect

    Kwan, S.C.

    1986-01-01

    This thesis deals with the sorting of data that are much too large to fit in main memory, known as external sorting. The author focuses on two aspects of external sorting: I/O analysis and parallel processing techniques. Storage device models are defined and applied to analyze the I/O complexities of multi-way merge sort and tag sort (or key sort). It is shown that using a higher merge order, though it reduces the number of merge passes, causes excessive random I/O accesses and degrades the overall I/O performance of multi-way merge sort. Techniques are developed for producing long runs in merge sort and for rearranging the records in tag sort after their ranks are determined. A lower bound for the I/O access time of rearranging the records in tag sort is derived. Two methods are explored for implementing distribution sort on parallel computers. The first method, multi-pass distribution sort, determines the bucket ranges with one read pass over the input file, and uses subsequent passes to distribute the data into buckets and sort them. The distribution and sorting of the buckets are processed in parallel using a two-stage pipeline. The second method, one-pass distribution sort, coalesces the bucket partition, bucket distribution, and sort-bucket phases all together so that the input file needs to be processed only once.
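
    The multi-pass distribution sort can be outlined with an in-memory sketch. It assumes equal-width bucket ranges derived from the minimum and maximum key, which is one simple choice and not necessarily the thesis's method; the data also sit in a Python list rather than an external file, and the function names are hypothetical.

```python
def bucket_boundaries(data, n_buckets):
    """Pass 1: scan the input once to choose equal-width bucket ranges."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_buckets or 1   # fall back to 1 when all keys are equal
    return lo, width


def distribute(data, lo, width, n_buckets):
    """Pass 2: route every record to the bucket covering its key range."""
    buckets = [[] for _ in range(n_buckets)]
    for x in data:
        index = min(int((x - lo) / width), n_buckets - 1)
        buckets[index].append(x)
    return buckets


def distribution_sort(data, n_buckets=4):
    """Sort by partitioning into ordered buckets, then sorting each bucket."""
    if not data:
        return []
    lo, width = bucket_boundaries(data, n_buckets)
    buckets = distribute(data, lo, width, n_buckets)
    result = []
    for bucket in buckets:
        # Buckets cover disjoint, increasing key ranges, so each one could be
        # sorted by a separate worker and the results simply concatenated.
        result.extend(sorted(bucket))
    return result


print(distribution_sort([5, 3, 9, 1, 7, 3, 8]))
```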

  14. Development of a parallelization strategy for the VARIANT code

    SciTech Connect

    Hanebutte, U.R.; Khalil, H.S.; Palmiotti, G.; Tatsumi, M.

    1996-12-31

    The VARIANT code solves the multigroup steady-state neutron diffusion and transport equation in three-dimensional Cartesian and hexagonal geometries using the variational nodal method. VARIANT consists of four major parts that must be executed sequentially: input handling, calculation of response matrices, solution algorithm (i.e. inner-outer iteration), and output of results. The objective of the parallelization effort was to reduce the overall computing time by distributing the work of the two computationally intensive (sequential) tasks, the coupling coefficient calculation and the iterative solver, equally among a group of processors. This report describes the code's calculations and gives performance results on one of the benchmark problems used to test the code. The performance analysis in the IBM SPx system shows good efficiency for well-load-balanced programs. Even for relatively small problem sizes, respectable efficiencies are seen for the SPx. An extension to achieve a higher degree of parallelism will be addressed in future work. 7 refs., 1 tab.

  15. Parallel-Processing Equalizers for Multi-Gbps Communications

    NASA Technical Reports Server (NTRS)

    Gray, Andrew; Ghuman, Parminder; Hoy, Scott; Satorius, Edgar H.

    2004-01-01

    Architectures have been proposed for the design of frequency-domain least-mean-square complex equalizers that would be integral parts of parallel-processing digital receivers of multi-gigahertz radio signals and other quadrature-phase-shift-keying (QPSK) or 16-quadrature-amplitude-modulation (16-QAM) of data signals at rates of multiple gigabits per second. Equalizers as used here denotes receiver subsystems that compensate for distortions in the phase and frequency responses of the broad-band radio-frequency channels typically used to convey such signals. The proposed architectures are suitable for realization in very-large-scale integrated (VLSI) circuitry and, in particular, complementary metal oxide semiconductor (CMOS) application-specific integrated circuits (ASICs) operating at frequencies lower than modulation symbol rates. A digital receiver of the type to which the proposed architecture applies (see Figure 1) would include an analog-to-digital converter (A/D) operating at a rate, fs, of 4 samples per symbol period. To obtain the high speed necessary for sampling, the A/D and a 1:16 demultiplexer immediately following it would be constructed as GaAs integrated circuits. The parallel-processing circuitry downstream of the demultiplexer, including a demodulator followed by an equalizer, would operate at a rate of only fs/16 (in other words, at 1/4 of the symbol rate). The output from the equalizer would be four parallel streams of in-phase (I) and quadrature (Q) samples.
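
    The rate arithmetic in this record can be verified with a short sketch. The sampling factor of 4 and the 1:16 demultiplexing ratio come from the abstract; the round-robin ordering of samples across streams and the function names are assumptions.

```python
from fractions import Fraction

SAMPLES_PER_SYMBOL = 4   # from the abstract: fs is 4 samples per symbol period
DEMUX_WAYS = 16          # from the abstract: 1:16 demultiplexer


def demultiplex(samples, ways):
    """Round-robin demultiplexer: stream k receives samples k, k + ways, ..."""
    return [samples[k::ways] for k in range(ways)]


def downstream_rate(symbol_rate):
    """Clock rate of each parallel branch, in the same units as symbol_rate."""
    sample_rate = SAMPLES_PER_SYMBOL * symbol_rate   # fs
    return Fraction(sample_rate, DEMUX_WAYS)         # fs / 16


streams = demultiplex(list(range(32)), DEMUX_WAYS)
rate = downstream_rate(1)   # with the symbol rate normalized to 1, this is 1/4
```

    Each of the 16 streams receives one sample in every 16, so its rate is 4/16 = 1/4 of the symbol rate, as the abstract states.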

  16. Fiona: a parallel and automatic strategy for read error correction

    PubMed Central

    Schulz, Marcel H.; Weese, David; Holtgrewe, Manuel; Dimitrova, Viktoria; Niu, Sijia; Reinert, Knut; Richard, Hugues

    2014-01-01

    Motivation: Automatic error correction of high-throughput sequencing data can have a dramatic impact on the amount of usable base pairs and their quality. It has been shown that the performance of tasks such as de novo genome assembly and SNP calling can be dramatically improved after read error correction. While a large number of methods specialized for correcting substitution errors as found in Illumina data exist, few methods for the correction of indel errors, common to technologies like 454 or Ion Torrent, have been proposed. Results: We present Fiona, a new stand-alone read error–correction method. Fiona provides a new statistical approach for sequencing error detection and optimal error correction and estimates its parameters automatically. Fiona is able to correct substitution, insertion and deletion errors and can be applied to any sequencing technology. It uses an efficient implementation of the partial suffix array to detect read overlaps with different seed lengths in parallel. We tested Fiona on several real datasets from a variety of organisms with different read lengths and compared its performance with state-of-the-art methods. Fiona shows a constantly higher correction accuracy over a broad range of datasets from 454 and Ion Torrent sequencers, without compromise in speed. Conclusion: Fiona is an accurate parameter-free read error–correction method that can be run on inexpensive hardware and can make use of multicore parallelization whenever available. Fiona was implemented using the SeqAn library for sequence analysis and is publicly available for download at http://www.seqan.de/projects/fiona. Contact: mschulz@mmci.uni-saarland.de or hugues.richard@upmc.fr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25161220

  17. Probabilistic structural mechanics research for parallel processing computers

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Martin, William R.

    1991-01-01

    Aerospace structures and spacecraft are a complex assemblage of structural components that are subjected to a variety of complex, cyclic, and transient loading conditions. Significant modeling uncertainties are present in these structures, in addition to the inherent randomness of material properties and loads. To properly account for these uncertainties in evaluating and assessing the reliability of these components and structures, probabilistic structural mechanics (PSM) procedures must be used. Much research has focused on basic theory development and the development of approximate analytic solution methods in random vibrations and structural reliability. Practical application of PSM methods was hampered by their computationally intense nature. Solution of PSM problems requires repeated analyses of structures that are often large, and exhibit nonlinear and/or dynamic response behavior. These methods are all inherently parallel and ideally suited to implementation on parallel processing computers. New hardware architectures and innovative control software and solution methodologies are needed to make solution of large scale PSM problems practical.

  18. Reducing neural network training time with parallel processing

    NASA Technical Reports Server (NTRS)

    Rogers, James L., Jr.; Lamarsh, William J., II

    1995-01-01

    Obtaining optimal solutions for engineering design problems is often expensive because the process typically requires numerous iterations involving analysis and optimization programs. Previous research has shown that a near optimum solution can be obtained in less time by simulating a slow, expensive analysis with a fast, inexpensive neural network. A new approach has been developed to further reduce this time. This approach decomposes a large neural network into many smaller neural networks that can be trained in parallel. Guidelines are developed to avoid some of the pitfalls when training smaller neural networks in parallel. These guidelines allow the engineer: to determine the number of nodes on the hidden layer of the smaller neural networks; to choose the initial training weights; and to select a network configuration that will capture the interactions among the smaller neural networks. This paper presents results describing how these guidelines are developed.

  19. Regional-scale calculation of the LS factor using parallel processing

    NASA Astrophysics Data System (ADS)

    Liu, Kai; Tang, Guoan; Jiang, Ling; Zhu, A.-Xing; Yang, Jianyi; Song, Xiaodong

    2015-05-01

    With the increase of data resolution and the increasing application of USLE over large areas, the existing serial implementation of algorithms for computing the LS factor is becoming a bottleneck. In this paper, a parallel processing model based on message passing interface (MPI) is presented for the calculation of the LS factor, so that massive datasets at a regional scale can be processed efficiently. The parallel model contains algorithms for calculating flow direction, flow accumulation, drainage network, slope, slope length and the LS factor. According to the existence of data dependence, the algorithms are divided into local algorithms and global algorithms. Parallel strategies are designed according to the characteristics of each algorithm, including a decomposition method for maintaining the integrity of the results, an optimized workflow for reducing the time taken to export unnecessary intermediate data, and a buffer-communication-computation strategy for improving the communication efficiency. Experiments on a multi-node system show that the proposed parallel model allows efficient calculation of the LS factor at a regional scale with a massive dataset.
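
    The abstract does not spell out its decomposition method, so the following is only a minimal sketch under stated assumptions: the raster is split into contiguous row strips, and each process also reads one buffer row on either side so that a 3 x 3 neighbourhood operation (as used for slope) is complete at strip edges. The function name `strip_bounds` and the `halo` parameter are hypothetical.

```python
def strip_bounds(n_rows, n_procs, halo=1):
    """Return (own_start, own_stop, read_start, read_stop) per process.

    Rows [own_start, own_stop) are computed by the process. Rows
    [read_start, read_stop) also include up to `halo` buffer rows on each
    side, clipped at the raster boundary (assumed scheme, not the paper's).
    """
    base, extra = divmod(n_rows, n_procs)
    bounds, start = [], 0
    for rank in range(n_procs):
        # The first `extra` ranks take one additional row to balance the load.
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop, max(0, start - halo), min(n_rows, stop + halo)))
        start = stop
    return bounds


layout = strip_bounds(n_rows=10, n_procs=3)
```

    A scheme like this suits local algorithms only. Global algorithms such as flow accumulation depend on cells far outside the strip and would need further communication between processes.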

  20. Extraction of Hydrological Proximity Measures from DEMs using Parallel Processing

    SciTech Connect

    Tesfa, Teklu K.; Tarboton, David G.; Watson, Daniel W.; Schreuders, Kimberly A.; Baker, Matthew M.; Wallace, Robert M.

    2011-12-01

    Land surface topography is one of the most important terrain properties which impact hydrological, geomorphological, and ecological processes active on a landscape. In our previous efforts to develop a soil depth model based upon topographic and land cover variables, we extracted a set of hydrological proximity measures (HPMs) from a Digital Elevation Model (DEM) as potential explanatory variables for soil depth. These HPMs may also have other, more general modeling applicability in hydrology, geomorphology and ecology, and so are described here from a general perspective. The HPMs we derived are variations of the distance up to ridge points (cells with no incoming flow) and variations of the distance down to stream points (cells with a contributing area greater than a threshold), following the flow path. These HPMs were computed using the D-infinity flow model that apportions flow between adjacent neighbors based on the direction of steepest downward slope on the eight triangular facets constructed in a 3 x 3 grid cell window using the center cell and each pair of adjacent neighboring grid cells in turn. The D-infinity model typically results in multiple flow paths between 2 points on the topography, with the result that distances may be computed as the minimum, maximum or average of the individual flow paths. In addition, each of the HPMs is calculated vertically, horizontally, and along the land surface. Previously, these HPMs were calculated using recursive serial algorithms which suffered from stack overflow problems when used to process large datasets, limiting the size of DEMs that could be analyzed using that method to approximately 7000 x 7000 cells. To overcome this limitation, we developed a message passing interface (MPI) parallel approach for calculating these HPMs. The parallel algorithms of the HPMs spatially partition the input grid into stripes which are each assigned to separate processes for computation. Each of those processes then uses a

  1. A simple hyperbolic model for communication in parallel processing environments

    NASA Technical Reports Server (NTRS)

    Stoica, Ion; Sultan, Florin; Keyes, David

    1994-01-01

    We introduce a model for communication costs in parallel processing environments called the 'hyperbolic model,' which generalizes two-parameter dedicated-link models in an analytically simple way. Dedicated interprocessor links parameterized by a latency and a transfer rate that are independent of load are assumed by many existing communication models; such models are unrealistic for workstation networks. The communication system is modeled as a directed communication graph in which terminal nodes represent the application processes that initiate the sending and receiving of the information and in which internal nodes, called communication blocks (CBs), reflect the layered structure of the underlying communication architecture. The direction of graph edges specifies the flow of the information carried through messages. Each CB is characterized by a two-parameter hyperbolic function of the message size that represents the service time needed for processing the message. The parameters are evaluated in the limits of very large and very small messages. Rules are given for reducing a communication graph consisting of many CBs to an equivalent two-parameter form, while maintaining an approximation for the service time that is exact in both large and small limits. The model is validated on a dedicated Ethernet network of workstations by experiments with communication subprograms arising in scientific applications, for which a tight fit of the model predictions with actual measurements of the communication and synchronization time between end processes is demonstrated. The model is then used to evaluate the performance of two simple parallel scientific applications from partial differential equations: domain decomposition and time-parallel multigrid. In an appropriate limit, we also show the compatibility of the hyperbolic model with the recently proposed LogP model.
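
    For orientation, the two-parameter dedicated-link model that the hyperbolic model is said to generalize can be sketched as a fixed latency plus a size-proportional transfer time. This is not the hyperbolic function itself, whose form the abstract does not give. The function name and the numeric values below are illustrative assumptions.

```python
def dedicated_link_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Load-independent cost of one message: latency plus size / transfer rate."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


LATENCY = 1e-3      # seconds; illustrative value, not from the paper
BANDWIDTH = 1.25e6  # bytes per second (about 10 Mbit/s); illustrative value

# Very small messages are dominated by latency, very large ones by bandwidth.
small = dedicated_link_time(1, LATENCY, BANDWIDTH)
large = dedicated_link_time(10_000_000, LATENCY, BANDWIDTH)
```

    The two limits shown here (very small and very large messages) are the same limits in which the abstract says the hyperbolic parameters are evaluated.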

  2. Parallel processing of objects in a naming task.

    PubMed

    Meyer, Antje S; Ouellet, Marc; Häcker, Christine

    2008-07-01

    The authors investigated whether speakers who named several objects processed them sequentially or in parallel. Speakers named object triplets, arranged in a triangle, in the order left, right, and bottom object. The left object was easy or difficult to identify and name. During the saccade from the left to the right object, the right object shown at trial onset (the interloper) was replaced by a new object (the target), which the speakers named. Interloper and target were identical or unrelated objects, or they were conceptually unrelated objects with the same name (e.g., bat [animal] and [baseball] bat). The mean duration of the gazes to the target was shorter when interloper and target were identical or had the same name than when they were unrelated. The facilitatory effects of identical and homophonous interlopers were significantly larger when the left object was easy to process than when it was difficult to process. This interaction demonstrates that the speakers processed the left and right objects in parallel.

  3. Smoldyn on graphics processing units: massively parallel Brownian dynamics simulations.

    PubMed

    Dematté, Lorenzo

    2012-01-01

    Space is a very important aspect in the simulation of biochemical systems; recently, the need for simulation algorithms able to cope with space is becoming more and more compelling. Complex and detailed models of biochemical systems need to deal with the movement of single molecules and particles, taking into consideration localized fluctuations, transportation phenomena, and diffusion. A common drawback of spatial models lies in their complexity: models can become very large, and their simulation could be time consuming, especially if we want to capture the system's behavior in a reliable way using stochastic methods in conjunction with a high spatial resolution. In order to deliver on the promise made by systems biology to be able to understand a system as a whole, we need to scale up the size of models we are able to simulate, moving from sequential to parallel simulation algorithms. In this paper, we analyze Smoldyn, a widely used algorithm for stochastic simulation of chemical reactions with spatial resolution and single molecule detail, and we propose an alternative, innovative implementation that exploits the parallelism of Graphics Processing Units (GPUs). The implementation executes the most computationally demanding steps (computation of diffusion, unimolecular, and bimolecular reaction, as well as the most common cases of molecule-surface interaction) on the GPU, computing them in parallel on each molecule of the system. The implementation offers good speed-ups and real time, high quality graphics output

  4. Parallel processing at the SSC: The fact and the fiction

    SciTech Connect

    Bourianoff, G.; Cole, B.

    1991-10-01

    Accurately modelling the behavior of particles circulating in accelerators is a computationally demanding task. The particle tracking code currently in use at the SSC is based upon a "thin element" analysis (TEAPOT). In this model each magnet in the lattice is described by a thin element at which the particle experiences an impulsive kick. Each kick requires approximately 200 floating point operations ("FLOP"). For the SSC collider lattice consisting of 10^4 elements, performing a tracking study for a set of 100 particles for 10^7 turns would require 2 x 10^15 FLOP. Even on a machine capable of 100 MFLOP/sec (MFLOPS), this would require 2 x 10^7 seconds, and many such runs are necessary. It should be noted that the accuracy with which the kicks are to be calculated is important: the large number of iterations involved will magnify the effects of small errors. The inability of current computational resources to effectively perform the full calculation motivates the migration of this calculation to the most powerful computers available. A survey of the current research into new technologies for supercomputing reveals that the supercomputers of the future will be parallel in nature. Further, numerous such machines exist today, and are being used to solve other difficult problems. Thus it seems clear that it is not too early to begin developing tracking codes for parallel architectures. This report discusses implementing parallel processing on the SSC.
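
    The cost estimate quoted in this abstract can be checked with a few lines of arithmetic. All input figures come from the abstract; only the conversion to days is added here.

```python
FLOP_PER_KICK = 200                # floating point operations per kick
ELEMENTS = 10**4                   # thin elements in the collider lattice
PARTICLES = 100                    # particles tracked
TURNS = 10**7                      # turns around the ring
MACHINE_FLOP_PER_S = 100 * 10**6   # a 100 MFLOPS machine

# One kick per element, per particle, per turn.
total_flop = FLOP_PER_KICK * ELEMENTS * PARTICLES * TURNS
seconds = total_flop / MACHINE_FLOP_PER_S
days = seconds / 86_400            # 86,400 seconds per day
```

    The totals reproduce the abstract's figures of 2 x 10^15 operations and 2 x 10^7 seconds, which is a little over 231 days for a single run.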

  6. Application of parallel distributed processing to space based systems

    NASA Technical Reports Server (NTRS)

    Macdonald, J. R.; Heffelfinger, H. L.

    1987-01-01

    The concept of using Parallel Distributed Processing (PDP) to enhance automated experiment monitoring and control is explored. Recent very large scale integration (VLSI) advances have made such applications an achievable goal. The PDP machine has demonstrated the ability to automatically organize stored information, handle unfamiliar and contradictory input data and perform the actions necessary. The PDP machine has demonstrated that it can perform inference and knowledge operations with greater speed and flexibility and at lower cost than traditional architectures. In applications where the rule set governing an expert system's decisions is difficult to formulate, PDP can be used to extract rules by associating the information an expert receives with the actions taken.

  7. Parallel-Processing Software for Creating Mosaic Images

    NASA Technical Reports Server (NTRS)

    Klimeck, Gerhard; Deen, Robert; McCauley, Michael; DeJong, Eric

    2008-01-01

    A computer program implements parallel processing for nearly real-time creation of panoramic mosaics of images of terrain acquired by video cameras on an exploratory robotic vehicle (e.g., a Mars rover). Because the original images are typically acquired at various camera positions and orientations, it is necessary to warp the images into the reference frame of the mosaic before stitching them together to create the mosaic. [Also see "Parallel-Processing Software for Correlating Stereo Images," Software Supplement to NASA Tech Briefs, Vol. 31, No. 9 (September 2007) page 26.] The warping algorithm in this computer program reflects the considerations that (1) for every pixel in the desired final mosaic, a good corresponding point must be found in one or more of the original images and (2) for this purpose, one needs a good mathematical model of the cameras and a good correlation of individual pixels with respect to their positions in three dimensions. The desired mosaic is divided into slices, each of which is assigned to one of a number of central processing units (CPUs) operating simultaneously. The results from the CPUs are gathered and placed into the final mosaic. The time taken to create the mosaic depends upon the number of CPUs, the speed of each CPU, and whether a local or a remote data-staging mechanism is used.
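
    The slice-and-gather scheme described above can be sketched as follows. This is a minimal illustration, not the NASA program: `render_slice` is a hypothetical stand-in for the warping step, and a thread pool stands in for the separate CPUs (CPU-bound work in CPython would normally use separate processes instead).

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(height, n_workers):
    """Divide `height` mosaic rows into contiguous, nearly equal slices."""
    base, extra = divmod(height, n_workers)
    slices, start = [], 0
    for worker in range(n_workers):
        stop = start + base + (1 if worker < extra else 0)
        slices.append((start, stop))
        start = stop
    return slices


def render_slice(bounds, width):
    """Hypothetical stand-in for warping: derive each pixel from its coordinates."""
    start, stop = bounds
    return [[row * width + col for col in range(width)] for row in range(start, stop)]


def build_mosaic(height, width, n_workers):
    """Render every slice concurrently, then gather the slices in order."""
    slices = split_rows(height, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda bounds: render_slice(bounds, width), slices))
    return [row for part in parts for row in part]


mosaic = build_mosaic(height=7, width=4, n_workers=3)
```

    Because `pool.map` returns results in submission order, the gathered slices land in the right place in the final mosaic regardless of which worker finishes first.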

  8. Parallel deterioration to language processing in a bilingual speaker.

    PubMed

    Druks, Judit; Weekes, Brendan Stuart

    2013-01-01

    The convergence hypothesis [Green, D. W. (2003). The neural basis of the lexicon and the grammar in L2 acquisition: The convergence hypothesis. In R. van Hout, A. Hulk, F. Kuiken, & R. Towell (Eds.), The interface between syntax and the lexicon in second language acquisition (pp. 197-218). Amsterdam: John Benjamins] assumes that the neural substrates of language representations are shared between the languages of a bilingual speaker. One prediction of this hypothesis is that neurodegenerative disease should produce parallel deterioration to lexical and grammatical processing in bilingual aphasia. We tested this prediction with a late bilingual Hungarian (first language, L1)-English (second language, L2) speaker J.B. who had nonfluent progressive aphasia (NFPA). J.B. had acquired L2 in adolescence but was premorbidly proficient and used English as his dominant language throughout adult life. Our investigations showed comparable deterioration to lexical and grammatical knowledge in both languages during a one-year period. Parallel deterioration to language processing in a bilingual speaker with NFPA challenges the assumption that L1 and L2 rely on different brain mechanisms as assumed in some theories of bilingual language processing [Ullman, M. T. (2001). The neural basis of lexicon and grammar in first and second language: The declarative/procedural model. Bilingualism: Language and Cognition, 4(1), 105-122]. PMID:24527801

  10. Parallel approach to incorporating face image information into dialogue processing

    NASA Astrophysics Data System (ADS)

    Ren, Fuji

    2000-10-01

    There are many kinds of so-called irregular expressions in natural dialogues. Even if the content of a conversation is the same in words, different meanings can be interpreted by a person's feeling or face expression. To have a good understanding of dialogues, it is required in a flexible dialogue processing system to infer the speaker's view properly. However, it is difficult to obtain the meaning of the speaker's sentences in various scenes using traditional methods. In this paper, a new approach for dialogue processing that incorporates information from the speaker's face is presented. We first divide conversation statements into several simple tasks. Second, we process each simple task using an independent processor. Third, we employ some speaker's face information to estimate the view of the speakers to solve ambiguities in dialogues. The approach presented in this paper can work efficiently, because independent processors run in parallel, writing partial results to a shared memory, incorporating partial results at appropriate points, and complementing each other. A parallel algorithm and a method for employing the face information in a dialogue machine translation will be discussed, and some results will be included in this paper.

  11. An evaluation of parallelization strategies for low-frequency electromagnetic induction simulators using staggered grid discretizations

    NASA Astrophysics Data System (ADS)

    Weiss, C. J.; Schultz, A.

    2011-12-01

    The high computational cost of the forward solution for modeling low-frequency electromagnetic induction phenomena is one of the primary impediments against broad-scale adoption by the geoscience community of exploration techniques, such as magnetotellurics and geomagnetic depth sounding, that rely on fast and cheap forward solutions to make tractable the inverse problem. As geophysical observables, electromagnetic fields are direct indicators of Earth's electrical conductivity - a physical property independent of (but in some cases correlative with) seismic wavespeed. Electrical conductivity is known to be a function of Earth's physiochemical state and temperature, and to be especially sensitive to the presence of fluids, melts and volatiles. Hence, electromagnetic methods offer a critical and independent constraint on our understanding of Earth's interior processes. Existing methods for parallelization of time-harmonic electromagnetic simulators, as applied to geophysics, have relied heavily on a combination of strategies: coarse-grained decompositions of the model domain; and/or, a high-order functional decomposition across spectral components, which in turn can be domain-decomposed themselves. Hence, in terms of scaling, both approaches are ultimately limited by the growing communication cost as the granularity of the forward problem increases. In this presentation we examine alternate parallelization strategies based on OpenMP shared-memory parallelization and CUDA-based GPU parallelization. As a test case, we use two different numerical simulation packages, each based on a staggered Cartesian grid: FDM3D (Weiss, 2006) which solves the curl-curl equation directly in terms of the scattered electric field (available under the LGPL at www.openem.org); and APHID, the A-Phi Decomposition based on mixed vector and scalar potentials, in which the curl-curl operator is replaced operationally by the vector Laplacian. 
    We describe progress made in modifying the code to

  12. Hippocampal-prefrontal dynamics in spatial working memory: interactions and independent parallel processing.

    PubMed

    Churchwell, John C; Kesner, Raymond P

    2011-12-01

    Memory processes may be independent, compete, operate in parallel, or interact. In accordance with this view, behavioral studies suggest that the hippocampus (HPC) and prefrontal cortex (PFC) may act as an integrated circuit during performance of tasks that require working memory over longer delays, whereas during short delays the HPC and PFC may operate in parallel or have completely dissociable functions. In the present investigation we tested rats in a spatial delayed non-match to sample working memory task using short and long time delays to evaluate the hypothesis that intermediate CA1 region of the HPC (iCA1) and medial PFC (mPFC) interact and operate in parallel under different temporal working memory constraints. In order to assess the functional role of these structures, we used an inactivation strategy in which each subject received bilateral chronic cannula implantation of the iCA1 and mPFC, allowing us to perform bilateral, contralateral, ipsilateral, and combined bilateral inactivation of structures and structure pairs within each subject. This novel approach allowed us to test for circuit-level systems interactions, as well as independent parallel processing, while we simultaneously parametrically manipulated the temporal dimension of the task. The current results suggest that, at longer delays, iCA1 and mPFC interact to coordinate retrospective and prospective memory processes in anticipation of obtaining a remote goal, whereas at short delays either structure may independently represent spatial information sufficient to successfully complete the task. PMID:21839780

  13. Parallel Latent Semantic Analysis using a Graphics Processing Unit

    SciTech Connect

    Cui, Xiaohui; Potok, Thomas E; Cavanagh, Joseph M

    2009-01-01

    Latent Semantic Analysis (LSA) can be used to reduce the dimensions of large Term-Document datasets using Singular Value Decomposition. However, with the ever expanding size of data sets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. The Graphics Processing Unit (GPU) can solve some highly parallel problems much faster than the traditional sequential processor (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a computer cluster. In this paper, we presented a parallel LSA implementation on the GPU, using NVIDIA Compute Unified Device Architecture (CUDA) and Compute Unified Basic Linear Algebra Subprograms (CUBLAS). The performance of this implementation is compared to a traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms library. For large matrices that have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version.
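
    The dimension reduction mentioned here is usually written as a rank-k truncated Singular Value Decomposition of the m x n term-document matrix A. The notation below is the standard textbook form, given for orientation; the symbols m, n and k are assumptions, not taken from the abstract.

```latex
A \approx A_k = U_k \Sigma_k V_k^{\top}, \qquad
U_k \in \mathbb{R}^{m \times k}, \quad
\Sigma_k = \operatorname{diag}(\sigma_1, \dots, \sigma_k), \quad
V_k \in \mathbb{R}^{n \times k}, \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_k \ge 0 .
```

    Keeping only the k largest singular values maps each document to a k-dimensional vector, and the dense matrix products involved are the kind of work a BLAS library such as CUBLAS is designed to accelerate.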

  14. Time dependent processing in a parallel pipeline architecture.

    PubMed

    Biddiscombe, John; Geveci, Berk; Martin, Ken; Moreland, Kenneth; Thompson, David

    2007-01-01

    Pipeline architectures provide a versatile and efficient mechanism for constructing visualizations, and they have been implemented in numerous libraries and applications over the past two decades. In addition to allowing developers and users to freely combine algorithms, visualization pipelines have proven to work well when streaming data and scale well on parallel distributed-memory computers. However, current pipeline visualization frameworks have a critical flaw: they are unable to manage time varying data. As data flows through the pipeline, each algorithm has access to only a single snapshot in time of the data. This prevents the implementation of algorithms that do any temporal processing such as particle tracing; plotting over time; or interpolation, fitting, or smoothing of time series data. As data acquisition technology improves, as simulation time-integration techniques become more complex, and as simulations save less frequently and regularly, the ability to analyze the time-behavior of data becomes more important. This paper describes a modification to the traditional pipeline architecture that allows it to accommodate temporal algorithms. Furthermore, the architecture allows temporal algorithms to be used in conjunction with algorithms expecting a single time snapshot, thus simplifying software design and allowing adoption into existing pipeline frameworks. Our architecture also continues to work well in parallel distributed-memory environments. We demonstrate our architecture by modifying the popular VTK framework and exposing the functionality to the ParaView application. We use this framework to apply time-dependent algorithms on large data with a parallel cluster computer and thereby exercise a functionality that previously did not exist.
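
    One way to picture the idea is a pull-based pipeline in which a temporal filter asks its upstream stage for several time steps, while an ordinary filter still sees a single snapshot. The sketch below is an assumption-laden illustration: the class names and the `request` method are hypothetical and are not the VTK or ParaView API.

```python
class TimeSource:
    """Hypothetical source: returns the data snapshot for a requested time step."""

    def __init__(self, snapshots):
        self.snapshots = snapshots

    def n_steps(self):
        return len(self.snapshots)

    def request(self, step):
        return self.snapshots[step]


class Scale:
    """Snapshot filter: each request touches exactly one time step."""

    def __init__(self, upstream, factor):
        self.upstream, self.factor = upstream, factor

    def n_steps(self):
        return self.upstream.n_steps()

    def request(self, step):
        return [self.factor * value for value in self.upstream.request(step)]


class TemporalMean:
    """Temporal filter: pulls a window of time steps upstream and averages them."""

    def __init__(self, upstream, half_width=1):
        self.upstream, self.half_width = upstream, half_width

    def n_steps(self):
        return self.upstream.n_steps()

    def request(self, step):
        first = max(0, step - self.half_width)
        last = min(self.n_steps() - 1, step + self.half_width)
        frames = [self.upstream.request(s) for s in range(first, last + 1)]
        return [sum(values) / len(frames) for values in zip(*frames)]


source = TimeSource([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]])
pipeline = TemporalMean(Scale(source, 2.0), half_width=1)
smoothed = pipeline.request(1)
```

    The snapshot filter `Scale` needs no changes to sit upstream of the temporal filter, which mirrors the abstract's point that temporal and single-snapshot algorithms can be combined in one pipeline.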

  15. Parallel Processing of Large Scale Microphone Arrays for Sound Capture

    NASA Astrophysics Data System (ADS)

    Jan, Ea-Ee.

    1995-01-01

    Performance of microphone sound pick up is degraded by deleterious properties of the acoustic environment, such as multipath distortion (reverberation) and ambient noise. The degradation becomes more prominent in a teleconferencing environment in which the microphone is positioned far away from the speaker. Besides, the ideal teleconference should feel as easy and natural as face-to-face communication with another person. This suggests hands-free sound capture with no tether or encumbrance by hand-held or body-worn sound equipment. Microphone arrays for this application represent an appropriate approach. This research develops new microphone array and signal processing techniques for high quality hands-free sound capture in noisy, reverberant enclosures. The new techniques combine matched-filtering of individual sensors and parallel processing to provide acute spatial volume selectivity which is capable of mitigating the deleterious effects of noise interference and multipath distortion. The new method outperforms traditional delay-and-sum beamformers which provide only directional spatial selectivity. The research additionally explores truncated matched-filtering and random distribution of transducers to reduce complexity and improve sound capture quality. All designs are first established by computer simulation of array performance in reverberant enclosures. The simulation is achieved by a room model which can efficiently calculate the acoustic multipath in a rectangular enclosure up to a prescribed order of images. It also calculates the incident angle of the arriving signal. Experimental arrays were constructed and their performance was measured in real rooms. Real room data were collected in a hard-walled laboratory and a controllable variable acoustics enclosure of similar size, approximately 6 x 6 x 3 m. An extensive speech database was also collected in these two enclosures for future research on microphone arrays. The simulation results are shown to be

  16. MASSIVELY PARALLEL LATENT SEMANTIC ANALYSES USING A GRAPHICS PROCESSING UNIT

    SciTech Connect

    Cavanagh, J.; Cui, S.

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using Singular Value Decomposition. However, with the ever-expanding size of datasets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. A graphics processing unit (GPU) can solve some highly parallel problems much faster than a traditional sequential processor or central processing unit (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a PC cluster. Due to the GPU’s application-specific architecture, harnessing the GPU’s computational prowess for LSA is a great challenge. We presented a parallel LSA implementation on the GPU, using NVIDIA® Compute Unified Device Architecture and Compute Unified Basic Linear Algebra Subprograms software. The performance of this implementation is compared to traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1000x1000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits of the GPU for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from relative normal when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.

  17. Massively Parallel Processing for Fast and Accurate Stamping Simulations

    NASA Astrophysics Data System (ADS)

    Gress, Jeffrey J.; Xu, Siguang; Joshi, Ramesh; Wang, Chuan-tao; Paul, Sabu

    2005-08-01

    The competitive automotive market drives automotive manufacturers to speed up vehicle development cycles and reduce lead time. Fast tooling development is one of the key areas supporting fast, short vehicle development programs (VDP). In the past ten years, stamping simulation has become the most effective validation tool for predicting and resolving all potential formability and quality problems before the dies are physically made. Stamping simulation and formability analysis have become a critical business segment in the GM math-based die engineering process. As simulation becomes one of the major production tools in the engineering factory, simulation speed and accuracy are two of the most important measures for stamping simulation technology. The speed and time-in-system of forming analysis become even more critical in supporting the fast VDP and tooling readiness. Since 1997, General Motors Die Center has been working jointly with our software vendor to develop and implement a parallel version of simulation software for mass production analysis applications. By 2001, this technology had matured in the form of distributed memory processing (DMP) of draw die simulations in a networked distributed memory computing environment. In 2004, this technology was refined to massively parallel processing (MPP) and extended to line die forming analysis (draw, trim, flange, and associated spring-back) running on a dedicated computing environment. The evolution of this technology and the insight gained through the implementation of DMP/MPP technology, as well as performance benchmarks, are discussed in this publication.

  18. Parallel information processing channels created in the retina.

    PubMed

    Schiller, Peter H

    2010-10-01

    In the retina, several parallel channels originate that extract different attributes from the visual scene. This review describes how these channels arise and what their functions are. Following the introduction four sections deal with these channels. The first discusses the "ON" and "OFF" channels that have arisen for the purpose of rapidly processing images in the visual scene that become visible by virtue of either light increment or light decrement; the ON channel processes images that become visible by virtue of light increment and the OFF channel processes images that become visible by virtue of light decrement. The second section examines the midget and parasol channels. The midget channel processes fine detail, wavelength information, and stereoscopic depth cues; the parasol channel plays a central role in processing motion and flicker as well as motion parallax cues for depth perception. Both these channels have ON and OFF subdivisions. The third section describes the accessory optic system that receives input from the retinal ganglion cells of Dogiel; these cells play a central role, in concert with the vestibular system, in stabilizing images on the retina to prevent the blurring of images that would otherwise occur when an organism is in motion. The last section provides a brief overview of several additional channels that originate in the retina.

  19. Parallel information processing channels created in the retina

    PubMed Central

    Schiller, Peter H.

    2010-01-01

    In the retina, several parallel channels originate that extract different attributes from the visual scene. This review describes how these channels arise and what their functions are. Following the introduction four sections deal with these channels. The first discusses the “ON” and “OFF” channels that have arisen for the purpose of rapidly processing images in the visual scene that become visible by virtue of either light increment or light decrement; the ON channel processes images that become visible by virtue of light increment and the OFF channel processes images that become visible by virtue of light decrement. The second section examines the midget and parasol channels. The midget channel processes fine detail, wavelength information, and stereoscopic depth cues; the parasol channel plays a central role in processing motion and flicker as well as motion parallax cues for depth perception. Both these channels have ON and OFF subdivisions. The third section describes the accessory optic system that receives input from the retinal ganglion cells of Dogiel; these cells play a central role, in concert with the vestibular system, in stabilizing images on the retina to prevent the blurring of images that would otherwise occur when an organism is in motion. The last section provides a brief overview of several additional channels that originate in the retina. PMID:20876118

  20. A parallel strategy for implementing real-time expert systems using CLIPS

    NASA Technical Reports Server (NTRS)

    Ilyes, Laszlo A.; Villaseca, F. Eugenio; Delaat, John

    1994-01-01

    As evidenced by current literature, there appears to be a continued interest in the study of real-time expert systems. It is generally recognized that speed of execution is only one consideration when designing an effective real-time expert system. Some other features one must consider are the expert system's ability to perform temporal reasoning, handle interrupts, prioritize data, contend with data uncertainty, and perform context focusing as dictated by the incoming data to the expert system. This paper presents a strategy for implementing a real time expert system on the iPSC/860 hypercube parallel computer using CLIPS. The strategy takes into consideration not only the execution time of the software, but also those features which define a true real-time expert system. The methodology is then demonstrated using a practical implementation of an expert system which performs diagnostics on the Space Shuttle Main Engine (SSME). This particular implementation uses an eight node hypercube to process ten sensor measurements in order to simultaneously diagnose five different failure modes within the SSME. The main program is written in ANSI C and embeds CLIPS to better facilitate and debug the rule based expert system.

  1. Efficient biased random bit generation for parallel processing

    SciTech Connect

    Slone, D.M.

    1994-09-28

    A lattice gas automaton was implemented on a massively parallel machine (the BBN TC2000) and a vector supercomputer (the CRAY C90). The automaton models Burgers' equation $\rho_t + \rho\rho_x = \nu\rho_{xx}$ in 1 dimension. The lattice gas evolves by advecting and colliding pseudo-particles on a 1-dimensional, periodic grid. The specific rules for colliding particles are stochastic in nature and require the generation of many billions of random numbers to create the random bits necessary for the lattice gas. The goal of the thesis was to speed up the process of generating the random bits and thereby lessen the computational bottleneck of the automaton.
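
    To illustrate what generating biased random bits in bulk can look like, here is a minimal Python sketch of one well-known technique: combining unbiased random words with bitwise AND and OR so that every bit of the result is 1 with a chosen probability p. The abstract does not describe the method used in the thesis, so this is an assumed illustration only; the function name `biased_bits` and its parameters are hypothetical.

```python
import random


def biased_bits(p_bits, rng, width=64):
    """Return a `width`-bit integer whose bits are each 1 with probability p.

    p_bits lists the binary digits of p after the point, most significant
    first, e.g. [0, 1, 1] means p = 0.011 (binary) = 3/8.
    Illustrative helper, not the method of the cited thesis.
    """
    word = 0  # every bit starts as 1 with probability 0
    # Process digits from least to most significant. Mixing in one fresh
    # unbiased word maps a per-bit probability q to (digit + q) / 2:
    # OR gives (1 + q) / 2, AND gives q / 2.
    for digit in reversed(p_bits):
        fresh = rng.getrandbits(width)
        word = (word | fresh) if digit else (word & fresh)
    return word


rng = random.Random(12345)  # fixed seed so the run is repeatable
words = [biased_bits([0, 1, 1], rng) for _ in range(2000)]
ones = sum(bin(w).count("1") for w in words)
frequency = ones / (2000 * 64)
print(f"observed frequency of 1 bits: {frequency:.4f} (target 0.3750)")
```

    Each call consumes one unbiased word per binary digit of p and yields 64 biased bits at once, which is why word-level schemes of this kind tend to be attractive on vector and parallel hardware.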

  2. Development of an aerodynamics algorithm for parallel-processing supercomputers

    NASA Technical Reports Server (NTRS)

    Swisshelm, Julie M.; Johnson, Gary M.

    1988-01-01

    An explicit flow solver, applicable to the hierarchy of model equations ranging from Euler to full Navier-Stokes, is combined with several techniques designed to reduce computational expense. The computational domain consists of local grid refinements embedded in a global coarse mesh, where the locations of these refinements are defined by the physics of the flow. Flow characteristics are also used to determine which set of model equations is appropriate for solution in each region, thereby reducing not only the number of grid points at which the solution must be obtained, but also the computational effort required to get that solution. Acceleration to steady-state is achieved by applying multigrid on each of the subgrids, regardless of the particular model equations being solved. Since each of these components is explicit, advantage can readily be taken of the vector- and parallel-processing capabilities of machines such as the Cray X-MP and Cray-2.

  3. Safety-oriented global analysis and parallel processing

    SciTech Connect

    Dinca, L.G.; Aldemir, T.

    1994-12-31

    The objective of safety-oriented global analysis (SOGA) is to determine the conditions under which the evolution of a dynamic system in time remains within the imposed constraints in view of the uncertainties on the system parameters and/or the observed system state. Often the only generally applicable SOGA method for nonlinear systems is the direct integration of the governing equations, which can be computationally prohibitive. An alternative SOGA methodology has been under development at the Ohio State University (OSU), and its application to reactor dynamics has been illustrated in previous presentations. In spite of the computational advantage of the OSU methodology over direct integration, the computational time and storage requirements are still limiting factors in implementation. A procedure to reduce the storage requirements was presented earlier. This paper describes how the computational time can be reduced using parallel processing.

  4. Multiple-spot parallel processing for laser micronanofabrication

    NASA Astrophysics Data System (ADS)

    Kato, Jun-ichi; Takeyasu, Nobuyuki; Adachi, Yoshihiro; Sun, Hong-Bo; Kawata, Satoshi

    2005-01-01

    A tightly focused femtosecond laser has been established as a unique tool for micronanostructure fabrication due to its intrinsic three-dimensional processing. In this letter, we utilize a microlens array to produce multiple spots for parallel fabrication, giving rise to a revolutionary augmentation for our previously developed single-beam two-photon photopolymerization technology [S. Kawata, H.-B. Sun, T. Tanaka, and K. Takada, Nature (London) 412, 697 (2001)]. Two- and three-dimensional multiple structures, such as a microletter set and a self-standing microspring array, are demonstrated as examples of mass production. Simultaneous fabrication with more than 200 spots has been realized by optimizing the exposure condition for the photopolymerizable resin, i.e., an increase in yield efficiency of two orders of magnitude. Potential applications of this technique are discussed.

  5. Parallel Processing of Adaptive Meshes with Load Balancing

    NASA Technical Reports Server (NTRS)

    Das, Sajal K.; Harvey, Daniel J.; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2001-01-01

    Many scientific applications involve grids that lack a uniform underlying structure. These applications are often also dynamic in nature in that the grid structure significantly changes between successive phases of execution. In parallel computing environments, mesh adaptation of unstructured grids through selective refinement/coarsening has proven to be an effective approach. However, achieving load balance while minimizing interprocessor communication and redistribution costs is a difficult problem. Traditional dynamic load balancers are mostly inadequate because they lack a global view of system loads across processors. In this paper, we propose a novel and general-purpose load balancer that utilizes symmetric broadcast networks (SBN) as the underlying communication topology, and compare its performance with a successful global load balancing environment, called PLUM, specifically created to handle adaptive unstructured applications. Our experimental results on an IBM SP2 demonstrate that the SBN-based load balancer achieves lower redistribution costs than that under PLUM by overlapping processing and data migration.

  6. Applying the Extended Parallel Process Model to workplace safety messages.

    PubMed

    Basil, Michael; Basil, Debra; Deshpande, Sameer; Lavack, Anne M

    2013-01-01

    The extended parallel process model (EPPM) proposes that fear appeals are most effective when they combine threat and efficacy. Three studies conducted in the workplace safety context examine the use of various EPPM factors and their effects, especially multiplicative effects. Study 1 was a content analysis examining the use of EPPM factors in actual workplace safety messages. Study 2 experimentally tested these messages with 212 construction trainees. Study 3 replicated this experiment with 1,802 men across four English-speaking countries: Australia, Canada, the United Kingdom, and the United States. The results of these three studies (1) demonstrate the inconsistent use of EPPM components in real-world work safety communications, (2) support the necessity of self-efficacy for the effective use of threat, (3) show a multiplicative effect where communication effectiveness is maximized when all model components are present (severity, susceptibility, and efficacy), and (4) validate these findings with gory appeals across four English-speaking countries.

  7. Parallel processing of layout data with selective data distribution

    NASA Astrophysics Data System (ADS)

    Pereira, Mark; Bhat, Nitin; Srinivas, Preethi

    2006-10-01

    With the increase in layout data (GDSII) size due to finer geometries and resolution enhancement techniques such as Optical Proximity Correction (OPC) and Phase Shift Mask (PSM), layout data is proving to be too voluminous to be processed by single-CPU machines. Post-layout tools have now moved towards distributed computing techniques to process this data more efficiently in terms of speed. Typical distributed computing architectures involve distributing the layout data to various workstations, with each workstation then processing its part of the data in parallel. This approach works well provided the amount of data to be distributed is not too large. As the size of the layout data is increasing significantly, the time taken to transfer the layout data between the workstations is turning out to be a major bottleneck. This bottleneck is further highlighted because the time taken for the actual operations scales down almost linearly as more workstations are employed in the distributed computing environment, and because workstation clock speeds continue to improve. The focus of this paper is on a smart way of distributing the layout data so that the amount of redundant data transfer is significantly reduced. This is achieved by selective data distribution, wherein the layout data is fragmented and each workstation is provided with minimal and sufficient layout information for it to determine the actual fragments required for its processing.
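
    The idea of giving each workstation just enough information to decide which fragments it needs can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes the "minimal and sufficient layout information" is a small index of fragment bounding boxes, and that a workstation needs every fragment overlapping its assigned region plus a margin (`halo`). The names `Box`, `fragments_needed`, and `halo` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in layout coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float

    def grown(self, margin):
        """Return this box enlarged by `margin` on every side."""
        return Box(self.x0 - margin, self.y0 - margin,
                   self.x1 + margin, self.y1 + margin)

    def intersects(self, other):
        """True if the two boxes overlap or touch."""
        return (self.x0 <= other.x1 and other.x0 <= self.x1 and
                self.y0 <= other.y1 and other.y0 <= self.y1)


def fragments_needed(index, region, halo):
    """Pick the fragment ids a workstation should request.

    `index` maps fragment id -> bounding box (the lightweight metadata
    assumed to be sent to every workstation); only the fragments that
    overlap `region` grown by `halo` are fetched in full.
    """
    window = region.grown(halo)
    return sorted(fid for fid, box in index.items() if box.intersects(window))


# Four fragments laid out side by side along x (hypothetical data).
index = {
    "a": Box(0, 0, 10, 10),
    "b": Box(10, 0, 20, 10),
    "c": Box(20, 0, 30, 10),
    "d": Box(30, 0, 40, 10),
}
left = fragments_needed(index, Box(0, 0, 10, 10), halo=2)
middle = fragments_needed(index, Box(20, 0, 30, 10), halo=2)
print(left, middle)
```

    In this sketch only the bounding-box index is broadcast, so the heavy geometry for fragments a workstation never touches is not transferred to it.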

  8. Parallel asynchronous hardware implementation of image processing algorithms

    NASA Technical Reports Server (NTRS)

    Coon, Darryl D.; Perera, A. G. U.

    1990-01-01

    Research is being carried out on hardware for a new approach to focal plane processing. The hardware involves silicon injection mode devices. These devices provide a natural basis for parallel asynchronous focal plane image preprocessing. The simplicity and novel properties of the devices would permit an independent analog processing channel to be dedicated to every pixel. A laminar architecture built from arrays of the devices would form a two-dimensional (2-D) array processor with a 2-D array of inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in neuron-like asynchronous pulse-coded form through the laminar processor. No multiplexing, digitization, or serial processing would occur in the preprocessing state. High performance is expected, based on pulse coding of input currents down to one picoampere with noise referred to input of about 10 femtoamperes. Linear pulse coding has been observed for input currents ranging up to seven orders of magnitude. Low power requirements suggest utility in space and in conjunction with very large arrays. Very low dark current and multispectral capability are possible because of hardware compatibility with the cryogenic environment of high performance detector arrays. The aforementioned hardware development effort is aimed at systems which would integrate image acquisition and image processing.

  9. Massively Parallel Latent Semantic Analyzes using a Graphics Processing Unit

    SciTech Connect

    Cavanagh, Joseph M; Cui, Xiaohui

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large Term-Document datasets using Singular Value Decomposition. However, with the ever expanding size of data sets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. The Graphics Processing Unit (GPU) can solve some highly parallel problems much faster than the traditional sequential processor (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a computer cluster. Due to the GPU's application-specific architecture, harnessing the GPU's computational prowess for LSA is a great challenge. We present a parallel LSA implementation on the GPU, using NVIDIA Compute Unified Device Architecture and Compute Unified Basic Linear Algebra Subprograms. The performance of this implementation is compared to traditional LSA implementation on CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1000x1000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits the GPU has for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from relative normal when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.

  10. Configuration Management Process Assessment Strategy

    NASA Technical Reports Server (NTRS)

    Henry, Thad

    2014-01-01

    Purpose: To propose a strategy for assessing the development and effectiveness of configuration management systems within Programs, Projects, and Design Activities performed by technical organizations and their supporting development contractors. Scope: Various entities' CM systems will be assessed dependent on Project Scope (DDT&E), Support Services and Acquisition Agreements. Approach: Model based, structured against assessing organizations' CM requirements including best practices maturity criteria. The model is tailored to the entity being assessed dependent on their CM system. The assessment approach provides objective feedback to Engineering and Project Management of the observed CM system maturity state versus the ideal state of the configuration management processes and outcomes (system).
    • Identifies strengths and risks versus audit gotchas (findings/observations).
    • Used "recursively and iteratively" throughout program lifecycle at select points of need. (Typical assessment timing is Post PDR/Post CDR.)
    • Ideal state criteria and maturity targets are reviewed with the assessed entity prior to an assessment (Tailoring) and are dependent on the assessed phase of the CM system.
    • Supports exit success criteria for Preliminary and Critical Design Reviews.
    • Gives a comprehensive CM system assessment which ultimately supports configuration verification activities.

  11. Life Management Skills. Teacher's Guide [and Student Workbook]. Parallel Alternative Strategies for Students (PASS).

    ERIC Educational Resources Information Center

    Goldstein, Jeren; Walford, Sylvia

    This teacher's guide and student workbook are part of a series of supplementary curriculum packages presenting alternative methods and activities designed to meet the needs of Florida secondary students with mild disabilities or other special learning needs. The Life Management Skills PASS (Parallel Alternative Strategies for Students) teacher's…

  12. Introduction to Computers: Parallel Alternative Strategies for Students. Course No. 0200000.

    ERIC Educational Resources Information Center

    Chauvenne, Sherry; And Others

    Parallel Alternative Strategies for Students (PASS) is a content-centered package of alternative methods and materials designed to assist secondary teachers to meet the needs of mainstreamed learning-disabled and emotionally-handicapped students of various achievement levels in the basic education content courses. This supplementary text and…

  13. Mobile Devices and GPU Parallelism in Ionospheric Data Processing

    NASA Astrophysics Data System (ADS)

    Mascharka, D.; Pankratius, V.

    2015-12-01

    Scientific data acquisition in the field is often constrained by data transfer backchannels to analysis environments. Geoscientists are therefore facing practical bottlenecks with increasing sensor density and variety. Mobile devices, such as smartphones and tablets, offer promising solutions to key problems in scientific data acquisition, pre-processing, and validation by providing advanced capabilities in the field. This is due to affordable network connectivity options and the increasing mobile computational power. This contribution exemplifies a scenario faced by scientists in the field and presents the "Mahali TEC Processing App" developed in the context of the NSF-funded Mahali project. Aimed at atmospheric science and the study of ionospheric Total Electron Content (TEC), this app is able to gather data from various dual-frequency GPS receivers. It demonstrates parsing of full-day RINEX files on mobile devices and on-the-fly computation of vertical TEC values based on satellite ephemeris models that are obtained from NASA. Our experiments show how parallel computing on the mobile device GPU enables fast processing and visualization of up to 2 million datapoints in real-time using OpenGL. GPS receiver bias is estimated through minimum TEC approximations that can be interactively adjusted by scientists in the graphical user interface. Scientists can also perform approximate computations for "quickviews" to reduce CPU processing time and memory consumption. In the final stage of our mobile processing pipeline, scientists can upload data to the cloud for further processing. Acknowledgements: The Mahali project (http://mahali.mit.edu) is funded by the NSF INSPIRE grant no. AGS-1343967 (PI: V. Pankratius). We would like to acknowledge our collaborators at Boston College, Virginia Tech, Johns Hopkins University, Colorado State University, as well as the support of UNAVCO for loans of dual-frequency GPS receivers for use in this project, and Intel for loans of

  14. Radon-Based Image Processing In A Parallel Pipeline Architecture

    NASA Astrophysics Data System (ADS)

    Hinkle, Eric B.; Sanz, Jorge L. C.; Jain, Anil K.

    1986-04-01

    This paper deals with a novel architecture that makes real-time projection-based algorithms a reality. The design is founded on raster-mode processing, which is exploited in a powerful and flexible pipeline. This architecture, dubbed "P3E" (Parallel Pipeline Projection Engine), supports a large variety of image processing and image analysis applications. The image processing applications include: discrete approximations of the Radon and inverse Radon transform, among other projection operators; CT reconstructions; 2-D convolutions; rotations and translations; discrete Fourier transform computations in polar coordinates; autocorrelations; etc. There is also an extensive list of key image analysis algorithms that are supported by P3E, thus making it a profound and versatile tool for projection-based computer vision. These include: projections of gray-level images along linear patterns (the Radon transform) and other curved contours; generation of multi-color digital masks; convex hull approximations; Hough transform approximations for line and curve detection; diameter computations; calculations of moments and other principal components; etc. The effectiveness of our approach and the feasibility of the proposed architecture have been demonstrated by running some of these image analysis algorithms in conventional short pipelines, to solve some important automated inspection problems. In the present paper, we will concern ourselves with reconstructing images from their linear projections, and performing convolutions via the Radon transform.
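
    The property that makes convolution via the Radon transform possible is that a projection of a 2-D convolution equals the 1-D convolution of the corresponding projections. The short Python sketch below checks this for a single axis-aligned projection (row sums) on small integer arrays. It is an illustration of the general property only, not of the pipeline architecture described in the abstract; the helper names and the sample data are assumptions.

```python
def conv2d_full(f, g):
    """Full 2-D discrete convolution of two lists of lists."""
    rows = len(f) + len(g) - 1
    cols = len(f[0]) + len(g[0]) - 1
    out = [[0] * cols for _ in range(rows)]
    for i, f_row in enumerate(f):
        for j, f_val in enumerate(f_row):
            for k, g_row in enumerate(g):
                for m, g_val in enumerate(g_row):
                    out[i + k][j + m] += f_val * g_val
    return out


def conv1d_full(a, b):
    """Full 1-D discrete convolution of two lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, a_val in enumerate(a):
        for j, b_val in enumerate(b):
            out[i + j] += a_val * b_val
    return out


def project(image):
    """Sum each row: the discrete projection along the horizontal direction."""
    return [sum(row) for row in image]


image = [[1, 2, 0],
         [0, 3, 1],
         [4, 0, 2]]
kernel = [[1, 1],
          [0, 2]]

# Project the 2-D convolution, then compare with convolving the projections.
direct = project(conv2d_full(image, kernel))
via_projections = conv1d_full(project(image), project(kernel))
print(direct, via_projections)  # both lists are [6, 14, 20, 12]
```

    A full Radon-domain convolution would repeat the 1-D step for many projection angles and then invert the transform; only one angle is shown here to keep the check exact.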

  15. Evaluating In-Clique and Topological Parallelism Strategies for Junction Tree-Based Bayesian Inference Algorithm on the Cray XMT

    SciTech Connect

    Chin, George; Choudhury, Sutanay; Kangas, Lars J.; McFarlane, Sally A.; Marquez, Andres

    2011-09-01

    Long viewed as a strong statistical inference technique, Bayesian networks have emerged as an important class of applications for high-performance computing. We have applied an architecture-conscious approach to parallelizing the Lauritzen-Spiegelhalter Junction Tree algorithm for exact inferencing in Bayesian networks. In optimizing the Junction Tree algorithm, we have implemented both in-clique and topological parallelism strategies to best leverage the fine-grained synchronization and massive-scale multithreading of the Cray XMT architecture. Two topological techniques were developed to parallelize the evidence propagation process through the Bayesian network. One technique involves performing intelligent scheduling of junction tree nodes based on the tree's topology and the nodes' relative sizes. The second technique involves decomposing the junction tree into a much finer tree-like representation to offer many more opportunities for parallelism. We evaluate these optimizations on five different Bayesian networks and report our findings and observations. Another important contribution of this paper is to demonstrate the application of massive-scale multithreading for load balancing and the use of implicit parallelism-based compiler optimizations in designing scalable inferencing algorithms.

  16. SIMD massively parallel processing system for real-time image processing

    NASA Astrophysics Data System (ADS)

    Chen, Xiaochu; Zhang, Ming; Yao, Qingdong; Liu, Jilin; Ye, Hong; Wu, Song; Li, Dongxiao; Zhang, Yong; Ding, Lei; Yao, Zhongyang; Yang, Weijian; Pan, Qiaohai

    1998-09-01

    This paper describes the embedded SIMD massively parallel processor that we have developed for real-time image processing applications, such as real-time small target detection and tracking and video processing. The processor array is based on the SIMD chip BAP-128, of our own design, and uses the high-performance DSP TMS320C31, which can effectively perform serial and floating-point calculations, as the host of the SIMD processor array. As a result, the system is able to perform a variety of image processing tasks in real time. Furthermore, the processor will be connected with a MIMD parallel processor to construct a heterogeneous parallel processor for more complex real-time ATR (Automatic Target Recognition) and computer vision applications.

  17. Programming Probabilistic Structural Analysis for Parallel Processing Computer

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

    1991-01-01

    The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.

  18. Resolving Multiscale Processes in Tropical Cyclogenesis Using Parallel EEMD

    NASA Astrophysics Data System (ADS)

    Wu, Y.; Shen, B. W.; Cheung, S.; Li, J. L. F.; Liu, Z.

    2014-12-01

    The recent advance in high-resolution global models has suggested that improved multiscale simulations of tropical waves may help extend the lead time of tropical cyclone (TC) formation prediction (e.g., Shen et al., 2010ab, 2012, 2013a). In previous efforts in the multiscale analysis of tropical waves, the Ensemble Empirical Mode Decomposition (EEMD) has been successfully parallelized and used to detect atmospheric wave signals on different spatial scales (e.g., Shen et al., 2013b) that include Mixed Rossby Gravity (MRG) waves, Western Wind Belt (WWB), African Easterly Waves (AEWs), etc. We now extend the related studies to examine the evolution of the large scale waves and their association with the formation of tropical cyclones in the Atlantic for an extensive time period spanning multiple years. Our goal is to analyze the multiscale interaction in the initiation and early intensification stage of an AEW and its subsequent impact on TC genesis that involves mainly the large scale downscaling processes. Specific focus is on the impact of barotropic instability and critical level (CL, or steering level) that may appear in association with the AEW. The presence of the CL is believed to play an important role in providing a favorable environment in the early TC-genesis stage in the marsupial paradigm scenario. Preliminary analysis of the satellite data obtained from the newly launched Global Precipitation Measurement (GPM) mission linked to the TC genesis processes will be included.

  19. Parallel implementation of RX anomaly detection on multi-core processors: impact of data partitioning strategies

    NASA Astrophysics Data System (ADS)

    Molero, Jose M.; Garzón, Ester M.; García, Inmaculada; Plaza, Antonio

    2011-11-01

    Anomaly detection is an important task for remotely sensed hyperspectral data exploitation. One of the most widely used and successful algorithms for anomaly detection in hyperspectral images is the Reed-Xiaoli (RX) algorithm. Despite its wide acceptance and high computational complexity when applied to real hyperspectral scenes, few documented parallel implementations of this algorithm exist, in particular for multi-core processors. The advantage of multi-core platforms over other specialized parallel architectures is that they are a low-power, inexpensive, widely available and well-known technology. A critical issue in the parallel implementation of RX is the sample covariance matrix calculation, which can be approached in global or local fashion. This aspect is crucial for the RX implementation since the choice of a local or global strategy for computing the sample covariance matrix is expected to affect both the scalability of the parallel solution and the anomaly detection results. In this paper, we develop new parallel implementations of RX on multi-core processors and specifically investigate the impact of different data partitioning strategies when parallelizing its computations. For this purpose, we consider both global and local data partitioning strategies in the spatial domain of the scene, and further analyze their scalability on different multi-core platforms. The numerical effectiveness of the considered solutions is evaluated using receiver operating characteristic (ROC) curves, analyzing their capacity to detect thermal hot spots (anomalies) in hyperspectral data collected by NASA's Airborne Visible Infra-Red Imaging Spectrometer system over the World Trade Center in New York, five days after the terrorist attacks of September 11th, 2001.
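
    As an illustration of the global strategy described in this abstract, the following Python sketch (not the authors' implementation) shows how a global sample covariance can be assembled from per-partition partial sums, so that each spatial block could be handled by a separate core before a cheap combine step. The function names and the idea of passing blocks as lists of pixels are assumptions made for the example; the RX score is taken to be the squared Mahalanobis distance of a pixel from the background mean.

```python
from typing import List, Sequence, Tuple

Pixel = Sequence[float]


def partial_sums(block: Sequence[Pixel]) -> Tuple[int, List[float], List[List[float]]]:
    """Per-partition statistics: pixel count, sum of spectra, sum of outer products."""
    bands = len(block[0])
    s = [0.0] * bands
    ss = [[0.0] * bands for _ in range(bands)]
    for x in block:
        for i in range(bands):
            s[i] += x[i]
            for j in range(bands):
                ss[i][j] += x[i] * x[j]
    return len(block), s, ss


def global_stats(blocks: Sequence[Sequence[Pixel]]):
    """Combine the partial sums of all partitions into a global mean and covariance."""
    parts = [partial_sums(b) for b in blocks]  # each call could run on its own core
    n = sum(p[0] for p in parts)
    bands = len(parts[0][1])
    mean = [sum(p[1][i] for p in parts) / n for i in range(bands)]
    cov = [[(sum(p[2][i][j] for p in parts) - n * mean[i] * mean[j]) / (n - 1)
            for j in range(bands)] for i in range(bands)]
    return mean, cov


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rx_score(pixel: Pixel, mean: List[float], cov: List[List[float]]) -> float:
    """Squared Mahalanobis distance of one pixel from the background statistics."""
    d = [p - mu for p, mu in zip(pixel, mean)]
    return sum(di * yi for di, yi in zip(d, solve(cov, d)))
```

    A local strategy would instead call `global_stats` on a single block and score only the pixels of that block, which removes the combine step but changes the background statistics and therefore the detection results.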

  20. Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R; Ratterman, Joseph D; Smith, Brian E

    2014-11-18

    Methods, apparatuses, and computer program products for endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface ('PAMI') of a parallel computer are provided. Embodiments include establishing by a parallel application a data communications geometry, the geometry specifying a set of endpoints that are used in collective operations of the PAMI, including associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry. Embodiments also include registering in each endpoint in the geometry a dispatch callback function for a collective operation and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.

  1. Parallel processing using an optical delay-based reservoir computer

    NASA Astrophysics Data System (ADS)

    Van der Sande, Guy; Nguimdo, Romain Modeste; Verschaffelt, Guy

    2016-04-01

    Delay systems subject to delayed optical feedback have recently shown great potential in solving computationally hard tasks. By implementing a neuro-inspired computational scheme relying on the transient response to optical data injection, high processing speeds have been demonstrated. However, reservoir computing systems based on delay dynamics discussed in the literature are designed by coupling many different stand-alone components, which leads to bulky, non-monolithic systems that lack long-term stability. Here we numerically investigate the possibility of implementing reservoir computing schemes based on semiconductor ring lasers (SRLs). SRLs are semiconductor lasers in which the laser cavity consists of a ring-shaped waveguide. SRLs are highly integrable and scalable, making them ideal candidates for key components in photonic integrated circuits. SRLs can generate light in two counterpropagating directions between which bistability has been demonstrated. We demonstrate that two independent machine learning tasks, even with inputs of a different nature and different input data signals, can be computed simultaneously using a single photonic nonlinear node, relying on the parallelism offered by photonics. We illustrate the performance on simultaneous chaotic time series prediction and classification for nonlinear channel equalization. We take advantage of the different directional modes to process the individual tasks: each directional mode processes one task, to mitigate possible crosstalk between the tasks. Our results indicate that prediction/classification with errors comparable to state-of-the-art performance can be obtained even with noise, despite the two tasks being computed simultaneously. We also find that good performance is obtained for both tasks over a broad range of the parameters. The results are discussed in detail in [Nguimdo et al., IEEE Trans. Neural Netw. Learn. Syst. 26, pp. 3301-3307, 2015].

  2. Parallel processing methods for space based power systems

    NASA Technical Reports Server (NTRS)

    Berry, F. C.

    1993-01-01

    This report presents a method for doing load-flow analysis of a power system by using a decomposition approach. The power system for the Space Shuttle is used as a basis to build a model for the load-flow analysis. To test the decomposition method for doing load-flow analysis, simulations were performed on power systems of 16, 25, 34, 43, 52, 61, 70, and 79 nodes. Each of the power systems was divided into subsystems and simulated under steady-state conditions. The results from these tests have been found to be as accurate as tests performed using a standard serial simulator. The division of the power systems into different subsystems was done by assigning a processor to each area. There were 13 transputers available; therefore, up to 13 different subsystems could be simulated at the same time. This report has preliminary results for a load-flow analysis using a decomposition principle. The report shows that the decomposition algorithm for load-flow analysis is well suited for parallel processing and provides increases in the speed of execution.

  3. Rapid parallel semantic processing of numbers without awareness.

    PubMed

    Van Opstal, Filip; de Lange, Floris P; Dehaene, Stanislas

    2011-07-01

    In this study, we investigate whether multiple digits can be processed at a semantic level without awareness, either serially or in parallel. In two experiments, we presented participants with two successive sets of four simultaneous Arabic digits. The first set was masked and served as a subliminal prime for the second, visible target set. According to the instructions, participants had to extract from the target set either the mean or the sum of the digits, and to compare it with a reference value. Results showed that participants applied the requested instruction to the entire set of digits that was presented below the threshold of conscious perception, because their magnitudes jointly affected the participant's decision. Indeed, response decision could be accurately modeled as a sigmoid logistic function that pooled together the evidence provided by the four targets and, with lower weights, the four primes. In less than 800ms, participants successfully approximated the addition and mean tasks, although they tended to overweight the large numbers, particularly in the sum task. These findings extend previous observations on ensemble coding by showing that set statistics can be extracted from abstract symbolic stimuli rather than low-level perceptual stimuli, and that an ensemble code can be represented without awareness.

  4. An integrated approach to improving the parallel applications development process

    SciTech Connect

    Rasmussen, Craig E; Watson, Gregory R; Tibbitts, Beth R

    2009-01-01

    The development of parallel applications is becoming increasingly important to a broad range of industries. Traditionally, parallel programming was a niche area that was primarily exploited by scientists trying to model extremely complicated physical phenomena. It is becoming increasingly clear, however, that continued hardware performance improvements through clock scaling and feature-size reduction are simply not going to be achievable for much longer. The hardware vendors' approach to addressing this issue is to employ parallelism through multi-processor and multi-core technologies. While there is little doubt that this approach produces scaling improvements, there are still many significant hurdles to be overcome before parallelism can be employed as a general replacement for more traditional programming techniques. The Parallel Tools Platform (PTP) Project was created in 2005 in an attempt to provide developers with new tools aimed at addressing some of the parallel development issues. Since then, the introduction of a new generation of peta-scale and multi-core systems has highlighted the need for such a platform. In this paper, we describe some of the challenges facing parallel application developers, present the current state of PTP, and provide a simple case study that demonstrates how PTP can be used to locate a potential deadlock situation in an MPI code.

  5. Introducing data parallelism into climate model post-processing through a parallel version of the NCAR Command Language (NCL)

    NASA Astrophysics Data System (ADS)

    Jacob, R. L.; Xu, X.; Krishna, J.; Tautges, T.

    2011-12-01

    The relationship between the needs of post-processing climate model output and the capability of the available tools has reached a crisis point. The large volume of data currently produced by climate models is overwhelming the current, decades-old analysis workflow. The tools used to implement that workflow are now a bottleneck in the climate science discovery processes. This crisis will only worsen as ultra-high resolution global climate models with horizontal scales of 4 km or smaller, running on leadership computing facilities, begin to produce tens to hundreds of terabytes for a single, hundred-year climate simulation. While climate models have used parallelism for several years, the post-processing tools are still mostly single-threaded applications. We have created a Parallel Climate Analysis Library (ParCAL) which implements many common climate analysis operations in a data-parallel fashion using the Message Passing Interface. ParCAL has in turn been built on sophisticated packages for describing grids in parallel (the Mesh Oriented database (MOAB)) and for performing vector operations on arbitrary grids (Intrepid). ParCAL also uses parallel I/O through the PnetCDF library. ParCAL has been used to implement a parallel version of the NCAR Command Language (NCL). ParNCL/ParCAL not only speeds up analysis of large datasets but also allows operations to be performed on native grids, eliminating the need to transform everything to latitude-longitude grids. In most cases, users' NCL scripts can run unaltered in parallel using ParNCL.

  6. Parallel ALLSPD-3D: Speeding Up Combustor Analysis Via Parallel Processing

    NASA Technical Reports Server (NTRS)

    Fricker, David M.

    1997-01-01

    The ALLSPD-3D Computational Fluid Dynamics code for reacting flow simulation was run on a set of benchmark test cases to determine its parallel efficiency. These test cases included non-reacting and reacting flow simulations with varying numbers of processors. Also, the tests explored the effects of scaling the simulation with the number of processors in addition to distributing a constant size problem over an increasing number of processors. The test cases were run on a cluster of IBM RS/6000 Model 590 workstations with ethernet and ATM networking plus a shared memory SGI Power Challenge L workstation. The results indicate that the network capabilities significantly influence the parallel efficiency, i.e., a shared memory machine is fastest and ATM networking provides acceptable performance. The limitations of ethernet greatly hamper the rapid calculation of flows using ALLSPD-3D.
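
    The two kinds of test mentioned in this abstract (a constant-size problem spread over more processors, and a problem scaled with the processor count) correspond to the usual strong-scaling and weak-scaling definitions of parallel efficiency. The short Python sketch below states those standard definitions; the function names and the timings in the usage lines are hypothetical and are not taken from the ALLSPD-3D study.

```python
def strong_scaling_efficiency(t_serial: float, t_parallel: float, procs: int) -> float:
    """Fixed problem size: efficiency = speedup / procs = T(1) / (procs * T(procs))."""
    return t_serial / (procs * t_parallel)


def weak_scaling_efficiency(t_one: float, t_scaled: float) -> float:
    """Problem size grows with the processor count: efficiency = T(1) / T(procs)."""
    return t_one / t_scaled


# Hypothetical wall-clock times in seconds, for illustration only.
print(strong_scaling_efficiency(100.0, 30.0, 4))
print(weak_scaling_efficiency(50.0, 62.5))
```

    An efficiency close to 1 means the added processors are almost fully used; slow interconnects typically show up as efficiencies that fall quickly as the processor count grows.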

  7. The convergence analysis of parallel genetic algorithm based on allied strategy

    NASA Astrophysics Data System (ADS)

    Lin, Feng; Sun, Wei; Chang, K. C.

    2010-04-01

    Genetic algorithms (GAs) have been applied to many difficult optimization problems such as track assignment and hypothesis management for multisensor integration and data fusion. However, premature convergence has been a main problem for GAs. In order to prevent premature convergence, we introduce an allied strategy based on biological evolution and present a parallel genetic algorithm with the allied strategy (PGAAS). The PGAAS can prevent premature convergence, increase the optimization speed, and has been successfully applied in a few applications. In this paper, we first present a Markov chain model of the PGAAS. Based on this model, we analyze the convergence property of PGAAS. We then present the proof of global convergence for the PGAAS algorithm. The experimental results show that PGAAS is an efficient and effective parallel genetic algorithm. Finally, we discuss several potential applications of the proposed methodology.

  8. Real-time massively parallel processing of spectral optical coherence tomography data on graphics processing units

    NASA Astrophysics Data System (ADS)

    Sylwestrzak, Marcin; Szlag, Daniel; Szkulmowski, Maciej; Targowski, Piotr

    2011-06-01

    In this contribution we describe a specialised data processing system for Spectral Optical Coherence Tomography (SOCT) biomedical imaging which utilises massively parallel data processing on a low-cost Graphics Processing Unit (GPU). One of the most significant limitations of SOCT is the data processing time on the main processor of the computer (CPU), which is generally longer than the data acquisition. Therefore, real-time imaging with acceptable quality is limited to a small number of tomogram lines (A-scans). Recent progress in graphics card technology offers a promising solution to this problem. The newest graphics processing units allow not only very high-speed three-dimensional (3D) rendering but also general-purpose parallel numerical calculations with higher efficiency than the CPU provides. The presented system utilizes a CUDA(TM) graphics card and allows for very effective real-time SOCT imaging. The total imaging speed for 2D data consisting of 1200 A-scans is higher than the refresh rate of a 120 Hz monitor. 3D rendering of volume data built of 10 000 A-scans is performed at a frame rate of about 9 frames per second. These frame rates include data transfer from a frame grabber to the GPU, data processing and 3D rendering to the screen. The software description includes data flow, parallel processing and organization of threads. For illustration we show real-time high-resolution SOCT imaging of human skin and eye.

  9. Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study

    DOE PAGESBeta

    Radhakrishnan, Hari; Rouson, Damian W. I.; Morris, Karla; Shende, Sameer; Kassinos, Stavros C.

    2015-01-01

    This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.
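
    The replacement of a sequential summation by a binary tree algorithm can be pictured with the following Python sketch. It is only an illustration of the general technique, not the paper's coarray Fortran code: the list `values` stands in for one partial result per image (process), and the function name `tree_sum` is hypothetical. The additions inside one round touch disjoint pairs, so on a parallel machine each round would take one step.

```python
from typing import List, Sequence, Tuple


def tree_sum(values: Sequence[float]) -> Tuple[float, int]:
    """Pairwise (binary tree) reduction.

    Returns the total and the number of rounds, which is ceil(log2(n))
    instead of the n - 1 dependent steps of a sequential sum.
    """
    vals: List[float] = list(values)
    n = len(vals)
    stride = 1
    rounds = 0
    while stride < n:
        # Partner pairs (i, i + stride) are disjoint, so they could be combined concurrently.
        for i in range(0, n - stride, 2 * stride):
            vals[i] += vals[i + stride]
        stride *= 2
        rounds += 1
    return (vals[0] if vals else 0.0), rounds


total, rounds = tree_sum(range(1, 33))  # 32 partial results, as on a 32-core server
print(total, rounds)
```

    With 32 contributors the tree needs 5 rounds, whereas a sequential accumulation needs 31 dependent additions.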

  10. Parallelized CCHE2D flow model with CUDA Fortran on Graphics Process Units

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This paper presents the CCHE2D implicit flow model parallelized using CUDA Fortran programming technique on Graphics Processing Units (GPUs). A parallelized implicit Alternating Direction Implicit (ADI) solver using Parallel Cyclic Reduction (PCR) algorithm on GPU is developed and tested. This solve...
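
    For readers unfamiliar with Parallel Cyclic Reduction, the Python sketch below shows the general idea on a single tridiagonal system of the kind an ADI sweep produces. It is a plain serial illustration under stated assumptions, not the CCHE2D CUDA Fortran solver: the function name `pcr_solve` and the array layout are hypothetical, and no pivoting is done, so the matrix is assumed to be diagonally dominant.

```python
from typing import List, Sequence


def pcr_solve(a: Sequence[float], b: Sequence[float],
              c: Sequence[float], d: Sequence[float]) -> List[float]:
    """Solve a tridiagonal system by parallel cyclic reduction.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] == 0 and c[-1] == 0.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        na, nb, nc, nd = [0.0] * n, b[:], [0.0] * n, d[:]
        for i in range(n):  # iterations are independent: one GPU thread per row
            lo, hi = i - stride, i + stride
            if lo >= 0:  # eliminate the coupling to x[i - stride]
                alpha = -a[i] / b[lo]
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:  # eliminate the coupling to x[i + stride]
                gamma = -c[i] / b[hi]
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2
    # After ceil(log2(n)) rounds every row is decoupled from its neighbours.
    return [d[i] / b[i] for i in range(n)]
```

    Every row is updated in every round, which costs more arithmetic than the serial Thomas algorithm but exposes n independent updates per round, the property that makes the method attractive on GPUs.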

  11. Parallel versus sequential processing in print and braille reading.

    PubMed

    Veispak, Anneli; Boets, Bart; Ghesquière, Pol

    2012-01-01

    In the current study we investigated word, pseudoword and story reading in Dutch-speaking braille and print readers. To examine developmental patterns, these reading skills were assessed in both children and adults. The results reveal that braille readers read less accurately and more slowly than print readers. While item length has no impact on word reading accuracy and speed in the group of print readers, it has a significant impact on reading accuracy and speed in the group of braille readers, particularly in the younger sample. This suggests that braille readers rely more strongly on an enduring sequential reading strategy. Comparison of the different reading tasks suggests that the advantage in accuracy and speed of reading in adult as compared to young braille readers is achieved through semantic top-down processing.

  12. Parallel design of JPEG-LS encoder on graphics processing units

    NASA Astrophysics Data System (ADS)

    Duan, Hao; Fang, Yong; Huang, Bormin

    2012-01-01

    With recent technical advances in graphics processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth. Many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on an NVIDIA GPU using the compute unified device architecture (CUDA) programming technology. We use the block parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with 26.3x speedup over its original CPU code.
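
    Of the CUDA techniques listed in this abstract, the parallel prefix sum is the easiest to show in a few lines. The Python sketch below is a serial rendering of the well-known Hillis-Steele inclusive scan, given only to illustrate the technique; it is not the authors' kernel, and the function name `inclusive_scan` is hypothetical. Each pass of the outer loop corresponds to one data-parallel step in which every element is updated independently.

```python
from typing import List, Sequence


def inclusive_scan(values: Sequence[int]) -> List[int]:
    """Hillis-Steele inclusive prefix sum in ceil(log2(n)) passes."""
    out = list(values)
    offset = 1
    while offset < len(out):
        prev = out[:]  # double buffering: read the old values, write the new ones
        for i in range(offset, len(out)):  # independent updates: one thread per element
            out[i] = prev[i] + prev[i - offset]
        offset *= 2
    return out


print(inclusive_scan([3, 1, 4, 1, 5]))  # → [3, 4, 8, 9, 14]
```

    A common use of such a scan in block-parallel encoders, assumed here rather than stated in the abstract, is to turn the variable output sizes of independently compressed blocks into write offsets in the final bitstream.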

  13. Parallel processing architecture for computing inverse differential kinematic equations of the PUMA arm

    NASA Technical Reports Server (NTRS)

    Hsia, T. C.; Lu, G. Z.; Han, W. H.

    1987-01-01

    In advanced robot control problems, on-line computation of the inverse Jacobian solution is frequently required. Parallel processing architecture is an effective way to reduce computation time. A parallel processing architecture is developed for the inverse Jacobian (inverse differential kinematic equation) of the PUMA arm. The proposed pipeline/parallel algorithm can be implemented on an IC chip using systolic linear arrays. This implementation requires 27 processing cells and 25 time units. Computation time is thus significantly reduced.

  14. Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R; Ratterman, Joseph D; Smith, Brian E

    2014-11-11

    Endpoint-based parallel data processing with non-blocking collective instructions in a PAMI of a parallel computer is disclosed. The PAMI is composed of data communications endpoints, each including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task. The compute nodes are coupled for data communications through the PAMI. The parallel application establishes a data communications geometry specifying a set of endpoints that are used in collective operations of the PAMI by associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry; registering in each endpoint in the geometry a dispatch callback function for a collective operation; and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.

  15. The source of dual-task limitations: Serial or parallel processing of multiple response selections?

    PubMed Central

    Marois, René

    2014-01-01

    Although it is generally recognized that the concurrent performance of two tasks incurs costs, the sources of these dual-task costs remain controversial. The serial bottleneck model suggests that serial postponement of task performance in dual-task conditions results from a central stage of response selection that can only process one task at a time. Cognitive-control models, by contrast, propose that multiple response selections can proceed in parallel, but that serial processing of task performance is predominantly adopted because its processing efficiency is higher than that of parallel processing. In the present study, we empirically tested this proposition by examining whether parallel processing would occur when it was more efficient and financially rewarded. The results indicated that even when parallel processing was more efficient and was incentivized by financial reward, participants still failed to process tasks in parallel. We conclude that central information processing is limited by a serial bottleneck. PMID:23864266

  16. Parallel Processing Creates a Low-Cost Growth Path.

    ERIC Educational Resources Information Center

    Shekhel, Alex; Freeman, Eva

    1987-01-01

    Discusses the advantages of parallel processor computers in terms of expandability, cost, performance and reliability, and suggests that such computers be used in library automation systems as a cost-effective approach to planning for the growth of information services and computer applications. (CLB)

  17. Control of automatic processes: A parallel distributed-processing model of the stroop effect. Technical report

    SciTech Connect

    Cohen, J.D.; Dunbar, K.; McClelland, J.L.

    1988-06-16

    A growing body of evidence suggests that traditional views of automaticity are in need of revision. For example, automaticity has often been treated as an all-or-none phenomenon, and traditional theories have held that automatic processes are independent of attention. Yet recent empirical data suggest that automatic processes are continuous, and furthermore are subject to attentional control. In this paper we present a model of attention which addresses these issues. Using a parallel distributed processing framework we propose that the attributes of automaticity depend upon the strength of a process and that strength increases with training. Using the Stroop effect as an example, we show how automatic processes are continuous and emerge gradually with practice. Specifically, we present a computational model of the Stroop task which simulates the time course of processing as well as the effects of learning.

  18. Parallel processing numerical method for confined vortex dynamics and applications

    NASA Astrophysics Data System (ADS)

    Bistrian, Diana Alina

    2013-10-01

    This paper explores a combined analytical and numerical technique to investigate the hydrodynamic instability of confined swirling flows, with application to vortex rope dynamics in a Francis turbine diffuser, under conditions of sophisticated boundary constraints. We present a new approach based on the method of orthogonal decomposition in the Hilbert space, implemented with a spectral descriptor scheme in discrete space. A parallel implementation of the numerical scheme is conducted, reducing the computational time compared to other techniques.

  19. A Parallel Processing Algorithm for Remote Sensing Classification

    NASA Technical Reports Server (NTRS)

    Gualtieri, J. Anthony

    2005-01-01

    A current thread in parallel computation is the use of cluster computers created by networking a few to thousands of commodity general-purpose workstation-level computers using the Linux operating system. For example, on the Medusa cluster at NASA/GSFC, this provides supercomputing performance, 130 Gflops (Linpack Benchmark), at moderate cost, $370K. However, to be useful for scientific computing in the area of Earth science, issues of ease of programming, access to existing scientific libraries, and portability of existing code need to be considered. In this paper, I address these issues in the context of tools for rendering earth science remote sensing data into useful products. In particular, I focus on a problem that can be decomposed into a set of independent tasks, which on a serial computer would be performed sequentially, but with a cluster computer can be performed in parallel, giving an obvious speedup. To make the ideas concrete, I consider the problem of classifying hyperspectral imagery where some ground truth is available to train the classifier. In particular I will use the Support Vector Machine (SVM) approach as applied to hyperspectral imagery. The approach will be to introduce notions about parallel computation and then to restrict the development to the SVM problem. Pseudocode (an outline of the computation) will be described and then details specific to the implementation will be given. Then timing results will be reported to show what speedups are possible using parallel computation. The paper will close with a discussion of the results.

  20. Partitioning Rectangular and Structurally Nonsymmetric Sparse Matrices for Parallel Processing

    SciTech Connect

    B. Hendrickson; T.G. Kolda

    1998-09-01

    A common operation in scientific computing is the multiplication of a sparse, rectangular or structurally nonsymmetric matrix and a vector. In many applications the matrix-transpose-vector product is also required. This paper addresses the efficient parallelization of these operations. We show that the problem can be expressed in terms of partitioning bipartite graphs. We then introduce several algorithms for this partitioning problem and compare their performance on a set of test matrices.
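
    To make the two operations concrete, the Python sketch below computes a processor's share of y = A x and of z = A^T w for a matrix stored in compressed sparse row (CSR) form and split by rows. The row-wise split, the CSR layout and the function names are assumptions chosen for illustration; they are not the partitioning algorithms proposed in the paper, which decide how rows and columns should be assigned.

```python
from typing import Dict, List, Sequence


def matvec_rows(indptr: Sequence[int], indices: Sequence[int], data: Sequence[float],
                x: Sequence[float], rows: Sequence[int]) -> Dict[int, float]:
    """Entries of y = A @ x owned by one processor (its set of rows)."""
    return {r: sum(data[k] * x[indices[k]] for k in range(indptr[r], indptr[r + 1]))
            for r in rows}


def rmatvec_rows(indptr: Sequence[int], indices: Sequence[int], data: Sequence[float],
                 w: Sequence[float], rows: Sequence[int], ncols: int) -> List[float]:
    """Partial z = A.T @ w from one processor's rows.

    The partial vectors of all processors must be summed to obtain z.
    """
    z = [0.0] * ncols
    for r in rows:
        for k in range(indptr[r], indptr[r + 1]):
            z[indices[k]] += data[k] * w[r]
    return z


# A 3 x 4 rectangular example: [[1, 0, 2, 0], [0, 3, 0, 0], [4, 0, 0, 5]]
indptr, indices, data = [0, 2, 3, 5], [0, 2, 1, 0, 3], [1.0, 2.0, 3.0, 4.0, 5.0]
owned = [[0, 1], [2]]  # hypothetical assignment of rows to two processors
partials = [rmatvec_rows(indptr, indices, data, [1.0, 2.0, 3.0], rows, 4) for rows in owned]
z = [sum(col) for col in zip(*partials)]
print(z)
```

    The x entries a processor must fetch and the z entries it must send depend on which columns its rows touch, which is why the assignment of rows and columns can be viewed as a graph partitioning problem.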

  1. Parallel processing data network of master and slave transputers controlled by a serial control network

    DOEpatents

    Crosetto, D.B.

    1996-12-31

    The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor to a plurality of slave processors to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer, a digital signal processor, a parallel transfer controller, and two three-port memory devices. A communication switch within each node connects it to a fast parallel hardware channel through which all high density data arrives or leaves the node. 6 figs.

  2. Parallel processing data network of master and slave transputers controlled by a serial control network

    DOEpatents

    Crosetto, Dario B.

    1996-01-01

    The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor (100) to a plurality of slave processors (200) to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer (104), a digital signal processor (114), a parallel transfer controller (106), and two three-port memory devices. A communication switch (108) within each node (100) connects it to a fast parallel hardware channel (70) through which all high density data arrives or leaves the node.

  3. Toward a Model Framework of Generalized Parallel Componential Processing of Multi-Symbol Numbers

    ERIC Educational Resources Information Center

    Huber, Stefan; Cornelsen, Sonja; Moeller, Korbinian; Nuerk, Hans-Christoph

    2015-01-01

    In this article, we propose and evaluate a new model framework of parallel componential multi-symbol number processing, generalizing the idea of parallel componential processing of multi-digit numbers to the case of negative numbers by considering the polarity signs similar to single digits. In a first step, we evaluated this account by defining…

  4. Studies in optical parallel processing. [All optical and electro-optic approaches

    NASA Technical Reports Server (NTRS)

    Lee, S. H.

    1978-01-01

    Threshold and A/D devices for converting a gray scale image into a binary one were investigated for all-optical and opto-electronic approaches to parallel processing. Integrated optical logic circuits (IOC) and optical parallel logic devices (OPAL) were studied as an approach to processing optical binary signals. In the IOC logic scheme, a single row of an optical image is coupled into the IOC substrate at a time through an array of optical fibers. Parallel processing is carried out, on each image element of these rows, in the IOC substrate and the resulting output exits via a second array of optical fibers. The OPAL system for parallel processing, which uses a Fabry-Perot interferometer for image thresholding and analog-to-digital conversion, achieves a higher degree of parallel processing than is possible with IOC.

  5. Two-dimensional mesh-connected parallel processor with complex processing elements

    NASA Astrophysics Data System (ADS)

    Chen, Chaoyang; Shen, Xubang; Wang, Zhong; Sang, Hongshi

    2001-09-01

    LS MPP is a massively parallel processor. It has fine-grained parallelism with up to 4096 processing elements arranged in a SIMD architecture. The processing elements are arranged in a 64x64 two-dimensional mesh-connected array for low-level image processing. In this paper, the system architecture, the components of the processing element, the array controller, and the memory organization of the LS MPP processor are described. Finally, we discuss the performance of LS MPP.

  6. A control strategy for parallel hybrid electric vehicles based on extremum seeking

    NASA Astrophysics Data System (ADS)

    Dinçmen, Erkin; Aksun Güvenç, Bilin

    2012-02-01

    An energy management control strategy for a parallel hybrid electric vehicle based on the extremum-seeking method for splitting torque between the internal combustion engine and electric motor is proposed in this paper. The control strategy has two levels of operation: the upper and lower levels. The upper level decision-making controller chooses the vehicle operation mode such as the simultaneous use of the internal combustion engine and electric motor, use of only the electric motor, use of only the internal combustion engine, or regenerative braking. In the simultaneous use of the internal combustion engine and electric motor, the optimum energy distribution between these two sources of energy is determined via the extremum-seeking algorithm that searches for maximum drivetrain efficiency. A dynamic programming solution is also obtained and used to form a benchmark for performance evaluation of the proposed method based on extremum seeking. Detailed simulations using a realistic model are presented to illustrate the effectiveness of the methodology.

  7. Parallel processing in the brain's visual form system: an fMRI study

    PubMed Central

    Shigihara, Yoshihito; Zeki, Semir

    2014-01-01

    We here extend and complement our earlier time-based, magneto-encephalographic (MEG), study of the processing of forms by the visual brain (Shigihara and Zeki, 2013) with a functional magnetic resonance imaging (fMRI) study, in order to better localize the activity produced in early visual areas when subjects view simple geometric stimuli of increasing perceptual complexity (lines, angles, rhombuses) constituted from the same elements (lines). Our results show that all three categories of form activate all three visual areas with which we were principally concerned (V1–V3), with angles producing the strongest and rhombuses the weakest activity in all three. The difference between the activity produced by angles and rhombuses was significant, that between lines and rhombuses was trend significant while that between lines and angles was not. Taken together with our earlier MEG results, the present ones suggest that a parallel strategy is used in processing forms, in addition to the well-documented hierarchical strategy. PMID:25126064

  8. Signal processing applications of massively parallel charge domain computing devices

    NASA Technical Reports Server (NTRS)

    Fijany, Amir (Inventor); Barhen, Jacob (Inventor); Toomarian, Nikzad (Inventor)

    1999-01-01

    The present invention is embodied in a charge coupled device (CCD)/charge injection device (CID) architecture capable of performing a Fourier transform by simultaneous matrix vector multiplication (MVM) operations in respective plural CCD/CID arrays in parallel in O(1) steps. For example, in one embodiment, a first CCD/CID array stores charge packets representing a first matrix operator based upon permutations of a Hartley transform and computes the Fourier transform of an incoming vector. A second CCD/CID array stores charge packets representing a second matrix operator based upon different permutations of a Hartley transform and computes the Fourier transform of an incoming vector. The incoming vector is applied to the inputs of the two CCD/CID arrays simultaneously, and the real and imaginary parts of the Fourier transform are produced simultaneously in the time required to perform a single MVM operation in a CCD/CID array.
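
    The idea of obtaining the real and imaginary parts of a Fourier transform from two matrix-vector multiplications applied to the same input can be illustrated in software. The sketch below is not the patented CCD/CID construction: it assumes a real-valued input vector and one particular choice of two operators built from a Hartley matrix and a column-reversed (permuted) copy of it. All function names are hypothetical.

```python
import cmath
import math


def cas(theta):
    """Hartley kernel: cas(t) = cos(t) + sin(t)."""
    return math.cos(theta) + math.sin(theta)


def hartley_matrix(n):
    """n x n discrete Hartley transform matrix, H[k][j] = cas(2*pi*k*j/n)."""
    return [[cas(2 * math.pi * k * j / n) for j in range(n)] for k in range(n)]


def reverse_columns(matrix):
    """Permute columns j -> (-j mod n); turns cos + sin into cos - sin."""
    n = len(matrix)
    return [[row[(-j) % n] for j in range(n)] for row in matrix]


def fourier_operators(n):
    """Two operators (assumed construction) giving Re and Im of the DFT."""
    h = hartley_matrix(n)
    hp = reverse_columns(h)
    real_op = [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(h, hp)]
    imag_op = [[(b - a) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(h, hp)]
    return real_op, imag_op


def matvec(matrix, vector):
    """Plain matrix-vector multiplication (one MVM operation)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def direct_dft(vector):
    """Reference DFT used only to check the two-operator result."""
    n = len(vector)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * j / n) for j, x in enumerate(vector))
        for k in range(n)
    ]


if __name__ == "__main__":
    signal = [1.0, 2.0, 0.0, -1.0]
    real_op, imag_op = fourier_operators(len(signal))
    # The two products are independent, so they could run at the same time.
    print(matvec(real_op, signal))
    print(matvec(imag_op, signal))
```

    Because the two products share no intermediate results, both halves of the transform are available after the time of a single matrix-vector multiplication when the operators are evaluated side by side.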

  9. XTP as a transport protocol for distributed parallel processing

    SciTech Connect

    Strayer, W.T.; Lewis, M.J.; Cline, R.E. Jr.

    1994-12-31

    The Xpress Transfer Protocol (XTP) is a flexible transport layer protocol designed to provide efficient service without dictating the communication paradigm or the delivery characteristics that qualify the paradigm. XTP provides the tools to build communication services appropriate to the application. Current data delivery solutions for many popular cluster computing environments use TCP and UDP. We examine TCP, UDP, and XTP with respect to the communication characteristics typical of parallel applications. We perform measurements of end-to-end latency for several paradigms important to cluster computing. An implementation of XTP is shown to be comparable to TCP in end-to-end latency on preestablished connections, and does better for paradigms where connections must be constructed on the fly.

  10. Parallel workflow tools to facilitate human brain MRI post-processing

    PubMed Central

    Cui, Zaixu; Zhao, Chenxi; Gong, Gaolang

    2015-01-01

    Multi-modal magnetic resonance imaging (MRI) techniques are widely applied in human brain studies. To obtain specific brain measures of interest from MRI datasets, a number of complex image post-processing steps are typically required. Parallel workflow tools have recently been developed, concatenating individual processing steps and enabling fully automated processing of raw MRI data to obtain the final results. These workflow tools are also designed to make optimal use of available computational resources and to support the parallel processing of different subjects or of independent processing steps for a single subject. Automated, parallel MRI post-processing tools can greatly facilitate relevant brain investigations and are being increasingly applied. In this review, we briefly summarize these parallel workflow tools and discuss relevant issues. PMID:26029043

  11. Parallel processing of remotely sensed data: Application to the ATSR-2 instrument

    NASA Astrophysics Data System (ADS)

    Simpson, J.; McIntire, T.; Berg, J.; Tsou, Y.

    2007-01-01

    Massively parallel computational paradigms can mitigate many issues associated with the analysis of large and complex remotely sensed data sets. Recently, the Beowulf cluster has emerged as the most attractive, massively parallel architecture due to its low cost and high performance. Whereas most Beowulf designs have emphasized numerical modeling applications, the Parallel Image Processing Environment (PIPE) specifically addresses the unique requirements of remote sensing applications. Automated parallelization of user-defined analyses is fully supported. A neural network application, applied to Along Track Scanning Radiometer-2 (ATSR-2) data, shows the advantages and performance characteristics of PIPE.

  12. Arts Integration Parallels Between Music and Reading: Process, Product and Affective Response.

    ERIC Educational Resources Information Center

    Merrion, Margaret Dee

    The process of aesthetic education is not limited to the fine arts. Parallels may be identified in the language arts and particularly in the art of creative reading. As in a musical experience, a creative reader will apprehend the content of the literature and couple personal feelings with the events of the reading experience. Parallel brain…

  13. Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Byun, Chansup; Kwak, Dochan (Technical Monitor)

    2001-01-01

    A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel supercomputers is presented. The process integrates disciplines with diverse physical characteristics by retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach are demonstrated on several parallel computers.

  14. Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Kwak, Dochan (Technical Monitor)

    2002-01-01

    A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel supercomputers is presented. The process integrates disciplines with diverse physical characteristics by retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach are demonstrated on several parallel computers.

  15. A new cascaded control strategy for paralleled line-interactive UPS with LCL filter

    NASA Astrophysics Data System (ADS)

    Zhang, X. Y.; Zhang, X. H.; Li, L.; Luo, F.; Zhang, Y. S.

    2016-08-01

    A traditional uninterruptible power supply (UPS) has difficulty meeting output voltage quality and grid-side power quality requirements at the same time, and usually has disadvantages such as multi-stage conversion, complex structure, or harmonic current pollution of the utility grid. A three-phase three-level paralleled line-interactive UPS with LCL filter is presented in this paper. It can achieve output voltage quality and grid-side power quality control simultaneously with only a single-conversion power stage, but the multi-objective control strategy design is difficult. Based on a detailed analysis of the circuit structure and operation mechanism, a new cascaded control strategy for the power, voltage, and current is proposed. An outer current control loop based on resonant control theory is designed to ensure the grid-side power quality. An inner voltage control loop based on capacitance voltage and capacitance current feedback is designed to ensure the output voltage quality and avoid the resonance peak of the LCL filter. An improved repetitive controller is added to reduce the distortion of the output voltage. The setting of the controller parameters is discussed in detail. A 100 kVA UPS prototype is built and experiments under unbalanced resistive load and nonlinear load are carried out. Theoretical analysis and experimental results show the effectiveness of the control strategy. The paralleled line-interactive UPS not only maintains a constant three-phase balanced output voltage, but also provides comprehensive power quality management functions with three-phase balanced grid active power input, low THD of output voltage and grid current, and reactive power compensation. The UPS is an environmentally friendly load to the utility.

  16. Parallel processing a real code: A case history

    SciTech Connect

    Mandell, D.A.; Trease, H.E.

    1988-01-01

    A three-dimensional, time-dependent Free-Lagrange hydrodynamics code has been multitasked and autotasked on a Cray X-MP/416. The multitasking was done by using the Los Alamos Multitasking Control Library, which is a superset of the Cray multitasking library. Autotasking is done by using constructs which are only comment cards if the source code is not run through a preprocessor. The 3-D algorithm has presented a number of problems that simpler algorithms, such as 1-D hydrodynamics, did not exhibit. Problems in converting the serial code, originally written for a Cray 1, to a multitasking code are discussed. Autotasking of a rewritten version of the code is discussed. Timing results for subroutines and hot spots in the serial code are presented and suggestions for additional tools and debugging aids are given. Theoretical speedup results obtained from Amdahl's law and actual speedup results obtained on a dedicated machine are presented. Suggestions for designing large parallel codes are given. 8 refs., 13 figs.

  17. Parallel processing a three-dimensional free-lagrange code

    SciTech Connect

    Mandell, D.A.; Trease, H.E.

    1989-01-01

    A three-dimensional, time-dependent free-Lagrange hydrodynamics code has been multitasked and autotasked on a CRAY X-MP/416. The multitasking was done by using the Los Alamos Multitasking Control Library, which is a superset of the CRAY multitasking library. Autotasking is done by using constructs which are only comment cards if the source code is not run through a preprocessor. The three-dimensional algorithm has presented a number of problems that simpler algorithms, such as those for one-dimensional hydrodynamics, did not exhibit. Problems in converting the serial code, originally written for a CRAY-1, to a multitasking code are discussed. Autotasking of a rewritten version of the code is discussed. Timing results for subroutines and hot spots in the serial code are presented and suggestions for additional tools and debugging aids are given. Theoretical speedup results obtained from Amdahl's law and actual speedup results obtained on a dedicated machine are presented. Suggestions for designing large parallel codes are given.
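
    The theoretical speedup mentioned above follows from Amdahl's law, S(N) = 1 / ((1 - p) + p / N), where p is the fraction of run time that can be parallelized and N is the number of processors. The sketch below evaluates the formula for a few assumed values of p. The fractions and the processor count of four are illustrative assumptions, not figures taken from the report.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Theoretical speedup from Amdahl's law.

    parallel_fraction: share of serial run time that can run in parallel (0..1).
    processors: number of processors sharing the parallel part (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical parallel fractions evaluated on an assumed 4-processor machine.
    for fraction in (0.50, 0.90, 0.99):
        print(fraction, round(amdahl_speedup(fraction, 4), 3))
```

    Comparing such an upper bound with the measured speedup on a dedicated machine indicates how much time is lost to synchronization and other multitasking overhead rather than to inherently serial code.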

  18. Parallel Block Structured Adaptive Mesh Refinement on Graphics Processing Units

    SciTech Connect

    Beckingsale, D. A.; Gaudin, W. P.; Hornung, R. D.; Gunney, B. T.; Gamblin, T.; Herdman, J. A.; Jarvis, S. A.

    2014-11-17

    Block-structured adaptive mesh refinement is a technique that can be used when solving partial differential equations to reduce the number of zones necessary to achieve the required accuracy in areas of interest. These areas (shock fronts, material interfaces, etc.) are recursively covered with finer mesh patches that are grouped into a hierarchy of refinement levels. Despite the potential for large savings in computational requirements and memory usage without a corresponding reduction in accuracy, AMR adds overhead in managing the mesh hierarchy, adding complex communication and data movement requirements to a simulation. In this paper, we describe the design and implementation of a native GPU-based AMR library, including: the classes used to manage data on a mesh patch, the routines used for transferring data between GPUs on different nodes, and the data-parallel operators developed to coarsen and refine mesh data. We validate the performance and accuracy of our implementation using three test problems and two architectures: an eight-node cluster, and over four thousand nodes of Oak Ridge National Laboratory’s Titan supercomputer. Our GPU-based AMR hydrodynamics code performs up to 4.87× faster than the CPU-based implementation, and has been scaled to over four thousand GPUs using a combination of MPI and CUDA.

  19. Advantages of Parallel Processing and the Effects of Communications Time

    NASA Technical Reports Server (NTRS)

    Eddy, Wesley M.; Allman, Mark

    2000-01-01

    Many computing tasks involve heavy mathematical calculations, or analyzing large amounts of data. These operations can take a long time to complete using only one computer. Networks such as the Internet provide many computers with the ability to communicate with each other. Parallel or distributed computing takes advantage of these networked computers by arranging them to work together on a problem, thereby reducing the time needed to obtain the solution. The drawback to using a network of computers to solve a problem is the time wasted in communicating between the various hosts. The application of distributed computing techniques to a space environment or to use over a satellite network would therefore be limited by the amount of time needed to send data across the network, which would typically take much longer than on a terrestrial network. This experiment shows how much faster a large job can be performed by adding more computers to the task, what role communications time plays in the total execution time, and the impact a long-delay network has on a distributed computing system.
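
    The trade-off described here can be sketched with a toy cost model. It assumes the computation divides evenly among hosts and that the job needs a fixed number of synchronous message exchanges, each costing one network round trip. The model, the function names, and every number below are illustrative assumptions, not the experiment's method or measurements.

```python
def execution_time(compute_seconds, hosts, rounds, round_trip_seconds):
    """Total time under the toy model: divided compute plus communication.

    A single host needs no network traffic; with several hosts each of the
    `rounds` exchanges costs one round trip.
    """
    if hosts < 1:
        raise ValueError("hosts must be at least 1")
    communication = rounds * round_trip_seconds if hosts > 1 else 0.0
    return compute_seconds / hosts + communication


def speedup(compute_seconds, hosts, rounds, round_trip_seconds):
    """Speedup relative to running the whole job on one host."""
    single = execution_time(compute_seconds, 1, rounds, round_trip_seconds)
    multi = execution_time(compute_seconds, hosts, rounds, round_trip_seconds)
    return single / multi


if __name__ == "__main__":
    # Hypothetical round-trip times: a short terrestrial link and a long-delay link.
    for label, rtt in (("terrestrial", 0.05), ("long-delay", 0.6)):
        for hosts in (2, 4, 8, 16):
            print(label, hosts, round(speedup(100.0, hosts, 50, rtt), 2))
```

    In this model the communication term does not shrink as hosts are added, so a longer round trip lowers the speedup that any number of hosts can reach.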

  20. A Distributed Parallel Genetic Algorithm of Placement Strategy for Virtual Machines Deployment on Cloud Platform

    PubMed Central

    Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

    2014-01-01

    The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. It executes the genetic algorithm parallelly and distributedly on several selected physical hosts in the first stage. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform. PMID:25097872

  1. A distributed parallel genetic algorithm of placement strategy for virtual machines deployment on cloud platform.

    PubMed

    Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

    2014-01-01

    The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. It executes the genetic algorithm parallelly and distributedly on several selected physical hosts in the first stage. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform.

  2. A distributed parallel genetic algorithm of placement strategy for virtual machines deployment on cloud platform.

    PubMed

    Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

    2014-01-01

    The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. It executes the genetic algorithm parallelly and distributedly on several selected physical hosts in the first stage. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform. PMID:25097872

  3. Control of automatic processes: A parallel distributed-processing account of the Stroop effect. Technical report

    SciTech Connect

    Cohen, J.D.; Dunbar, K.; McClelland, J.L.

    1989-11-22

    A growing body of evidence suggests that traditional views of automaticity are in need of revision. For example, automaticity has often been treated as an all-or-none phenomenon, and traditional theories have held that automatic processes are independent of attention. Yet recent empirical data suggest that automatic processes are continuous, and furthermore are subject to attentional control. In this paper we present a model of attention which addresses these issues. Using a parallel distributed processing framework we propose that the attributes of automaticity depend upon the strength of a processing pathway and that strength increases with training. Using the Stroop effect as an example, we show how automatic processes are continuous and emerge gradually with practice. Specifically, we present a computational model of the Stroop task which simulates the time course of processing as well as the effects of learning. This was accomplished by combining the cascade mechanism described by McClelland (1979) with the back propagation learning algorithm (Rumelhart, Hinton, Williams, 1986). The model is able to simulate performance in the standard Stroop task, as well as aspects of performance in variants of this task which manipulate SOA, response set, and degree of practice. In the discussion we contrast our model with other models, and indicate how it relates to many of the central issues in the literature on attention, automaticity, and interference.

  4. An iterative expanding and shrinking process for processor allocation in mixed-parallel workflow scheduling.

    PubMed

    Huang, Kuo-Chan; Wu, Wei-Ya; Wang, Feng-Jian; Liu, Hsiao-Ching; Hung, Chun-Hao

    2016-01-01

    Parallel computation has been widely applied in a variety of large-scale scientific and engineering applications. Many studies indicate that exploiting both task and data parallelisms, i.e. mixed-parallel workflows, to solve large computational problems can get better efficacy compared with either pure task parallelism or pure data parallelism. Scheduling traditional workflows of pure task parallelism on parallel systems has long been known to be an NP-complete problem. Mixed-parallel workflow scheduling has to deal with an additional challenging issue of processor allocation. In this paper, we explore the processor allocation issue in scheduling mixed-parallel workflows of moldable tasks, called M-task, and propose an Iterative Allocation Expanding and Shrinking (IAES) approach. Compared to previous approaches, our IAES has two distinguishing features. The first is allocating more processors to the tasks on allocated critical paths for effectively reducing the makespan of workflow execution. The second is allowing the processor allocation of an M-task to shrink during the iterative procedure, resulting in a more flexible and effective process for finding better allocation. The proposed IAES approach has been evaluated with a series of simulation experiments and compared to several well-known previous methods, including CPR, CPA, MCPA, and MCPA2. The experimental results indicate that our IAES approach outperforms those previous methods significantly in most situations, especially when nodes of the same layer in a workflow might have unequal workloads. PMID:27504236

  5. An iterative expanding and shrinking process for processor allocation in mixed-parallel workflow scheduling.

    PubMed

    Huang, Kuo-Chan; Wu, Wei-Ya; Wang, Feng-Jian; Liu, Hsiao-Ching; Hung, Chun-Hao

    2016-01-01

    Parallel computation has been widely applied in a variety of large-scale scientific and engineering applications. Many studies indicate that exploiting both task and data parallelisms, i.e. mixed-parallel workflows, to solve large computational problems can get better efficacy compared with either pure task parallelism or pure data parallelism. Scheduling traditional workflows of pure task parallelism on parallel systems has long been known to be an NP-complete problem. Mixed-parallel workflow scheduling has to deal with an additional challenging issue of processor allocation. In this paper, we explore the processor allocation issue in scheduling mixed-parallel workflows of moldable tasks, called M-task, and propose an Iterative Allocation Expanding and Shrinking (IAES) approach. Compared to previous approaches, our IAES has two distinguishing features. The first is allocating more processors to the tasks on allocated critical paths for effectively reducing the makespan of workflow execution. The second is allowing the processor allocation of an M-task to shrink during the iterative procedure, resulting in a more flexible and effective process for finding better allocation. The proposed IAES approach has been evaluated with a series of simulation experiments and compared to several well-known previous methods, including CPR, CPA, MCPA, and MCPA2. The experimental results indicate that our IAES approach outperforms those previous methods significantly in most situations, especially when nodes of the same layer in a workflow might have unequal workloads.

  6. Parallel plan execution with self-processing networks

    NASA Technical Reports Server (NTRS)

    Dautrechy, C. Lynne; Reggia, James A.

    1989-01-01

    A critical issue for space operations is how to develop and apply advanced automation techniques to reduce the cost and complexity of working in space. In this context, it is important to examine how recent advances in self-processing networks can be applied for planning and scheduling tasks. For this reason, the feasibility of applying self-processing network models to a variety of planning and control problems relevant to spacecraft activities is being explored. Goals are to demonstrate that self-processing methods are applicable to these problems, and that MIRRORS/II, a general purpose software environment for implementing self-processing models, is sufficiently robust to support development of a wide range of application prototypes. Using MIRRORS/II and marker passing modelling techniques, a model of the execution of a Spaceworld plan was implemented. This is a simplified model of the Voyager spacecraft which photographed Jupiter, Saturn, and their satellites. It is shown that plan execution, a task usually solved using traditional artificial intelligence (AI) techniques, can be accomplished using a self-processing network. The fact that self-processing networks were applied to other space-related tasks, in addition to the one discussed here, demonstrates the general applicability of this approach to planning and control problems relevant to spacecraft activities. It is also demonstrated that MIRRORS/II is a powerful environment for the development and evaluation of self-processing systems.

  7. Distinct lateral inhibitory circuits drive parallel processing of sensory information in the mammalian olfactory bulb.

    PubMed

    Geramita, Matthew A; Burton, Shawn D; Urban, Nathan N

    2016-01-01

    Splitting sensory information into parallel pathways is a common strategy in sensory systems. Yet, how circuits in these parallel pathways are composed to maintain or even enhance the encoding of specific stimulus features is poorly understood. Here, we have investigated the parallel pathways formed by mitral and tufted cells of the olfactory system in mice and characterized the emergence of feature selectivity in these cell types via distinct lateral inhibitory circuits. We find differences in activity-dependent lateral inhibition between mitral and tufted cells that likely reflect newly described differences in the activation of deep and superficial granule cells. Simulations show that these circuit-level differences allow mitral and tufted cells to best discriminate odors in separate concentration ranges, indicating that segregating information about different ranges of stimulus intensity may be an important function of these parallel sensory pathways.

  8. Distinct lateral inhibitory circuits drive parallel processing of sensory information in the mammalian olfactory bulb.

    PubMed

    Geramita, Matthew A; Burton, Shawn D; Urban, Nathan N

    2016-01-01

    Splitting sensory information into parallel pathways is a common strategy in sensory systems. Yet, how circuits in these parallel pathways are composed to maintain or even enhance the encoding of specific stimulus features is poorly understood. Here, we have investigated the parallel pathways formed by mitral and tufted cells of the olfactory system in mice and characterized the emergence of feature selectivity in these cell types via distinct lateral inhibitory circuits. We find differences in activity-dependent lateral inhibition between mitral and tufted cells that likely reflect newly described differences in the activation of deep and superficial granule cells. Simulations show that these circuit-level differences allow mitral and tufted cells to best discriminate odors in separate concentration ranges, indicating that segregating information about different ranges of stimulus intensity may be an important function of these parallel sensory pathways. PMID:27351103

  9. Distinct lateral inhibitory circuits drive parallel processing of sensory information in the mammalian olfactory bulb

    PubMed Central

    Geramita, Matthew A; Burton, Shawn D; Urban, Nathan N

    2016-01-01

    Splitting sensory information into parallel pathways is a common strategy in sensory systems. Yet, how circuits in these parallel pathways are composed to maintain or even enhance the encoding of specific stimulus features is poorly understood. Here, we have investigated the parallel pathways formed by mitral and tufted cells of the olfactory system in mice and characterized the emergence of feature selectivity in these cell types via distinct lateral inhibitory circuits. We find differences in activity-dependent lateral inhibition between mitral and tufted cells that likely reflect newly described differences in the activation of deep and superficial granule cells. Simulations show that these circuit-level differences allow mitral and tufted cells to best discriminate odors in separate concentration ranges, indicating that segregating information about different ranges of stimulus intensity may be an important function of these parallel sensory pathways. DOI: http://dx.doi.org/10.7554/eLife.16039.001 PMID:27351103

  10. Parallel conjugate gradient: effects of ordering strategies, programming paradigms, and architectural platforms

    SciTech Connect

    Oliker, L.; Li, X.; Heber, G.; Biswas, R.

    2000-05-01

    The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multithreaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.

  11. Multislice perfusion of the kidneys using parallel imaging: image acquisition and analysis strategies.

    PubMed

    Gardener, Alexander G; Francis, Susan T

    2010-06-01

    Flow-sensitive alternating inversion recovery arterial spin labeling with parallel imaging acquisition is used to acquire single-shot, multislice perfusion maps of the kidney. A considerable problem for arterial spin labeling methods, which are based on sequential subtraction, is the movement of the kidneys due to respiratory motion between acquisitions. The effects of breathing strategy (free, respiratory-triggered and breath hold) are studied and the use of background suppression is investigated. The application of movement correction by image registration is assessed and perfusion rates are measured. Postacquisition image realignment is shown to improve visual quality and subsequent perfusion quantification. Using such correction, data can be collected from free breathing alone, without the need for a good respiratory trace and in the shortest overall acquisition time, advantageous for patient comfort. The addition of background suppression to arterial spin labeling data is shown to reduce the perfusion signal-to-noise ratio and underestimate perfusion.

  12. Parallel Conjugate Gradient: Effects of Ordering Strategies, Programming Paradigms, and Architectural Platforms

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Heber, Gerd; Biswas, Rupak

    2000-01-01

    The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multi-threaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.
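
    As a reading aid for the two CG records above, here is a minimal serial sketch of the iteration and of the sparse matrix-vector multiply that dominates its cost. It is not the authors' code: the CSR storage layout, the function names, and the tolerance are assumptions chosen for illustration, and no ordering, partitioning, or parallel execution is attempted.

```python
def spmv(indptr, indices, data, x):
    """Sparse matrix-vector product y = A x, with A in CSR form (assumed layout)."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        # Nonzeros of this row live in data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A given in CSR form."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = spmv(indptr, indices, data, p)   # the one SPMV per iteration
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    # Hypothetical 2x2 system: A = [[4, 1], [1, 3]], b = [1, 2].
    print(conjugate_gradient([0, 2, 4], [0, 1, 0, 1], [4.0, 1.0, 1.0, 3.0], [1.0, 2.0]))
```

    In this sketch, the order in which rows and columns are stored determines how `spmv` walks through `x`, which is where the cache-reuse effect of ordering reported in the abstracts would show up.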

  13. Performance of a VME-based parallel processing LIDAR data acquisition system (summary)

    SciTech Connect

    Moore, K.; Buttler, B.; Caffrey, M.; Soriano, C.

    1995-05-01

    It may be possible to make accurate real-time, autonomous, two- and three-dimensional wind measurements remotely with an elastic backscatter Light Detection and Ranging (LIDAR) system by incorporating digital parallel processing hardware into the data acquisition system. In this paper, we report the performance of a commercially available digital parallel processing system in implementing the maximum correlation technique for wind sensing using actual LIDAR data. Timing and numerical accuracy are benchmarked against a standard microprocessor implementation.

  14. Efficient Parallel Video Processing Techniques on GPU: From Framework to Implementation

    PubMed Central

    Su, Huayou; Wen, Mei; Wu, Nan; Ren, Ju; Zhang, Chunyuan

    2014-01-01

    Through reorganizing the execution order and optimizing the data structure, we proposed an efficient parallel framework for H.264/AVC encoder based on massively parallel architecture. We implemented the proposed framework by CUDA on NVIDIA's GPU. Not only the compute intensive components of the H.264 encoder are parallelized but also the control intensive components are realized effectively, such as CAVLC and deblocking filter. In addition, we proposed serial optimization methods, including the multiresolution multiwindow for motion estimation, multilevel parallel strategy to enhance the parallelism of intracoding as much as possible, component-based parallel CAVLC, and direction-priority deblocking filter. More than 96% of workload of H.264 encoder is offloaded to GPU. Experimental results show that the parallel implementation outperforms the serial program by 20 times of speedup ratio and satisfies the requirement of the real-time HD encoding of 30 fps. The loss of PSNR is from 0.14 dB to 0.77 dB, when keeping the same bitrate. Through the analysis to the kernels, we found that speedup ratios of the compute intensive algorithms are proportional with the computation power of the GPU. However, the performance of the control intensive parts (CAVLC) is much related to the memory bandwidth, which gives an insight for new architecture design. PMID:24757432

  15. Neurocognitive inefficacy of the strategy process.

    PubMed

    Klein, Harold E; D'Esposito, Mark

    2007-11-01

    The most widely used (and taught) protocols for strategic analysis-Strengths, Weaknesses, Opportunities, and Threats (SWOT) and Porter's (1980) Five Force Framework for industry analysis-have been found to be insufficient as stimuli for strategy creation or even as a basis for further strategy development. We approach this problem from a neurocognitive perspective. We see profound incompatibilities between the cognitive process-deductive reasoning-channeled into the collective mind of strategists within the formal planning process through its tools of strategic analysis (i.e., rational technologies) and the essentially inductive reasoning process actually needed to address ill-defined, complex strategic situations. Thus, strategic analysis protocols that may appear to be and, indeed, are entirely rational and logical are not interpretable as such at the neuronal substrate level where thinking takes place. The analytical structure (or propositional representation) of these tools results in a mental dead end, the phenomenon known in cognitive psychology as functional fixedness. The difficulty lies with the inability of the brain to make out meaningful (i.e., strategy-provoking) stimuli from the mental images (or depictive representations) generated by strategic analysis tools. We propose decreasing dependence on these tools and conducting further research employing brain imaging technology to explore complex data handling protocols with richer mental representation and greater potential for strategy creation. PMID:17804524

  16. Learning through Action: Parallel Learning Processes in Children and Adults

    ERIC Educational Resources Information Center

    Ethridge, Elizabeth A.; Branscomb, Kathryn R.

    2009-01-01

    Experiential learning has become an essential part of many educational settings from infancy through adulthood. While the effectiveness of active learning has been evaluated in youth and adult settings, few known studies have compared the learning processes of children and adults within the same project. This article contrasts the active learning…

  17. Parallel computing for simultaneous iterative tomographic imaging by graphics processing units

    NASA Astrophysics Data System (ADS)

    Bello-Maldonado, Pedro D.; López, Ricardo; Rogers, Colleen; Jin, Yuanwei; Lu, Enyue

    2016-05-01

    In this paper, we address the problem of accelerating inversion algorithms for nonlinear acoustic tomographic imaging by parallel computing on graphics processing units (GPUs). Nonlinear inversion algorithms for tomographic imaging often rely on iterative algorithms for solving an inverse problem and are thus computationally intensive. We study the simultaneous iterative reconstruction technique (SIRT) for the multiple-input-multiple-output (MIMO) tomography algorithm, which enables parallel computation of the grid points as well as parallel execution of multiple source excitations. Using GPUs and the Compute Unified Device Architecture (CUDA) programming model, an overall improvement of 26.33x was achieved when combining both approaches, compared with sequential algorithms. Furthermore, we propose an adaptive iterative relaxation factor and the use of non-uniform weights to improve the overall convergence of the algorithm. Using these techniques, fast computations can be performed in parallel without loss of image quality during the reconstruction process.
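
    For orientation, the following is a minimal sketch of a textbook SIRT update for a small linear system A x = b. It is an assumption-laden stand-in: the paper treats a nonlinear acoustic problem on GPUs with an adaptive relaxation factor, whereas this sketch uses a dense list-of-lists matrix, a fixed relaxation factor, and plain Python loops. The function and parameter names are hypothetical.

```python
def sirt(A, b, relaxation=1.0, iterations=200):
    """Generic SIRT for A x = b: x <- x + relaxation * C A^T R (b - A x).

    R and C are diagonal scalings by inverse row sums and inverse column sums.
    A is a dense list of rows; this is a fixed-relaxation, linear illustration.
    """
    n_rows, n_cols = len(A), len(A[0])
    row_sums = [sum(abs(v) for v in row) or 1.0 for row in A]
    col_sums = [sum(abs(A[i][j]) for i in range(n_rows)) or 1.0
                for j in range(n_cols)]
    x = [0.0] * n_cols
    for _ in range(iterations):
        # Row-normalized residual, computed once from the current estimate.
        residual = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n_cols))) / row_sums[i]
            for i in range(n_rows)
        ]
        # Every grid point j is updated from the same residual, so these
        # updates are independent of each other and could run in parallel.
        for j in range(n_cols):
            backprojection = sum(A[i][j] * residual[i] for i in range(n_rows))
            x[j] += relaxation * backprojection / col_sums[j]
    return x


if __name__ == "__main__":
    # Hypothetical consistent system with solution x = [1, 2].
    print(sirt([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0]))
```

    The independence of the per-grid-point updates inside one iteration is the property that makes a simultaneous scheme of this kind a natural fit for one-thread-per-grid-point execution.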

  18. Connectionism, parallel constraint satisfaction processes, and gestalt principles: (re) introducing cognitive dynamics to social psychology.

    PubMed

    Read, S J; Vanman, E J; Miller, L C

    1997-01-01

    We argue that recent work in connectionist modeling, in particular the parallel constraint satisfaction processes that are central to many of these models, has great importance for understanding issues of both historical and current concern for social psychologists. We first provide a brief description of connectionist modeling, with particular emphasis on parallel constraint satisfaction processes. Second, we examine the tremendous similarities between parallel constraint satisfaction processes and the Gestalt principles that were the foundation for much of modern social psychology. We propose that parallel constraint satisfaction processes provide a computational implementation of the principles of Gestalt psychology that were central to the work of such seminal social psychologists as Asch, Festinger, Heider, and Lewin. Third, we then describe how parallel constraint satisfaction processes have been applied to three areas that were key to the beginnings of modern social psychology and remain central today: impression formation and causal reasoning, cognitive consistency (balance and cognitive dissonance), and goal-directed behavior. We conclude by discussing implications of parallel constraint satisfaction principles for a number of broader issues in social psychology, such as the dynamics of social thought and the integration of social information within the narrow time frame of social interaction.

  19. Parallel Processing Method for Airborne Laser Scanning Data Using a PC Cluster and a Virtual Grid.

    PubMed

    Han, Soo Hee; Heo, Joon; Sohn, Hong Gyoo; Yu, Kiyun

    2009-01-01

    In this study, a parallel processing method using a PC cluster and a virtual grid is proposed for the fast processing of enormous amounts of airborne laser scanning (ALS) data. The method creates a raster digital surface model (DSM) by interpolating point data with inverse distance weighting (IDW), and produces a digital terrain model (DTM) by local minimum filtering of the DSM. To make a consistent comparison of performance between sequential and parallel processing approaches, the means of dealing with boundary data and of selecting interpolation centers were controlled for each processing node in the parallel approach. To test the speedup, efficiency and linearity of the proposed algorithm, actual ALS data of up to 134 million points were processed with a PC cluster consisting of one master node and eight slave nodes. The results showed that parallel processing provides better performance when the computational overhead, the number of processors, and the data size become large. It was verified that the proposed algorithm is a linear time operation and that the products obtained by parallel processing are identical to those produced by sequential processing. PMID:22574032
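
    The two raster steps named in the abstract (IDW interpolation to a DSM, then local minimum filtering to a DTM) can be sketched serially as below. This is an illustrative assumption, not the published method: the search radius, the power of 2, the window size, and all function names are hypothetical, and the virtual grid and cluster distribution are left out.

```python
import math


def idw(points, cx, cy, radius, power=2.0):
    """Inverse-distance-weighted height at (cx, cy) from (x, y, z) points."""
    num = den = 0.0
    for x, y, z in points:
        d = math.hypot(x - cx, y - cy)
        if d > radius:
            continue            # ignore points outside the search radius
        if d == 0.0:
            return z            # a point exactly at the cell center wins
        w = 1.0 / d ** power
        num += w * z
        den += w
    return num / den if den else None   # None marks a cell with no data


def build_dsm(points, x0, y0, cell, n_cols, n_rows, radius):
    """Raster DSM: interpolate at the center of every grid cell."""
    return [
        [idw(points, x0 + (c + 0.5) * cell, y0 + (r + 0.5) * cell, radius)
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


def local_minimum_filter(grid, half_window=1):
    """DTM estimate: minimum DSM value in a square window around each cell."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            vals = [
                grid[rr][cc]
                for rr in range(max(0, r - half_window), min(n_rows, r + half_window + 1))
                for cc in range(max(0, c - half_window), min(n_cols, c + half_window + 1))
                if grid[rr][cc] is not None
            ]
            row.append(min(vals) if vals else None)
        out.append(row)
    return out
```

    Because each cell needs points within `radius` and neighbours within `half_window`, a tile handed to one node depends on a margin of data from adjacent tiles, which is presumably why the abstract singles out the handling of boundary data.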

  20. Sculpting in cyberspace: Parallel processing the development of new software

    NASA Technical Reports Server (NTRS)

    Fisher, Rob

    1993-01-01

    Stimulating creativity in problem solving, particularly where software development is involved, is applicable to many disciplines. Metaphorical thinking keeps the problem in focus but in a different light, jarring people out of their mental ruts and sparking fresh insights. It forces the mind to stretch to find patterns between dissimilar concepts, in the hope of discovering unusual ideas in odd associations (Technology Review January 1993, p. 37). With a background in Engineering and Visual Design from MIT, I have for the past 30 years pursued a career as a sculptor of interdisciplinary monumental artworks that bridge the fields of science, engineering and art. Since 1979, I have pioneered the application of computer simulation to solve the complex problems associated with these projects. A recent project for the roof of the Carnegie Science Center in Pittsburgh made particular use of the metaphoric creativity technique described above. The problem-solving process led to the creation of hybrid software combining scientific, architectural and engineering visualization techniques. David Steich, a Doctoral Candidate in Electrical Engineering at Penn State, was commissioned to develop special software that enabled me to create innovative free-form sculpture. This paper explores the process of inventing the software through a detailed analysis of the interaction between an artist and a computer programmer.

  1. The role of parallelism in the real-time processing of anaphora

    PubMed Central

    Poirier, Josée; Walenski, Matthew; Shapiro, Lewis P.

    2012-01-01

    Parallelism effects refer to the facilitated processing of a target structure when it follows a similar, parallel structure. In coordination, a parallelism-related conjunction triggers the expectation that a second conjunct with the same structure as the first conjunct should occur. It has been proposed that parallelism effects reflect the use of the first structure as a template that guides the processing of the second. In this study, we examined the role of parallelism in real-time anaphora resolution by charting activation patterns in coordinated constructions containing anaphora, Verb-Phrase Ellipsis (VPE) and Noun-Phrase Traces (NP-traces). Specifically, we hypothesised that an expectation of parallelism would incite the parser to assume a structure similar to the first conjunct in the second, anaphora-containing conjunct. The speculation of a similar structure would result in early postulation of covert anaphora. Experiment 1 confirms that following a parallelism-related conjunction, first-conjunct material is activated in the second conjunct. Experiment 2 reveals that an NP-trace in the second conjunct is posited immediately where licensed, which is earlier than previously reported in the literature. In light of our findings, we propose an intricate relation between structural expectations and anaphor resolution. PMID:23741080

  2. Non-parallel processing: Gendered attrition in academic computer science

    NASA Astrophysics Data System (ADS)

    Cohoon, Joanne Louise Mcgrath

    2000-10-01

    This dissertation addresses the issue of disproportionate female attrition from computer science as an instance of gender segregation in higher education. By adopting a theoretical framework from organizational sociology, it demonstrates that the characteristics and processes of computer science departments strongly influence female retention. The empirical data identifies conditions under which women are retained in the computer science major at comparable rates to men. The research for this dissertation began with interviews of students, faculty, and chairpersons from five computer science departments. These exploratory interviews led to a survey of faculty and chairpersons at computer science and biology departments in Virginia. The data from these surveys are used in comparisons of the computer science and biology disciplines, and for statistical analyses that identify which departmental characteristics promote equal attrition for male and female undergraduates in computer science. This three-pronged methodological approach of interviews, discipline comparisons, and statistical analyses shows that departmental variation in gendered attrition rates can be explained largely by access to opportunity, relative numbers, and other characteristics of the learning environment. Using these concepts, this research identifies nine factors that affect the differential attrition of women from CS departments. These factors are: (1) The gender composition of enrolled students and faculty; (2) Faculty turnover; (3) Institutional support for the department; (4) Preferential attitudes toward female students; (5) Mentoring and supervising by faculty; (6) The local job market, starting salaries, and competitiveness of graduates; (7) Emphasis on teaching; and (8) Joint efforts for student success. This work contributes to our understanding of the gender segregation process in higher education. In addition, it contributes information that can lead to effective solutions for an

  3. Note on parallel processing techniques for algebraic equations, ordinary differential equations and partial differential equations

    SciTech Connect

    Allidina, A.Y.; Malinowski, K.; Singh, M.G.

    1982-12-01

    The possibilities were explored for enhancing parallelism in the simulation of systems described by algebraic equations, ordinary differential equations and partial differential equations. These techniques, using multiprocessors, were developed to speed up simulations, e.g. for nuclear accidents. Issues involved in their design included suitable approximations to bring the problem into a numerically manageable form and a numerical procedure to perform the computations necessary to solve the problem accurately. Parallel processing techniques used as simulation procedures, and a design of a simulation scheme and simulation procedure employing parallel computer facilities, were both considered.

  4. Parallel processing implementation for the coupled transport of photons and electrons using OpenMP

    NASA Astrophysics Data System (ADS)

    Doerner, Edgardo

    2016-05-01

    In this work the use of OpenMP to implement the parallel processing of the Monte Carlo (MC) simulation of the coupled transport for photons and electrons is presented. This implementation was carried out using a modified EGSnrc platform which enables the use of the Microsoft Visual Studio 2013 (VS2013) environment, together with the development tools available in the Intel Parallel Studio XE 2015 (XE2015). The performance study of this new implementation was carried out on a desktop PC with a multi-core CPU, taking as a reference the performance of the original platform. The results were satisfactory, both in terms of scalability and parallelization efficiency.

  5. Adapting high-level language programs for parallel processing using data flow

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1988-01-01

    EASY-FLOW, a very high-level data flow language, is introduced for the purpose of adapting programs written in a conventional high-level language to a parallel environment. The level of parallelism provided is of the large-grained variety in which parallel activities take place between subprograms or processes. A program written in EASY-FLOW is a set of subprogram calls as units, structured by iteration, branching, and distribution constructs. A data flow graph may be deduced from an EASY-FLOW program.

  6. Integration Of Parallel Image Processing With Symbolic And Neural Computations For Imagery Exploitation

    NASA Astrophysics Data System (ADS)

    Roman, Evelyn

    1990-02-01

    In this paper we discuss the work being done at Itek combining parallel, symbolic, and neural methodologies at different stages of processing for imagery exploitation. We describe a prototype system we have been implementing combining real-time parallel image processing on an 8-stage parallel image-processing engine (PIPE) computer with expert system software such as our Multi-Sensor Exploitation Assistant system on the Symbolics LISP machine and with neural computations on the PIPE and on its host IBM AT for target recognition and change detection applications. We also provide a summary of basic neural concepts, and show the commonality between neural nets and related mathematics, artificial intelligence, and traditional image processing concepts. This provides us with numerous choices for the implementation of constraint satisfaction, transformational invariance, inference and representational mechanisms, and software lifecycle engineering methodologies in the different computational layers. Our future work may include optical processing as well, for a real-time capability complementing the PIPE's.

  7. Parallel computer processing and modeling: applications for the ICU

    NASA Astrophysics Data System (ADS)

    Baxter, Grant; Pranger, L. Alex; Draghic, Nicole; Sims, Nathaniel M.; Wiesmann, William P.

    2003-07-01

    Current patient monitoring procedures in hospital intensive care units (ICUs) generate vast quantities of medical data, much of which is considered extemporaneous and not evaluated. Although sophisticated monitors to analyze individual types of patient data are routinely used in the hospital setting, this equipment lacks high order signal analysis tools for detecting long-term trends and correlations between different signals within a patient data set. Without the ability to continuously analyze disjoint sets of patient data, it is difficult to detect slow-forming complications. As a result, the early onset of conditions such as pneumonia or sepsis may not be apparent until the advanced stages. We report here on the development of a distributed software architecture test bed and software medical models to analyze both asynchronous and continuous patient data in real time. Hardware and software has been developed to support a multi-node distributed computer cluster capable of amassing data from multiple patient monitors and projecting near and long-term outcomes based upon the application of physiologic models to the incoming patient data stream. One computer acts as a central coordinating node; additional computers accommodate processing needs. A simple, non-clinical model for sepsis detection was implemented on the system for demonstration purposes. This work shows exceptional promise as a highly effective means to rapidly predict and thereby mitigate the effect of nosocomial infections.

  8. Design and implementation of the parallel processing system of multi-channel polarization images

    NASA Astrophysics Data System (ADS)

    Li, Zhi-yong; Huang, Qin-chao

    2013-08-01

    Compared with traditional optical intensity image processing, polarization image processing has two main problems: the amount of data is larger, and the processing tasks are more complex. To resolve these problems, a parallel processing system for multi-channel polarization images is designed using a multi-DSP technique. It contains a communication control unit (CCU) and a data processing array (DPA). The CCU controls communications inside and outside the system; its logic is implemented on an FPGA chip. The DPA is made up of four Digital Signal Processor (DSP) chips, which are interlinked in a loosely coupled manner. The DPA implements processing tasks, including image registration and image synthesis, by parallel processing methods. The polarization image parallel processing model is designed on multiple levels, including the system task, the algorithm, and the operation. Its program is written in assembly language. In the experiment, the polarization image resolution is 782x582 pixels and the pixel data length is 12 bits. After receiving 3 channels of polarization images simultaneously, the system executes parallel tasks to acquire the target polarization characteristics. Experimental results show that the system has good real-time performance and reliability. The processing time of image registration is 293.343 ms, while the registration accuracy reaches 0.5 pixel. The processing time of image synthesis is 3.199 ms.

  9. Reliable and Efficient Parallel Processing Algorithms and Architectures for Modern Signal Processing. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Liu, Kuojuey Ray

    1990-01-01

    Least-squares (LS) estimations and spectral decomposition algorithms constitute the heart of modern signal processing and communication problems. Implementations of recursive LS and spectral decomposition algorithms onto parallel processing architectures such as systolic arrays with efficient fault-tolerant schemes are the major concerns of this dissertation. There are four major results in this dissertation. First, we propose the systolic block Householder transformation with application to the recursive least-squares minimization. It is successfully implemented on a systolic array with a two-level pipelined implementation at the vector level as well as at the word level. Second, a real-time algorithm-based concurrent error detection scheme based on the residual method is proposed for the QRD RLS systolic array. The fault diagnosis, order degraded reconfiguration, and performance analysis are also considered. Third, the dynamic range, stability, error detection capability under finite-precision implementation, order degraded performance, and residual estimation under faulty situations for the QRD RLS systolic array are studied in detail. Finally, we propose the use of multi-phase systolic algorithms for spectral decomposition based on the QR algorithm. Two systolic architectures, one based on a triangular array and another based on a rectangular array, are presented for the multi-phase operations with fault-tolerant considerations. Eigenvectors and singular vectors can be easily obtained by using the multi-phase operations. Performance issues are also considered.

  10. The remote sensing image segmentation mean shift algorithm parallel processing based on MapReduce

    NASA Astrophysics Data System (ADS)

    Chen, Xi; Zhou, Liqing

    2015-12-01

    With the development of satellite remote sensing technology and the growth of remote sensing image data, traditional remote sensing image segmentation technology cannot meet the processing and storage requirements of massive remote sensing images. This article applies cloud computing and parallel computing technology to the remote sensing image segmentation process and builds a cheap and efficient computer cluster system that uses parallel processing to implement the MeanShift algorithm for remote sensing image segmentation based on the MapReduce model. The approach not only ensures the quality of remote sensing image segmentation but also improves segmentation speed and better meets real-time requirements. The MapReduce-based parallel MeanShift algorithm for remote sensing image segmentation has a certain significance and practical value.

  11. Parallels between a Collaborative Research Process and the Middle Level Philosophy

    ERIC Educational Resources Information Center

    Dever, Robin; Ross, Diane; Miller, Jennifer; White, Paula; Jones, Karen

    2014-01-01

    The characteristics of the middle level philosophy as described in This We Believe closely parallel the collaborative research process. The journey of one research team is described in relationship to these characteristics. The collaborative process includes strengths such as professional relationships, professional development, courageous…

  12. Parallel Processing of the Target Language during Source Language Comprehension in Interpreting

    ERIC Educational Resources Information Center

    Dong, Yanping; Lin, Jiexuan

    2013-01-01

    Two experiments were conducted to test the hypothesis that the parallel processing of the target language (TL) during source language (SL) comprehension in interpreting may be influenced by two factors: (i) link strength from SL to TL, and (ii) the interpreter's cognitive resources supplement to TL processing during SL comprehension. The…

  13. Solution-processed parallel tandem polymer solar cells using silver nanowires as intermediate electrode.

    PubMed

    Guo, Fei; Kubis, Peter; Li, Ning; Przybilla, Thomas; Matt, Gebhard; Stubhan, Tobias; Ameri, Tayebeh; Butz, Benjamin; Spiecker, Erdmann; Forberich, Karen; Brabec, Christoph J

    2014-12-23

    Tandem architecture is the most relevant concept to overcome the efficiency limit of single-junction photovoltaic solar cells. Series-connected tandem polymer solar cells (PSCs) have advanced rapidly during the past decade. In contrast, the development of parallel-connected tandem cells is lagging far behind due to the big challenge in establishing an efficient interlayer with high transparency and high in-plane conductivity. Here, we report all-solution fabrication of parallel tandem PSCs using silver nanowires as intermediate charge collecting electrode. Through a rational interface design, a robust interlayer is established, enabling the efficient extraction and transport of electrons from subcells. The resulting parallel tandem cells exhibit high fill factors of ∼60% and enhanced current densities which are identical to the sum of the current densities of the subcells. These results suggest that solution-processed parallel tandem configuration provides an alternative avenue toward high performance photovoltaic devices. PMID:25405589

  14. Evaluation of parallel reduction strategies for fusion of sensory information from a robot team

    NASA Astrophysics Data System (ADS)

    Lyons, Damian M.; Leroy, Joseph

    2015-05-01

    The advantage of using a team of robots to search or to map an area is that by navigating the robots to different parts of the area, searching or mapping can be completed more quickly. A crucial aspect of the problem is the combination, or fusion, of data from team members to generate an integrated model of the search/mapping area. In prior work we looked at the issue of removing mutual robot views from an integrated point cloud model built from laser and stereo sensors, leading to a cleaner and more accurate model. This paper addresses a further challenge: even with mutual views removed, the stereo data from a team of robots can quickly swamp a WiFi connection. This paper proposes and evaluates a communication and fusion approach based on the parallel reduction operation, where data is combined in a series of steps of increasing subsets of the team. Eight different strategies for selecting the subsets are evaluated for bandwidth requirements using three robot missions, each carried out with teams of four Pioneer 3-AT robots. Our results indicate that selecting groups to combine based on similar pose but distant location yields the best results.
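
    The parallel reduction operation mentioned in the abstract can be illustrated with a small pairwise tree reduction. This is only a sketch under stated assumptions: partners are paired by list order, which is a placeholder and not one of the eight selection strategies the paper evaluates, and `tree_reduce` and `combine` are hypothetical names.

```python
def tree_reduce(items, combine):
    """Combine items pairwise, level by level, as in a parallel reduction.

    Returns (result, steps). Within one level the pairs are independent, so
    they could be fused concurrently; steps is the number of levels.
    Pairing neighbours in list order is an assumption made for illustration.
    """
    level = list(items)
    if not level:
        raise ValueError("tree_reduce needs at least one item")
    steps = 0
    while len(level) > 1:
        merged = [combine(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            merged.append(level[-1])   # odd one out waits for the next level
        level = merged
        steps += 1
    return level[0], steps


if __name__ == "__main__":
    # Four hypothetical robots, each holding a set of observed grid cells.
    views = [{(0, 0)}, {(0, 1)}, {(1, 0)}, {(0, 0), (1, 1)}]
    fused, steps = tree_reduce(views, lambda a, b: a | b)
    print(sorted(fused), steps)
```

    With a team of four, the model is complete after two combining steps rather than three sequential merges, and each step involves fewer simultaneous senders than having every robot transmit to a single collector.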

  15. Managing internode data communications for an uninitialized process in a parallel computer

    DOEpatents

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Parker, Jeffrey J; Ratterman, Joseph D; Smith, Brian E

    2014-05-20

    A parallel computer includes nodes, each having main memory and a messaging unit (MU). Each MU includes computer memory, which in turn includes MU message buffers. Each MU message buffer is associated with an uninitialized process on the compute node. In the parallel computer, managing internode data communications for an uninitialized process includes: receiving, by an MU of a compute node, one or more data communications messages in an MU message buffer associated with an uninitialized process on the compute node; determining, by an application agent, that the MU message buffer associated with the uninitialized process is full prior to initialization of the uninitialized process; establishing, by the application agent, a temporary message buffer for the uninitialized process in main computer memory; and moving, by the application agent, data communications messages from the MU message buffer associated with the uninitialized process to the temporary message buffer in main computer memory.
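
    The sequence of steps in the abstract (receive, detect a full buffer, establish a temporary buffer, move the messages) can be mimicked with a toy model. Everything here is hypothetical: the class names, the fixed slot capacity, and the polling method are assumptions for illustration and do not describe the patented mechanism or any real messaging hardware.

```python
from collections import deque


class MessagingUnitBuffer:
    """Toy fixed-capacity MU message buffer for one uninitialized process."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = deque()

    def is_full(self):
        return len(self.slots) >= self.capacity

    def receive(self, message):
        if self.is_full():
            raise BufferError("MU message buffer is full")
        self.slots.append(message)


class ApplicationAgent:
    """Toy agent that spills a full MU buffer into main memory."""

    def __init__(self, mu_buffer):
        self.mu_buffer = mu_buffer
        self.temporary = None   # temporary buffer, created only when needed

    def poll(self):
        # Step 1: determine whether the MU buffer is full before the
        # owning process has been initialized.
        if not self.mu_buffer.is_full():
            return
        # Step 2: establish the temporary buffer in (simulated) main memory.
        if self.temporary is None:
            self.temporary = []
        # Step 3: move the messages, preserving arrival order.
        while self.mu_buffer.slots:
            self.temporary.append(self.mu_buffer.slots.popleft())
```

    After a poll that finds the buffer full, the MU buffer in this model is empty again and can keep accepting messages while the process is still uninitialized.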

  16. [Work processes in Family Health Strategy team].

    PubMed

    Pavoni, Daniela Soccoloski; Medeiros, Cássia Regina Gotler

    2009-01-01

    The Family Health Strategy requires a redefinition of the health care model, characterized by interdisciplinary team work. This study is aimed at understanding the work processes in a Family Health Team. The research was qualitative, and 10 team members were interviewed. Results demonstrated that the nurse performs a variety of functions that could be shared with other people; this overloads him/her and makes inherent job task execution difficult. Task planning and performing are usually done in teams, but some professionals get more involved in these activities. It was concluded that there is a need for the team to reflect upon the work process as well as reassess task assignment, so that each individual is able to perform the work and contribute to integrated work.

  17. Parallel processing across neural systems: implications for a multiple memory system hypothesis.

    PubMed

    Mizumori, Sheri J Y; Yeshenko, Oksana; Gill, Kathryn M; Davis, Denise M

    2004-11-01

    A common conceptualization of the organization of memory systems in brain is that different types of memory are mediated by distinct neural systems. Strong support for this view comes from studies that show double (or triple) dissociations between spatial, response, and emotional memories following selective lesions of hippocampus, striatum, and the amygdala. Here, we examine the extent to which hippocampal and striatal neural activity patterns support the multiple memory systems view. A comparison is made between hippocampal and striatal neural correlates with behavior during asymptotic performance of spatial and response maze tasks. Location- (or place), movement, and reward-specific firing patterns were found in both structures regardless of the task demands. Many, but not all, place fields of hippocampal and striatal neurons were similarly affected by changes in the visual and reward context regardless of the cognitive demands. Also, many, but not all, hippocampal and striatal movement-sensitive neurons showed significant changes in their behavioral correlates after a change in visual context, irrespective of cognitive strategy. Similar partial reorganization was observed following manipulations of the reward condition for cells recorded from both structures, again regardless of task. Assuming that representations that persist across context changes reflect learned information, we make the following conclusions. First, the consistent pattern of partial reorganization supports a view that the analysis of spatial, response, and reinforcement information is accomplished via an error-driven, or match-mismatch, algorithm across neural systems. Second, task-relevant processing occurs continuously within hippocampus and striatum regardless of the cognitive demands of the task. Third, given the high degree of parallel processing across allegedly different memory systems, we propose that different neural systems may effectively compete for control of a behavioral

  18. Parallel distributed processing and neural networks: origins, methodology and cognitive functions.

    PubMed

    Parks, R W; Long, D L; Levine, D S; Crockett, D J; McGeer, E G; McGeer, P L; Dalton, I E; Zec, R F; Becker, R E; Coburn, K L

    1991-10-01

    Parallel Distributed Processing (PDP), a computational methodology with origins in Associationism, is used to provide empirical information regarding neurobiological systems. Recently, supercomputers have enabled neuroscientists to model brain-behavior relationships. An overview of supercomputer architecture demonstrates the advantages of parallel over serial processing. Histological data provide physical evidence of the parallel distributed nature of certain aspects of the human brain, as do corresponding computer simulations. Whereas sensory networks follow more sequential neural network pathways, in vivo brain imaging studies of attention and rudimentary language tasks appear to involve multiple cortical and subcortical areas. Controversy remains as to whether associative models or Artificial Intelligence symbolic models better reflect neural networks of cognitive functions; however, considerable interest has shifted towards associative models.

  19. The study of multi-granularity switching based on parallel processing routing technology

    NASA Astrophysics Data System (ADS)

    Wang, Yubao; Hao, Xiaoran; Bai, Jian; Hu, Haochen

    A novel parallel processing optical code label-switched path (PP-OC-LSP) technology is proposed to achieve efficient conversion performance in the core router and large throughput in the optical network. It combines the merits of optical label switching (OLS), optical code-generalized multi-protocol label switching (OC-GMPLS), and optical code division multiplexing-paths (OCDM-paths) technology. In the proposed technology, the smallest switching granularity is an optical code, which carries the label information of data packets and is converted in parallel in the core router. In the edge node, the label and payload are separated by using the optical polarity characteristics, and encoded/decoded with two different code series based on OCDM techniques so as to process them in parallel. Compared with OLS, its switching capability has been extended to support fiber switching, wavelength switching, and OCDM switching. We present simulation results to demonstrate its performance over OCDM-paths technology.

  20. Computation of the Density Matrix in Electronic Structure Theory in Parallel on Multiple Graphics Processing Units.

    PubMed

    Cawkwell, M J; Wood, M A; Niklasson, Anders M N; Mniszewski, S M

    2014-12-01

    The algorithm developed in Cawkwell, M. J. et al. J. Chem. Theory Comput. 2012, 8, 4094 for the computation of the density matrix in electronic structure theory on a graphics processing unit (GPU) using the second-order spectral projection (SP2) method [Niklasson, A. M. N. Phys. Rev. B 2002, 66, 155115] has been efficiently parallelized over multiple GPUs on a single compute node. The parallel implementation provides significant speed-ups with respect to the single GPU version with no loss of accuracy. The performance and accuracy of the parallel GPU-based algorithm are compared with the performance of the SP2 algorithm and traditional matrix diagonalization methods on a multicore central processing unit (CPU).
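
    For readers unfamiliar with SP2, the recursion can be sketched on a CPU in plain Python. This is a minimal, serial illustration and not the authors' GPU code. It assumes the basic trace-based choice between the two projection polynomials (X^2 when the trace is above the occupation number, 2X - X^2 otherwise), and it uses Gershgorin circle estimates for the spectral bounds, which is an assumption made here for self-containment. The function names are hypothetical.

```python
def matmul(a, b):
    """Dense matrix product for small square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def gershgorin_bounds(h):
    """Cheap lower/upper bounds on the eigenvalues of a symmetric matrix."""
    lows, highs = [], []
    for i, row in enumerate(h):
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        lows.append(row[i] - radius)
        highs.append(row[i] + radius)
    return min(lows), max(highs)


def sp2_density_matrix(h, n_occ, tol=1e-12, max_iter=100):
    """Project onto the n_occ lowest eigenstates of h without diagonalizing."""
    n = len(h)
    e_min, e_max = gershgorin_bounds(h)
    if e_max == e_min:
        raise ValueError("spectral bounds coincide; nothing to project")
    scale = e_max - e_min
    # Map the spectrum into [0, 1] with the lowest states near 1.
    x = [[((e_max if i == j else 0.0) - h[i][j]) / scale for j in range(n)]
         for i in range(n)]
    for _ in range(max_iter):
        x2 = matmul(x, x)
        # Idempotency check: tr(X^2) == tr(X) only for a projector.
        if abs(trace(x2) - trace(x)) < tol:
            break
        if trace(x) > n_occ:
            x = x2  # X^2 lowers the trace
        else:
            # 2X - X^2 raises the trace
            x = [[2.0 * x[i][j] - x2[i][j] for j in range(n)] for i in range(n)]
    return x
```

    The cost of each iteration is dominated by one matrix-matrix multiplication, which is the kind of operation that maps well onto GPUs.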

  1. Processing communications events in parallel active messaging interface by awakening thread from wait state

    DOEpatents

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-10-22

    Processing data communications events in a parallel active messaging interface ('PAMI') of a parallel computer that includes compute nodes that execute a parallel application, with the PAMI including data communications endpoints, and the endpoints are coupled for data communications through the PAMI and through other data communications resources, including determining by an advance function that there are no actionable data communications events pending for its context, placing by the advance function its thread of execution into a wait state, waiting for a subsequent data communications event for the context; responsive to occurrence of a subsequent data communications event for the context, awakening by the thread from the wait state; and processing by the advance function the subsequent data communications event now pending for the context.
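
    The wait-and-awaken pattern described here resembles a condition-variable loop. The sketch below is a generic Python illustration, not the PAMI API: the Context class and its post and advance methods are hypothetical names, and events are modeled as arbitrary Python objects.

```python
import threading
from collections import deque


class Context:
    """Hypothetical communications context with a blocking advance function."""

    def __init__(self):
        self._events = deque()
        self._cond = threading.Condition()
        self.processed = []

    def post(self, event):
        """Called when a data communications event occurs for this context."""
        with self._cond:
            self._events.append(event)
            self._cond.notify()  # awaken a thread waiting in advance()

    def advance(self, timeout=None):
        """Process one pending event, sleeping if none is actionable."""
        with self._cond:
            if not self._events:
                # No actionable events: put this thread into a wait state.
                self._cond.wait_for(lambda: bool(self._events), timeout=timeout)
            if not self._events:
                return None  # timed out without any event arriving
            event = self._events.popleft()
        self.processed.append(event)  # stand-in for real event handling
        return event
```

    Compared with polling, the waiting thread consumes no CPU time until an event is posted, which seems to be the point of placing the advance function's thread into a wait state.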

  2. Running ATLAS workloads within massively parallel distributed applications using Athena Multi-Process framework (AthenaMP)

    NASA Astrophysics Data System (ADS)

    Calafiura, Paolo; Leggett, Charles; Seuster, Rolf; Tsulaia, Vakhtang; Van Gemmeren, Peter

    2015-12-01

    AthenaMP is a multi-process version of the ATLAS reconstruction, simulation and data analysis framework Athena. By leveraging Linux fork and copy-on-write mechanisms, it allows for sharing of memory pages between event processors running on the same compute node with little to no change in the application code. Originally targeted to optimize the memory footprint of reconstruction jobs, AthenaMP has demonstrated that it can reduce the memory usage of certain configurations of ATLAS production jobs by a factor of 2. AthenaMP has also evolved to become the parallel event-processing core of the recently developed ATLAS infrastructure for fine-grained event processing (Event Service) which allows the running of AthenaMP inside massively parallel distributed applications on hundreds of compute nodes simultaneously. We present the architecture of AthenaMP, various strategies implemented by AthenaMP for scheduling workload to worker processes (for example: Shared Event Queue and Shared Distributor of Event Tokens) and the usage of AthenaMP in the diversity of ATLAS event processing workloads on various computing resources: Grid, opportunistic resources and HPC.
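
    The fork-plus-shared-queue idea can be illustrated with Python's standard multiprocessing module. This is a toy sketch, not AthenaMP code: GEOMETRY, worker and process_events are hypothetical names, the "event processing" is a placeholder multiplication, and the "fork" start method assumes a POSIX system such as Linux. Note also that in CPython, reference-count updates write to object headers, so copy-on-write sharing is only approximate there.

```python
import multiprocessing as mp

# Large read-only data loaded once in the parent before forking. Forked
# workers inherit it without an explicit copy (copy-on-write pages).
GEOMETRY = list(range(100_000))


def worker(event_queue, result_queue):
    """Pull event ids from the shared queue until a None sentinel arrives."""
    while True:
        event_id = event_queue.get()
        if event_id is None:
            break
        # Placeholder for real event processing that reads the shared data.
        result_queue.put((event_id, GEOMETRY[event_id] * 2))


def process_events(event_ids, n_workers=2):
    """Distribute events to forked workers through one shared event queue."""
    ctx = mp.get_context("fork")  # POSIX only; an assumption of this sketch
    event_queue = ctx.Queue()
    result_queue = ctx.Queue()
    workers = [ctx.Process(target=worker, args=(event_queue, result_queue))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for event_id in event_ids:
        event_queue.put(event_id)
    for _ in workers:
        event_queue.put(None)  # one sentinel per worker
    # Drain results before joining so workers never block on a full pipe.
    results = dict(result_queue.get() for _ in event_ids)
    for w in workers:
        w.join()
    return results
```

    Because every worker pulls from the same queue, faster workers naturally take more events, which is the load-balancing property a shared event queue is meant to provide.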

  3. Parallel processing of face and house stimuli by V1 and specialized visual areas: a magnetoencephalographic (MEG) study

    PubMed Central

    Shigihara, Yoshihito; Zeki, Semir

    2014-01-01

    We used easily distinguishable stimuli of faces and houses constituted from straight lines, with the aim of learning whether they activate V1 on the one hand, and the specialized areas that are critical for the processing of faces and houses on the other, with similar latencies. Eighteen subjects took part in the experiment, which used magnetoencephalography (MEG) coupled to analytical methods to detect the time course of the earliest responses which these stimuli provoke in these cortical areas. Both categories of stimuli activated V1 and areas of the visual cortex outside it at around 40 ms after stimulus onset, and the amplitude elicited by face stimuli was significantly larger than that elicited by house stimuli. These results suggest that “low-level” and “high-level” features of form stimuli are processed in parallel by V1 and visual areas outside it. Taken together with our previous results on the processing of simple geometric forms (Shigihara and Zeki, 2013; Shigihara and Zeki, 2014), the present ones reinforce the conclusion that parallel processing is an important component in the strategy used by the brain to process and construct forms. PMID:25426050

  4. Intelligent approach for parallel HEV control strategy based on driving cycles

    NASA Astrophysics Data System (ADS)

    Montazeri-Gh, M.; Asadi, M.

    2011-02-01

    This article describes a methodological approach for the intelligent control of parallel hybrid electric vehicle (HEV) by the inclusion of the concept of driving cycles. In this approach, a fuzzy logic controller is designed to manage the internal combustion engine to work in the vicinity of its optimal condition instantaneously. In addition, based on the definition of microtrip, several driving patterns are classified that represent the congested to highway traffic conditions. The driving cycle and traffic conditions are then incorporated in an optimisation process to tune the fuzzy membership function parameters. In this study, the optimisation process is formulated to minimise the HEV fuel consumption (FC) and emissions as well as the satisfaction of the driving performance constraints. Finally, optimisation results are provided for three different driving cycles including ECE-EUDC, FTP and TEH-CAR. TEH-CAR is a driving cycle that is developed based on the experimental data collected from the real traffic condition in the city of Tehran. The results from the computer simulation show the effectiveness of the approach and reduction in FC and emissions while ensuring that the vehicle performance is not sacrificed.

  5. Parallel Digital Watermarking Process on Ultrasound Medical Images in Multicores Environment

    PubMed Central

    Khor, Hui Liang; Liew, Siau-Chuin; Zain, Jasni Mohd.

    2016-01-01

    Advances in communication networks have made it easier to transmit digital medical images to healthcare professionals over internal or public networks (e.g., the Internet), but they also expose the transmitted images to security threats, such as image tampering or the insertion of false data, which may cause inaccurate diagnosis and treatment. Distortion of medical images cannot be tolerated for diagnostic purposes; thus digital watermarking of medical images is introduced. So far most watermarking research has been done on single-frame medical images, which is impractical in a real environment. In this paper, digital watermarking of multiframe medical images is proposed. To speed up multiframe watermarking, parallel watermarking that exploits multicore technology is introduced. Experimental results show that the elapsed time of parallel watermarking is much shorter than that of sequential watermarking. PMID:26981111

  6. Parallel Digital Watermarking Process on Ultrasound Medical Images in Multicores Environment.

    PubMed

    Khor, Hui Liang; Liew, Siau-Chuin; Zain, Jasni Mohd

    2016-01-01

    Advances in communication networks have made it easier to transmit digital medical images to healthcare professionals over internal or public networks (e.g., the Internet), but they also expose the transmitted images to security threats, such as image tampering or the insertion of false data, which may cause inaccurate diagnosis and treatment. Distortion of medical images cannot be tolerated for diagnostic purposes; thus digital watermarking of medical images is introduced. So far most watermarking research has been done on single-frame medical images, which is impractical in a real environment. In this paper, digital watermarking of multiframe medical images is proposed. To speed up multiframe watermarking, parallel watermarking that exploits multicore technology is introduced. Experimental results show that the elapsed time of parallel watermarking is much shorter than that of sequential watermarking. PMID:26981111

  8. Recent development for the ITS code system: Parallel processing and visualization

    SciTech Connect

    Fan, W.C.; Turner, C.D.; Halbleib, J.A. Sr.; Kensek, R.P.

    1996-03-01

    A brief overview is given for two software developments related to the ITS code system. These developments provide parallel processing and visualization capabilities and thus allow users to perform ITS calculations more efficiently. Timing results and a graphical example are presented to demonstrate these capabilities.

  9. High Performance Parallel Processing Project: Industrial computing initiative. Progress reports for fiscal year 1995

    SciTech Connect

    Koniges, A.

    1996-02-09

    This project is a package of 11 individual CRADAs plus hardware. This innovative project established a three-year multi-party collaboration that is significantly accelerating the availability of commercial massively parallel processing computing software technology to U.S. government, academic, and industrial end-users. This report contains individual presentations from nine principal investigators along with overall program information.

  10. Parallel processing in the honeybee olfactory pathway: structure, function, and evolution.

    PubMed

    Rössler, Wolfgang; Brill, Martin F

    2013-11-01

    Animals face highly complex and dynamic olfactory stimuli in their natural environments, which require fast and reliable olfactory processing. Parallel processing is a common principle of sensory systems supporting this task, for example in visual and auditory systems, but its role in olfaction remained unclear. Studies in the honeybee focused on a dual olfactory pathway. Two sets of projection neurons connect glomeruli in two antennal-lobe hemilobes via lateral and medial tracts in opposite sequence with the mushroom bodies and lateral horn. Comparative studies suggest that this dual-tract circuit represents a unique adaptation in Hymenoptera. Imaging studies indicate that glomeruli in both hemilobes receive redundant sensory input. Recent simultaneous multi-unit recordings from projection neurons of both tracts revealed widely overlapping response profiles strongly indicating parallel olfactory processing. Whereas lateral-tract neurons respond fast with broad (generalistic) profiles, medial-tract neurons are odorant specific and respond slower. In analogy to "what-" and "where" subsystems in visual pathways, this suggests two parallel olfactory subsystems providing "what-" (quality) and "when" (temporal) information. Temporal response properties may support across-tract coincidence coding in higher centers. Parallel olfactory processing likely enhances perception of complex odorant mixtures to decode the diverse and dynamic olfactory world of a social insect.

  11. One Factor or Two Parallel Processes? Comorbidity and Development of Adolescent Anxiety and Depressive Disorder Symptoms

    ERIC Educational Resources Information Center

    Hale, William W., III; Raaijmakers, Quinten A. W.; Muris, Peter; van Hoof, Anne; Meeus, Wim H. J.

    2009-01-01

    Background: This study investigates whether anxiety and depressive disorder symptoms of adolescents from the general community are best described by a model that assumes they are indicative of one general factor or by a model that assumes they are two distinct disorders with parallel growth processes. Additional analyses were conducted to explore…

  12. Cocaine Use and Delinquent Behavior among High-Risk Youths: A Growth Model of Parallel Processes

    ERIC Educational Resources Information Center

    Dembo, Richard; Sullivan, Christopher

    2009-01-01

    We report the results of a parallel-process, latent growth model analysis examining the relationships between cocaine use and delinquent behavior among youths. The study examined a sample of 278 justice-involved juveniles completing at least one of three follow-up interviews as part of a National Institute on Drug Abuse-funded study. The results…

  13. Parallel Distributed Processing at 25: Further Explorations in the Microstructure of Cognition

    ERIC Educational Resources Information Center

    Rogers, Timothy T.; McClelland, James L.

    2014-01-01

    This paper introduces a special issue of "Cognitive Science" initiated on the 25th anniversary of the publication of "Parallel Distributed Processing" (PDP), a two-volume work that introduced the use of neural network models as vehicles for understanding cognition. The collection surveys the core commitments of the PDP…

  14. A Neurally Plausible Parallel Distributed Processing Model of Event-Related Potential Word Reading Data

    ERIC Educational Resources Information Center

    Laszlo, Sarah; Plaut, David C.

    2012-01-01

    The Parallel Distributed Processing (PDP) framework has significant potential for producing models of cognitive tasks that approximate how the brain performs the same tasks. To date, however, there has been relatively little contact between PDP modeling and data from cognitive neuroscience. In an attempt to advance the relationship between…

  15. Tracking the Continuity of Language Comprehension: Computer Mouse Trajectories Suggest Parallel Syntactic Processing

    ERIC Educational Resources Information Center

    Farmer, Thomas A.; Cargill, Sarah A.; Hindy, Nicholas C.; Dale, Rick; Spivey, Michael J.

    2007-01-01

    Although several theories of online syntactic processing assume the parallel activation of multiple syntactic representations, evidence supporting simultaneous activation has been inconclusive. Here, the continuous and non-ballistic properties of computer mouse movements are exploited, by recording their streaming x, y coordinates to procure…

  16. Psychodrama: A Creative Approach for Addressing Parallel Process in Group Supervision

    ERIC Educational Resources Information Center

    Hinkle, Michelle Gimenez

    2008-01-01

    This article provides a model for using psychodrama to address issues of parallel process during group supervision. Information on how to utilize the specific concepts and techniques of psychodrama in relation to group supervision is discussed. A case vignette of the model is provided.

  17. An Inconvenient Truth: An Application of the Extended Parallel Process Model

    ERIC Educational Resources Information Center

    Goodall, Catherine E.; Roberto, Anthony J.

    2008-01-01

    "An Inconvenient Truth" is an Academy Award-winning documentary about global warming presented by Al Gore. This documentary is appropriate for a lesson on fear appeals and the extended parallel process model (EPPM). The EPPM is concerned with the effects of perceived threat and efficacy on behavior change. Perceived threat is composed of an…

  18. Using the Extended Parallel Process Model to Examine Teachers' Likelihood of Intervening in Bullying

    ERIC Educational Resources Information Center

    Duong, Jeffrey; Bradshaw, Catherine P.

    2013-01-01

    Background: Teachers play a critical role in protecting students from harm in schools, but little is known about their attitudes toward addressing problems like bullying. Previous studies have rarely used theoretical frameworks, making it difficult to advance this area of research. Using the Extended Parallel Process Model (EPPM), we examined the…

  19. Parallel Process and Isomorphism: A Model for Decision Making in the Supervisory Triad

    ERIC Educational Resources Information Center

    Koltz, Rebecca L.; Odegard, Melissa A.; Feit, Stephen S.; Provost, Kent; Smith, Travis

    2012-01-01

    Parallel process and isomorphism are two supervisory concepts that are often discussed independently but rarely discussed in connection with each other. These two concepts, philosophically, have different historical roots, as well as different implications for interventions with regard to the supervisory triad. The authors examine the difference…

  20. Parallel pulse processing and data acquisition for high speed, low error flow cytometry

    DOEpatents

    Engh, G.J. van den; Stokdijk, W.

    1992-09-22

    A digitally synchronized parallel pulse processing and data acquisition system for a flow cytometer has multiple parallel input channels with independent pulse digitization and FIFO storage buffer. A trigger circuit controls the pulse digitization on all channels. After an event has been stored in each FIFO, a bus controller moves the oldest entry from each FIFO buffer onto a common data bus. The trigger circuit generates an ID number for each FIFO entry, which is checked by an error detection circuit. The system has high speed and low error rate. 17 figs.
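
    The read-out scheme in this abstract (one FIFO per channel, a shared trigger ID, and an ID check when the oldest entries are merged) can be sketched in software. This is a simplified model, not the patented circuit: Channel, trigger and read_event are hypothetical names, and pulse digitization is reduced to storing a number.

```python
from collections import deque


class Channel:
    """One input channel with its own FIFO storage buffer."""

    def __init__(self):
        self.fifo = deque()

    def digitize(self, event_id, pulse_height):
        # Each stored entry carries the ID issued by the trigger circuit.
        self.fifo.append((event_id, pulse_height))


def trigger(channels, event_id, pulse_heights):
    """Trigger circuit: digitize the same event on all channels at once."""
    for channel, height in zip(channels, pulse_heights):
        channel.digitize(event_id, height)


def read_event(channels):
    """Bus controller: move the oldest entry of every FIFO onto the bus."""
    entries = [channel.fifo.popleft() for channel in channels]
    ids = {event_id for event_id, _ in entries}
    if len(ids) != 1:
        # Error detection: the channels have fallen out of step.
        raise RuntimeError(f"event ID mismatch across channels: {sorted(ids)}")
    return ids.pop(), [height for _, height in entries]
```

    The ID check is what lets the channels digitize independently while still guaranteeing that values merged on the bus belong to the same event.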

  1. Parallel pulse processing and data acquisition for high speed, low error flow cytometry

    DOEpatents

    van den Engh, Gerrit J.; Stokdijk, Willem

    1992-01-01

    A digitally synchronized parallel pulse processing and data acquisition system for a flow cytometer has multiple parallel input channels with independent pulse digitization and FIFO storage buffer. A trigger circuit controls the pulse digitization on all channels. After an event has been stored in each FIFO, a bus controller moves the oldest entry from each FIFO buffer onto a common data bus. The trigger circuit generates an ID number for each FIFO entry, which is checked by an error detection circuit. The system has high speed and low error rate.

  2. Real-time target detection technology of large view-field infrared image based on multicore DSP parallel processing

    NASA Astrophysics Data System (ADS)

    Sun, Gang; Liu, Songlin; Wang, Weihua; Chen, Zengping

    2013-10-01

    In order to implement real-time detection of hedgehopping targets in large view-field infrared (LVIR) images, the paper proposes a fast algorithm flow to extract the target region of interest (ROI). The ground building region is rejected quickly and the target ROI is segmented roughly through background classification. Then the background image containing the target ROI is matched with the previous frame based on a mean removal normalized product correlation (MRNPC) similarity measure function. Finally, the target motion area is extracted by inter-frame difference in the time domain. According to the proposed algorithm flow, this paper designs a high-speed real-time signal processing hardware platform based on FPGA + DSP, and also presents a new parallel processing strategy, called function-level and task-level, which processes LVIR images in parallel across multiple cores and tasks. Experimental results show that the algorithm can effectively extract low-altitude aerial targets against complex backgrounds in a large view field, and that the newly designed hardware platform can implement real-time processing of the IR image with 50000x288 pixels per second in a large view-field infrared search system (LVIRSS).
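
    The two image operations named in this abstract can be illustrated in plain Python. The abstract does not give the MRNPC formula, so this sketch assumes it is the usual zero-mean normalized cross-correlation; the function names and the threshold parameter are hypothetical, and images are modeled as flat lists or nested lists of gray levels.

```python
import math


def mrnpc(a, b):
    """Mean-removed normalized product correlation of two equal-size patches.

    Assumed form: sum((a - mean_a) * (b - mean_b)) divided by the product
    of the two mean-removed norms. Result lies in [-1, 1].
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0  # flat patches carry no structure


def frame_difference(prev, curr, threshold):
    """Mark pixels whose gray level changed by more than threshold."""
    return [[abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]
```

    Removing the mean makes the similarity score insensitive to a uniform brightness offset between frames, which is presumably why a mean-removed measure is used for matching before differencing.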

  3. Big Data GPU-Driven Parallel Processing Spatial and Spatio-Temporal Clustering Algorithms

    NASA Astrophysics Data System (ADS)

    Konstantaras, Antonios; Skounakis, Emmanouil; Kilty, James-Alexander; Frantzeskakis, Theofanis; Maravelakis, Emmanuel

    2016-04-01

    Advances in graphics processing unit technology towards parallel architectures [1], comprising thousands of cores and multiple parallel threads, provide the hardware foundation for the rapid processing of various parallel applications in seismic big data analysis. Seismic data are normally stored as collections of vectors in massive matrices, growing rapidly in size as wider areas are covered, denser recording networks are established and decades of data are compiled together [2]. Yet, many processes in seismic data analysis are performed on each seismic event independently or as distinct tiles [3] of specific grouped seismic events within a much larger data set. Such processes, being independent of one another, can be performed in parallel, narrowing down processing times drastically [1,3]. This research work presents the development and implementation of three parallel processing algorithms using CUDA C [4] for the investigation of potentially distinct seismic regions [5,6] present in the vicinity of the southern Hellenic seismic arc. The algorithms, programmed and executed in parallel for comparison, are: fuzzy k-means clustering with expert knowledge [7] in assigning the overall number of clusters; density-based clustering [8]; and a self-developed spatio-temporal clustering algorithm encompassing expert [9] and empirical knowledge [10] for the specific area under investigation. Indexing terms: GPU parallel programming, CUDA C, heterogeneous processing, distinct seismic regions, parallel clustering algorithms, spatio-temporal clustering. References: [1] Kirk, D. and Hwu, W.: 'Programming massively parallel processors - A hands-on approach', 2nd Edition, Morgan Kaufmann Publishers, 2013 [2] Konstantaras, A., Valianatos, F., Varley, M.R. and Makris, J.P.: 'Soft-Computing Modelling of Seismicity in the Southern Hellenic Arc', Geoscience and Remote Sensing Letters, vol. 5 (3), pp. 323-327, 2008 [3] Papadakis, S. and

  4. MC64-ClustalWP2: a highly-parallel hybrid strategy to align multiple sequences in many-core architectures.

    PubMed

    Díaz, David; Esteban, Francisco J; Hernández, Pilar; Caballero, Juan Antonio; Guevara, Antonio; Dorado, Gabriel; Gálvez, Sergio

    2014-01-01

    We have developed MC64-ClustalWP2 as a new implementation of the Clustal W algorithm, integrating a novel parallelization strategy and significantly increasing the performance when aligning long sequences in architectures with many cores. It must be stressed that in such a process, the detailed analysis of both the software and hardware features and peculiarities is of paramount importance to reveal key points to exploit and optimize the full potential of parallelism in many-core CPU systems. The new parallelization approach has focused on the most time-consuming stages of this algorithm. In particular, the performance of the so-called progressive alignment has drastically improved, due to a fine-grained approach where the forward and backward loops were unrolled and parallelized. Another key approach has been the implementation of the new algorithm in a hybrid-computing system, integrating both an Intel Xeon multi-core CPU and a Tilera Tile64 many-core card. A comparison with other Clustal W implementations reveals the high performance of the new algorithm and strategy in many-core CPU architectures, in a scenario where the sequences to align are relatively long (more than 10 kb) and, hence, many-core GPU hardware cannot be used. Thus, MC64-ClustalWP2 runs multiple alignments more than 18x faster than the original Clustal W algorithm, and more than 7x faster than the best x86 parallel implementation to date, and is publicly available through a web service. Besides, these developments have been deployed in cost-effective personal computers and should be useful for life-science researchers, including the identification of identities and differences for mutation/polymorphism analyses, biodiversity and evolutionary studies and for the development of molecular markers for paternity testing, germplasm management and protection, to assist breeding, illegal traffic control, fraud prevention and for the protection of the intellectual property (identification

  5. Development and Evaluation of a Parallel Reaction Monitoring Strategy for Large-Scale Targeted Metabolomics Quantification.

    PubMed

    Zhou, Juntuo; Liu, Huiying; Liu, Yang; Liu, Jia; Zhao, Xuyang; Yin, Yuxin

    2016-04-19

    Recent advances in mass spectrometers which have yielded higher resolution and faster scanning speeds have expanded their application in metabolomics of diverse diseases. Using a quadrupole-Orbitrap LC-MS system, we developed an efficient large-scale quantitative method targeting 237 metabolites involved in various metabolic pathways using scheduled, parallel reaction monitoring (PRM). We assessed the dynamic range, linearity, reproducibility, and system suitability of the PRM assay by measuring concentration curves, biological samples, and clinical serum samples. The quantification performances of PRM and MS1-based assays in Q-Exactive were compared, and the MRM assay in QTRAP 6500 was also compared. The PRM assay monitoring 237 polar metabolites showed greater reproducibility and quantitative accuracy than MS1-based quantification and also showed greater flexibility in postacquisition assay refinement than the MRM assay in QTRAP 6500. We present a workflow for convenient PRM data processing using Skyline software which is free of charge. In this study we have established a reliable PRM methodology on a quadrupole-Orbitrap platform for evaluation of large-scale targeted metabolomics, which provides a new choice for basic and clinical metabolomics study. PMID:27002337

  6. Motor and perceptual sequence learning: different time course of parallel processes.

    PubMed

    Dirnberger, Georg; Novak-Knollmueller, Judith

    2013-07-10

    The aim was to determine the extent and time course of motor and perceptual learning in a procedural learning task, and the relation of these two processes. Because environmental constraints modulate the relative impact of different learning mechanisms, we chose a simple learning task similar to real-life exercise. Thirty-four healthy individuals performed a visuomotor serial reaction time task. Learning blocks with high stimulus-response compatibility were practiced repeatedly; in between these, participants performed test blocks with the same or a different (mirror-inverted, or new) stimulus sequence and/or with the same or a different (mirror-inverted) stimulus-response allocation. This design allowed us to measure the progress of motor learning and perceptual learning independently. Results showed that in the learning blocks, a steady reduction of the reaction times indicated that - as expected - participants improved their skills continuously. Analysis of the test blocks indicated that both motor learning and perceptual learning were significant. The two mechanisms were correlated (r=0.62, P<0.001). However, their time course was different: the impact of motor learning increased strongly from earlier to later intervals, whereas the progress of perceptual learning was more stable but slower. In conclusion, in a simple visuomotor learning task, participants can learn the motor sequence and the stimulus sequence in parallel. The positive correlation of motor and perceptual learning suggests that the two mechanisms act in synergy and are not alternative opposing strategies. The impact of these two learning mechanisms changes over time: motor learning sets in later and becomes relevant only in the course of training.

  7. A multi-satellite orbit determination problem in a parallel processing environment

    NASA Technical Reports Server (NTRS)

    Deakyne, M. S.; Anderle, R. J.

    1988-01-01

    The Engineering Orbit Analysis Unit at GE Valley Forge used an Intel Hypercube Parallel Processor to investigate the performance of parallel processors, and to gain experience with them, on a multi-satellite orbit determination problem. A general study was selected in which major blocks of computation for the multi-satellite orbit computations were used as units to be assigned to the various processors on the Hypercube. Problems encountered or successes achieved in addressing the orbit determination problem would be more likely to be transferable to other parallel processors. The prime objective was to study the algorithm to allow processing of observations later in time than those employed in the state update. Expertise in ephemeris determination was exploited in addressing these problems and the facility used to bring a realism to the study which would highlight the problems which may not otherwise be anticipated. Secondary objectives were to gain experience of a non-trivial problem in a parallel processor environment; to explore the necessary interplay of serial and parallel sections of the algorithm in terms of timing studies; and to explore the granularity (coarse vs. fine grain) in order to discover the limit above which there would be a risk of starvation, where the majority of nodes would be idle, and the limit below which the overhead associated with splitting the problem may require more work and communication time than is useful.

  8. Parallel processing in a host plus multiple array processor system for radar

    NASA Technical Reports Server (NTRS)

    Barkan, B. Z.

    1983-01-01

    Host plus multiple array processor architecture is demonstrated to yield a modular, fast, and cost-effective system for radar processing. Software methodology for programming such a system is developed. Parallel processing with pipelined data flow among the host, array processors, and discs is implemented. Theoretical analysis of performance is made and experimentally verified. The broad class of problems to which the architecture and methodology can be applied is indicated.

  9. Comparing Binaural Pre-processing Strategies III

    PubMed Central

    Warzybok, Anna; Ernst, Stephan M. A.

    2015-01-01

    A comprehensive evaluation of eight signal pre-processing strategies, including directional microphones, coherence filters, single-channel noise reduction, binaural beamformers, and their combinations, was undertaken with normal-hearing (NH) and hearing-impaired (HI) listeners. Speech reception thresholds (SRTs) were measured in three noise scenarios (multitalker babble, cafeteria noise, and single competing talker). Predictions of three common instrumental measures were compared with the general perceptual benefit caused by the algorithms. The individual SRTs measured without pre-processing and individual benefits were objectively estimated using the binaural speech intelligibility model. Ten listeners with NH and 12 HI listeners participated. The participants varied in age and pure-tone threshold levels. Although HI listeners required a better signal-to-noise ratio to obtain 50% intelligibility than listeners with NH, no differences in SRT benefit from the different algorithms were found between the two groups. With the exception of single-channel noise reduction, all algorithms showed an improvement in SRT of between 2.1 dB (in cafeteria noise) and 4.8 dB (in single competing talker condition). Model predictions with binaural speech intelligibility model explained 83% of the measured variance of the individual SRTs in the no pre-processing condition. Regarding the benefit from the algorithms, the instrumental measures were not able to predict the perceptual data in all tested noise conditions. The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. Although the model can predict the individual SRTs without pre-processing, further development is necessary to predict the benefits obtained from the algorithms at an individual level. PMID:26721922

  10. An automated workflow for parallel processing of large multiview SPIM recordings

    PubMed Central

    Schmied, Christopher; Steinbach, Peter; Pietzsch, Tobias; Preibisch, Stephan; Tomancak, Pavel

    2016-01-01

    Summary: Selective Plane Illumination Microscopy (SPIM) allows imaging of developing organisms in 3D at unprecedented temporal resolution over long periods of time. The resulting massive amounts of raw image data require extensive processing interactively via dedicated graphical user interface (GUI) applications. The consecutive processing steps can be easily automated and the individual time points can be processed independently, which lends itself to trivial parallelization on a high performance computing (HPC) cluster. Here, we introduce an automated workflow for processing large multiview, multichannel, multiillumination time-lapse SPIM data on a single workstation or in parallel on an HPC cluster. The pipeline relies on snakemake to resolve dependencies among consecutive processing steps and can be easily adapted to any cluster environment for processing SPIM data in a fraction of the time required to collect it. Availability and implementation: The code is distributed free and open source under the MIT license http://opensource.org/licenses/MIT. The source code can be downloaded from GitHub: https://github.com/mpicbg-scicomp/snakemake-workflows. Documentation can be found here: http://fiji.sc/Automated_workflow_for_parallel_Multiview_Reconstruction. Contact: schmied@mpi-cbg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26628585
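
    The observation that individual time points can be processed independently can be illustrated with a minimal Python sketch. It is not the snakemake workflow described in the record: `process_timepoint`, `process_all`, and the worker count are hypothetical names chosen for illustration, and a thread pool on one machine stands in for separate cluster jobs.

```python
from concurrent.futures import ThreadPoolExecutor

def process_timepoint(t):
    """Hypothetical stand-in for one time point's processing chain;
    here it only returns a label instead of touching image data."""
    return f"timepoint_{t:04d}_done"

def process_all(timepoints, workers=4):
    # Each time point is independent of the others, so the map needs no
    # coordination between tasks; results come back in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_timepoint, timepoints))

if __name__ == "__main__":
    print(process_all(range(3)))
```

    Because no task reads another task's output, the same structure carries over to one cluster job per time point, which is the property the abstract calls trivial parallelization.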

  11. Massively parallel per-pixel-based zerotree processing architecture for real-time video compression

    NASA Astrophysics Data System (ADS)

    Alagoda, Geoffrey; Rassau, Alexander M.; Eshraghian, Kamran

    2001-11-01

    In the span of a few years, mobile multimedia communication has rapidly become a significant area of research and development constantly challenging boundaries on a variety of technological fronts. Video compression, a fundamental component for most mobile multimedia applications, generally places heavy demands in terms of the required processing capacity. Hardware implementations of typical modern hybrid codecs require realisation of components such as motion compensation, wavelet transform, quantisation, zerotree coding and arithmetic coding in real-time. While the implementation of such codecs using a fast generic processor is possible, undesirable trade-offs in terms of power consumption and speed must generally be made. The improvement in power consumption that is achievable through the use of a slow-clocked massively parallel processing environment, while maintaining real-time processing speeds, should thus not be overlooked. An architecture to realise such a massively parallel solution for a zerotree entropy coder is, therefore, presented in this paper.

  12. Parallel processing architecture for H.264 deblocking filter on multi-core platforms

    NASA Astrophysics Data System (ADS)

    Prasad, Durga P.; Sonachalam, Sekar; Kunchamwar, Mangesh K.; Gunupudi, Nageswara Rao

    2012-03-01

    filter for multi core platforms such as HyperX technology. Parallel techniques such as parallel processing of independent macroblocks, sub blocks, and pixel row level are examined in this work. The deblocking architecture consists of a basic cell called deblocking filter unit (DFU) and dependent data buffer manager (DFM). The DFU can be used in several instances, catering to different performance needs. The DFM serves the data required for the different number of DFUs, and also manages all the neighboring data required for future data processing of DFUs. This approach achieves the scalability, flexibility, and performance excellence required in deblocking filters.

  13. A learnable parallel processing architecture towards unity of memory and computing

    NASA Astrophysics Data System (ADS)

    Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.

    2015-08-01

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.

  14. A learnable parallel processing architecture towards unity of memory and computing.

    PubMed

    Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J

    2015-08-14

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.

  15. A learnable parallel processing architecture towards unity of memory and computing.

    PubMed

    Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J

    2015-01-01

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area. PMID:26271243

  16. On the design and implementation of a parallel, object-oriented, image processing toolkit

    SciTech Connect

    Kamath, C; Baldwin, C; Fodor, I; Tang, N A

    2000-06-22

    Advances in technology have enabled us to collect data from observations, experiments, and simulations at an ever increasing pace. As these data sets approach the terabyte and petabyte range, scientists are increasingly using semi-automated techniques from data mining and pattern recognition to find useful information in the data. In order for data mining to be successful, the raw data must first be processed into a form suitable for the detection of patterns. When the data is in the form of images, this can involve a substantial amount of processing on very large data sets. To help make this task more efficient, they are designing and implementing an object-oriented image processing toolkit that specifically targets massively-parallel, distributed-memory architectures. They first show that it is possible to use object-oriented technology to effectively address the diverse needs of image applications. Next, they describe how they abstract out the similarities in image processing algorithms to enable re-use in the software. They will also discuss the difficulties encountered in parallelizing image algorithms on massively parallel machines as well as the bottlenecks to high performance. They will demonstrate the work using images from an astronomical data set, and illustrate how techniques such as filters and denoising through the thresholding of wavelet coefficients can be applied when a large image is distributed across several processors.

  17. A learnable parallel processing architecture towards unity of memory and computing

    PubMed Central

    Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.

    2015-01-01

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area. PMID:26271243

  18. An FPGA-based High Speed Parallel Signal Processing System for Adaptive Optics Testbed

    NASA Astrophysics Data System (ADS)

    Kim, H.; Choi, Y.; Yang, Y.

    In this paper a state-of-the-art FPGA (Field Programmable Gate Array) based high speed parallel signal processing system (SPS) for an adaptive optics (AO) testbed with 1 kHz wavefront error (WFE) correction frequency is reported. The AO system consists of a Shack-Hartmann sensor (SHS) and deformable mirror (DM), a tip-tilt sensor (TTS), a tip-tilt mirror (TTM) and an FPGA-based high performance SPS to correct wavefront aberrations. The SHS is composed of 400 subapertures and the DM of 277 actuators with Fried geometry, requiring an SPS with high speed parallel computing capability. In this study, the target WFE correction speed is 1 kHz; therefore, it requires massive parallel computing capabilities as well as strict hard real time constraints on measurements from sensors, matrix computation latency for correction algorithms, and output of control signals for actuators. In order to meet them, an FPGA based real-time SPS with parallel computing capabilities is proposed. In particular, the SPS is made up of a National Instruments (NI) real time computer and five FPGA boards based on the state-of-the-art Xilinx Kintex 7 FPGA. Programming is done with NI's LabVIEW environment, providing flexibility when applying different algorithms for WFE correction. It also provides a faster programming and debugging environment as compared to conventional ones. One of the five FPGAs is assigned to measure the TTS and calculate control signals for the TTM, while the remaining four are used to receive the SHS signal and calculate slopes for each subaperture and the correction signal for the DM. With these parallel processing capabilities of the SPS, the overall closed-loop WFE correction speed of 1 kHz has been achieved. System requirements, architecture and implementation issues are described; furthermore, experimental results are also given.

  19. Serial processing in reading aloud: no challenge for a parallel model.

    PubMed

    Zorzi, M

    2000-04-01

    K. Rastle and M. Coltheart (1999) challenged parallel models of reading by showing that the cost of irregularity in low-frequency exception words was modulated by the position of the irregularity in the word. This position-of-irregularity effect was taken as strong evidence of serial processing in reading. This article refutes Rastle and Coltheart's theoretical conclusions in 3 ways: First, a parallel model, the connectionist dual process model (M. Zorzi, G. Houghton, & B. Butterworth, 1998b), produces a position-of-irregularity effect. Second, the supposed serial effect can be reduced to a position-specific grapheme-phoneme consistency effect. Third, the position-of-irregularity effect vanishes when the experimental data are reanalyzed using grapheme-phoneme consistency as the covariate. This demonstration has broader implications for studies aiming at adjudicating between models: Strong inferences should be avoided until the computational models are actually tested.

  20. A structured command history for UNIX using a parallel distributed processing model

    SciTech Connect

    Uejio, J.Y.

    1989-03-01

    This thesis investigates the use of a structured history to assist users in recalling previously entered complex UNIX commands. A structured history is a database of commands that were previously entered. Two models are presented: the first model uses a conventional database and the second model uses a parallel distributed processing system. The conventional database system can recall commands by pattern matching based on command name, pattern matching on command options, frequency of use, and relative time. A simple prototype, created using the Emacs environment, was useful in recalling previously entered UNIX commands. The parallel distributed processing system stores UNIX commands by decomposing them into two-character sequences called bigrams. A prototype implementation used a two-layer network, in which each unit represents a bigram. The implementation showed features such as spelling correction and associated command recall. The ability to recall a list of commands satisfying particular criteria is among the advantages of a structured history. 19 refs., 8 figs., 1 tab.
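
    The bigram decomposition, and why it tolerates misspellings, can be shown with a short Python sketch. This is an illustration under stated assumptions, not the thesis prototype: a plain count of shared bigrams replaces the two-layer network, and the function names and the sample history are hypothetical.

```python
def bigrams(command):
    """Return the set of adjacent two-character sequences in a command."""
    return {command[i:i + 2] for i in range(len(command) - 1)}

def recall(query, history):
    """Return the stored command sharing the most bigrams with the query.
    Assumption: overlap counting stands in for the network's activation;
    ties go to the earliest stored command."""
    q = bigrams(query)
    return max(history, key=lambda cmd: len(q & bigrams(cmd)))

# Hypothetical command history used only for this example.
history = ["grep -rn main src", "tar -xzf data.tgz", "find . -name core"]

if __name__ == "__main__":
    # "gerp" is misspelled, yet most bigrams of the intended command survive.
    print(recall("gerp -rn main", history))
```

    A transposed pair of letters disturbs only the few bigrams that span it, so the remaining bigrams still point at the intended command. This is one way to read the spelling-correction behaviour the abstract reports.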

  1. Lower upper cycle independent implicit parallel processing algorithm for the compressible Navier-Stokes equations

    NASA Astrophysics Data System (ADS)

    Ghizawi, Nidal Awni

    Computational Fluid Dynamics problems of engineering interest are among the most demanding scientific problems in terms of the massive computational resources they require. Only parallel architecture computers offer the promise of providing orders of magnitude greater computational power. A common feature of the currently available implicit flow solvers for the compressible Navier-Stokes equations is that the solution for a multi-dimensional problem is obtained by the solution of a set of dependent problems which must be computed in series. In this study, a lower upper cycle independent (LUCI) implicit parallel processing algorithm for solving the compressible Navier-Stokes equations is proposed. A characteristic feature of this algorithm is that the solution for a multi-dimensional problem is obtained by the superposition of the solution of a set of independent problems which, therefore, enhances its parallel processing functionality. The accuracy and stability of this algorithm are carefully analyzed and compared with those of other algorithms. Flow computations using the LUCI algorithm are performed for two test cases which show the symmetry preserving property of this algorithm and demonstrate its accuracy. Through employing the principle of pseudo-parallelism, effects of domain decomposition on the stability and convergence of the LUCI and the Symmetric Successive Over-Relaxation (SSOR) schemes (representative of cycle dependent implicit schemes) are analyzed and quantified. Parallel implementation details of the LUCI (in two VERSIONS: I and II) and the SSOR (in VERSION I) schemes using the standard (portable) Message Passing Interface (MPI) on two computational platforms are given. These platforms are: Lewis Advanced Cluster Environment (LACE) which is an example of Network of Workstations (NOWs), and the Ohio Supercomputer CRAY T3D massive parallel computing environment. 
    Parallel performance results indicate that VERSION I of the LUCI scheme is superior to

  2. Scalability of preconditioners as a strategy for parallel computation of compressible fluid flow

    SciTech Connect

    Hansen, G.A.

    1996-05-01

    Parallel implementations of a Newton-Krylov-Schwarz algorithm are used to solve a model problem representing low Mach number compressible fluid flow over a backward-facing step. The Mach number is specifically selected to result in a numerically "stiff" matrix problem, based on an implicit finite volume discretization of the compressible 2D Navier-Stokes/energy equations using primitive variables. Newton's method is used to linearize the discrete system, and a preconditioned Krylov projection technique is used to solve the resulting linear system. Domain decomposition enables the development of a global preconditioner via the parallel construction of contributions derived from subdomains. Formation of the global preconditioner is based upon additive and multiplicative Schwarz algorithms, with and without subdomain overlap. The degree of parallelism of this technique is further enhanced with the use of a matrix-free approximation for the Jacobian used in the Krylov technique (in this case, GMRES(k)). Of paramount interest to this study is the implementation and optimization of these techniques on parallel shared-memory hardware, namely the Cray C90 and SGI Challenge architectures. These architectures were chosen as representative and commonly available to researchers interested in the solution of problems of this type. The Newton-Krylov-Schwarz solution technique is increasingly being investigated for computational fluid dynamics (CFD) applications due to the advantages of full coupling of all variables and equations, rapid non-linear convergence, and moderate memory requirements. A parallel version of this method that scales effectively on the above architectures would be extremely attractive to practitioners, resulting in efficient, cost-effective, parallel solutions exhibiting the benefits of the solution technique.
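
    A matrix-free Jacobian approximation of the kind the abstract mentions is commonly written as a finite difference, J(u)v ≈ (F(u + εv) − F(u))/ε, so a Krylov method such as GMRES needs only residual evaluations and never the assembled matrix. The Python sketch below illustrates that idea on a hypothetical two-equation system; the residual, the step size `eps`, and the function names are assumptions, not details taken from the report.

```python
def residual(u):
    """Hypothetical nonlinear system F(u) = 0 used only for illustration."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]

def jacobian_vector(F, u, v, eps=1e-7):
    """Approximate J(u) @ v by a forward difference, without forming J.
    Assumption: a fixed eps; practical codes scale it with |u| and |v|."""
    Fu = F(u)
    Fp = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(Fp, Fu)]

if __name__ == "__main__":
    u = [1.0, 2.0]
    v = [1.0, 0.0]
    # The exact Jacobian at u is [[2, 1], [1, 4]], so J @ v is its first column.
    print(jacobian_vector(residual, u, v))  # close to [2, 1]
```

    Each product costs one extra residual evaluation, and residual evaluations over subdomains are what a domain-decomposed code already performs in parallel.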

  3. Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts

    SciTech Connect

    1997-12-31

    This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.

  4. Initial operating capability for the hypercluster parallel-processing test bed

    NASA Technical Reports Server (NTRS)

    Cole, Gary L.; Blech, Richard A.; Quealy, Angela

    1989-01-01

    The NASA Lewis Research Center is investigating the benefits of parallel processing to applications in computational fluid and structural mechanics. To aid this investigation, NASA Lewis is developing the Hypercluster, a multi-architecture, parallel-processing test bed. The initial operating capability (IOC) being developed for the Hypercluster is described. The IOC will provide a user with a programming/operating environment that is interactive, responsive, and easy to use. The IOC effort includes the development of the Hypercluster Operating System (HYCLOPS). HYCLOPS runs in conjunction with a vendor-supplied disk operating system on a Front-End Processor (FEP) to provide interactive, run-time operations such as program loading, execution, memory editing, and data retrieval. Run-time libraries that augment the FEP FORTRAN libraries are being developed to support parallel and vector processing on the Hypercluster. Special utilities are being provided to enable passage of information about application programs and their mapping to the operating system. Communications between the FEP and the Hypercluster are being handled by dedicated processors, each running a Message-Passing Kernel (MPK). A shared-memory interface allows rapid data exchange between HYCLOPS and the communications processors. Input/output handlers are built into the HYCLOPS-MPK interface, eliminating the need for the user to supply separate I/O support programs on the FEP.

  5. Parallel particle swarm optimization on a graphics processing unit with application to trajectory optimization

    NASA Astrophysics Data System (ADS)

    Wu, Q.; Xiong, F.; Wang, F.; Xiong, Y.

    2016-10-01

    In order to reduce the computational time, a fully parallel implementation of the particle swarm optimization (PSO) algorithm on a graphics processing unit (GPU) is presented. Instead of being executed on the central processing unit (CPU) sequentially, PSO is executed in parallel via the GPU on the compute unified device architecture (CUDA) platform. The processes of fitness evaluation, updating of velocity and position of all particles are all parallelized and introduced in detail. Comparative studies on the optimization of four benchmark functions and a trajectory optimization problem are conducted by running PSO on the GPU (GPU-PSO) and CPU (CPU-PSO). The impact of design dimension, number of particles and size of the thread-block in the GPU and their interactions on the computational time is investigated. The results show that the computational time of the developed GPU-PSO is much shorter than that of CPU-PSO, with comparable accuracy, which demonstrates the remarkable speed-up capability of GPU-PSO.
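
    The three steps the abstract says are parallelized (fitness evaluation, velocity update, position update) can be seen in a compact serial Python sketch. It is not the GPU-PSO code: the benchmark function, the coefficients `w`, `c1`, `c2`, the bounds, and the seed are all illustrative assumptions. The global best is refreshed once per iteration so that the per-particle loop body has no dependence between particles, which is the part that would map to one GPU thread per particle.

```python
import random

def sphere(x):
    """Benchmark function with its minimum of 0 at the origin."""
    return sum(xi * xi for xi in x)

def pso(f, dim=2, particles=20, iters=200, seed=0,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    p_pos = [p[:] for p in pos]              # personal best positions
    p_val = [f(p) for p in pos]              # personal best values
    g = min(range(particles), key=p_val.__getitem__)
    g_pos, g_val = p_pos[g][:], p_val[g]
    for _ in range(iters):
        # Per-particle work: independent across particles within an iteration.
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (p_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])                  # fitness evaluation
            if val < p_val[i]:
                p_val[i], p_pos[i] = val, pos[i][:]
        # Reduction step: pick the best personal best as the new global best.
        g = min(range(particles), key=p_val.__getitem__)
        if p_val[g] < g_val:
            g_pos, g_val = p_pos[g][:], p_val[g]
    return g_pos, g_val

if __name__ == "__main__":
    print(pso(sphere))
```

    Only the reduction that selects the global best needs communication between particles, which is why thread-block size and particle count interact in the timing study the abstract describes.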

  6. Fast phase processing in off-axis holography by CUDA including parallel phase unwrapping.

    PubMed

    Backoach, Ohad; Kariv, Saar; Girshovitz, Pinhas; Shaked, Natan T

    2016-02-22

    We present a parallel processing implementation for rapid extraction of the quantitative phase maps from off-axis holograms on the Graphics Processing Unit (GPU) of the computer using compute unified device architecture (CUDA) programming. To obtain an efficient implementation, we parallelized both the wrapped phase map extraction algorithm and the two-dimensional phase unwrapping algorithm. In contrast to previous implementations, we utilized an unweighted least squares phase unwrapping algorithm that better suits parallelism. We compared the proposed algorithm run times on the CPU and the GPU of the computer for various sizes of off-axis holograms. Using the GPU implementation, we extracted the unwrapped phase maps from the recorded off-axis holograms at 35 frames per second (fps) for 4 mega pixel holograms, and at 129 fps for 1 mega pixel holograms, which presents the fastest processing framerates obtained so far, to the best of our knowledge. We then used common-path off-axis interferometric imaging to quantitatively capture the phase maps of a micro-organism with rapid flagellum movements. PMID:26906982

  7. Understanding decimal proportions: discrete representations, parallel access, and privileged processing of zero.

    PubMed

    Varma, Sashank; Karl, Stacy R

    2013-05-01

    Much of the research on mathematical cognition has focused on the numbers 1, 2, 3, 4, 5, 6, 7, 8, and 9, with considerably less attention paid to more abstract number classes. The current research investigated how people understand decimal proportions--rational numbers between 0 and 1 expressed in the place-value symbol system. The results demonstrate that proportions are represented as discrete structures and processed in parallel. There was a semantic interference effect: When understanding a proportion expression (e.g., "0.29"), both the correct proportion referent (e.g., 0.29) and the incorrect natural number referent (e.g., 29) corresponding to the visually similar natural number expression (e.g., "29") are accessed in parallel, and when these referents lead to conflicting judgments, performance slows. There was also a syntactic interference effect, generalizing the unit-decade compatibility effect for natural numbers: When comparing two proportions, their tenths and hundredths components are processed in parallel, and when the different components lead to conflicting judgments, performance slows. The results also reveal that zero decimals--proportions ending in zero--serve multiple cognitive functions, including eliminating semantic interference and speeding processing. The current research also extends the distance, semantic congruence, and SNARC effects from natural numbers to decimal proportions. These findings inform how people understand the place-value symbol system, and the mental implementation of mathematical symbol systems more generally.

  8. GWM-VI: groundwater management with parallel processing for multiple MODFLOW versions

    USGS Publications Warehouse

    Banta, Edward R.; Ahlfeld, David P.

    2013-01-01

    Groundwater Management–Version Independent (GWM–VI) is a new version of the Groundwater Management Process of MODFLOW. The Groundwater Management Process couples groundwater-flow simulation with a capability to optimize stresses on the simulated aquifer based on an objective function and constraints imposed on stresses and aquifer state. GWM–VI extends prior versions of Groundwater Management in two significant ways—(1) it can be used with any version of MODFLOW that meets certain requirements on input and output, and (2) it is structured to allow parallel processing of the repeated runs of the MODFLOW model that are required to solve the optimization problem. GWM–VI uses the same input structure for files that describe the management problem as that used by prior versions of Groundwater Management. GWM–VI requires only minor changes to the input files used by the MODFLOW model. GWM–VI uses the Joint Universal Parameter IdenTification and Evaluation of Reliability Application Programming Interface (JUPITER-API) to implement both version independence and parallel processing. GWM–VI communicates with the MODFLOW model by manipulating certain input files and interpreting results from the MODFLOW listing file and binary output files. Nearly all capabilities of prior versions of Groundwater Management are available in GWM–VI. GWM–VI has been tested with MODFLOW-2005, MODFLOW-NWT (a Newton formulation for MODFLOW-2005), MF2005-FMP2 (the Farm Process for MODFLOW-2005), SEAWAT, and CFP (Conduit Flow Process for MODFLOW-2005). This report provides sample problems that demonstrate a range of applications of GWM–VI and the directory structure and input information required to use the parallel-processing capability.

  9. Real-time processing of radar return on a parallel computer

    NASA Technical Reports Server (NTRS)

    Aalfs, David D.

    1992-01-01

    NASA is working with the FAA to demonstrate the feasibility of pulse Doppler radar as a candidate airborne sensor to detect low altitude windshears. The need to provide the pilot with timely information about possible hazards has motivated a demand for real-time processing of a radar return. Investigated here is parallel processing as a means of accommodating the high data rates required. A PC based parallel computer, called the transputer, is used to investigate issues in real time concurrent processing of radar signals. A transputer network is made up of an array of single instruction stream processors that can be networked in a variety of ways. They are easily reconfigured and software development is largely independent of the particular network topology. The performance of the transputer is evaluated in light of the computational requirements. A number of algorithms have been implemented on the transputers in OCCAM, a language specially designed for parallel processing. These include signal processing algorithms such as the Fast Fourier Transform (FFT), pulse-pair, and autoregressive modelling, as well as routing software to support concurrency. The most computationally intensive task is estimating the spectrum. Two approaches have been taken on this problem, the first and most conventional of which is to use the FFT. By using table look-ups for the basis function and other optimizing techniques, an algorithm has been developed that is sufficient for real time. The other approach is to model the signal as an autoregressive process and estimate the spectrum based on the model coefficients. This technique is attractive because it does not suffer from the spectral leakage problem inherent in the FFT. Benchmark tests indicate that autoregressive modeling is feasible in real time.
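
    The abstract mentions speeding up the FFT by using table look-ups for the basis function. The original OCCAM implementation is not shown, so the following Python sketch is only an illustration of the general idea: the complex exponentials (twiddle factors) are computed once and then indexed inside the butterfly loops. The names make_twiddle_table and fft_with_table are hypothetical.

```python
import cmath
import math


def make_twiddle_table(n):
    """Precompute the n/2 basis values exp(-2*pi*i*k/n) for a size-n FFT."""
    return [cmath.exp(-2j * math.pi * k / n) for k in range(n // 2)]


def fft_with_table(x, table=None):
    """Iterative radix-2 FFT that looks twiddle factors up in a table.

    Illustrative sketch only; not the transputer/OCCAM code from the report.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if table is None:
        table = make_twiddle_table(n)

    # Reorder the input by bit-reversed index.
    bits = n.bit_length() - 1
    a = [0j] * n
    for i in range(n):
        rev = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        a[rev] = complex(x[i])

    # Butterfly stages: sub-transform size doubles each pass.
    size = 2
    while size <= n:
        half = size // 2
        step = n // size  # stride into the shared table for this stage
        for start in range(0, n, size):
            for k in range(half):
                w = table[k * step]  # table look-up instead of exp()
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
        size *= 2
    return a
```

    When many frames of the same length are transformed, the table can be built once with make_twiddle_table and passed to every call, which is where a look-up approach would be expected to save time.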

  10. Parallel processing of general and specific threat during early stages of perception.

    PubMed

    You, Yuqi; Li, Wen

    2016-03-01

    Differential processing of threat can consummate as early as 100 ms post-stimulus. Moreover, early perception not only differentiates threat from non-threat stimuli but also distinguishes among discrete threat subtypes (e.g. fear, disgust and anger). Combining spatial-frequency-filtered images of fear, disgust and neutral scenes with high-density event-related potentials and intracranial source estimation, we investigated the neural underpinnings of general and specific threat processing in early stages of perception. Conveyed in low spatial frequencies, fear and disgust images evoked convergent visual responses with similarly enhanced N1 potentials and dorsal visual (middle temporal gyrus) cortical activity (relative to neutral cues; peaking at 156 ms). Nevertheless, conveyed in high spatial frequencies, fear and disgust elicited divergent visual responses, with fear enhancing and disgust suppressing P1 potentials and ventral visual (occipital fusiform) cortical activity (peaking at 121 ms). Therefore, general and specific threat processing operates in parallel in early perception, with the ventral visual pathway engaged in specific processing of discrete threats and the dorsal visual pathway in general threat processing. Furthermore, selectively tuned to distinctive spatial-frequency channels and visual pathways, these parallel processes underpin dimensional and categorical threat characterization, promoting efficient threat response. These findings thus lend support to hybrid models of emotion.

  11. A targeted enrichment strategy for massively parallel sequencing of angiosperm plastid genomes

    PubMed Central

    Stull, Gregory W.; Moore, Michael J.; Mandala, Venkata S.; Douglas, Norman A.; Kates, Heather-Rose; Qi, Xinshuai; Brockington, Samuel F.; Soltis, Pamela S.; Soltis, Douglas E.; Gitzendanner, Matthew A.

    2013-01-01

    • Premise of the study: We explored a targeted enrichment strategy to facilitate rapid and low-cost next-generation sequencing (NGS) of numerous complete plastid genomes from across the phylogenetic breadth of angiosperms. • Methods and Results: A custom RNA probe set including the complete sequences of 22 previously sequenced eudicot plastomes was designed to facilitate hybridization-based targeted enrichment of eudicot plastid genomes. Using this probe set and an Agilent SureSelect targeted enrichment kit, we conducted an enrichment experiment including 24 angiosperms (22 eudicots, two monocots), which were subsequently sequenced on a single lane of the Illumina GAIIx with single-end, 100-bp reads. This approach yielded nearly complete to complete plastid genomes with exceptionally high coverage (mean coverage: 717×), even for the two monocots. • Conclusions: Our enrichment experiment was highly successful even though many aspects of the capture process employed were suboptimal. Hence, significant improvements to this methodology are feasible. With this general approach and probe set, it should be possible to sequence more than 300 essentially complete plastid genomes in a single Illumina GAIIx lane (achieving ∼50× mean coverage). However, given the complications of pooling numerous samples for multiplex sequencing and the limited number of barcodes (e.g., 96) available in commercial kits, we recommend 96 samples as a current practical maximum for multiplex plastome sequencing. This high-throughput approach should facilitate large-scale plastid genome sequencing at any level of phylogenetic diversity in angiosperms. PMID:25202518
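
    The projection from 24 samples at 717× to more than 300 samples at roughly 50× can be checked with simple arithmetic. The sketch below assumes, as a simplification not stated in the abstract, that the total output of a lane is fixed and shared evenly among samples with unchanged enrichment efficiency. The function name is hypothetical.

```python
def expected_mean_coverage(observed_coverage, observed_samples, planned_samples):
    """Scale mean coverage inversely with the number of pooled samples.

    Assumes a fixed total lane output that is split evenly among samples.
    """
    return observed_coverage * observed_samples / planned_samples


# Figures taken from the abstract: 717x mean coverage with 24 samples.
coverage_300 = expected_mean_coverage(717, 24, 300)  # projected for 300 samples
coverage_96 = expected_mean_coverage(717, 24, 96)    # projected for 96 samples
```

    Under this assumption, 300 samples would come out near 57× and the recommended 96 samples near 179×, which is in line with the abstract's estimate of about 50× for more than 300 samples.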

  12. Acculturation strategies, coping process and acculturative stress.

    PubMed

    Kosic, Ankica

    2004-09-01

    Using structural equation modeling, this study examines the influences of motivational factors (Need for Cognitive Closure--NCC--and Decisiveness), coping strategies and acculturation strategies on levels of acculturative stress. Two groups of immigrants in Rome (Croatians n = 156 and Poles n = 179) completed a questionnaire that included scales for the various factors. Although our initial hypothesized model was not confirmed, a modified model showed that the motivational factors of NCC and Decisiveness indirectly influence acculturative stress. The modified model, with good fit indices, indicated that the relationships of NCC and Decisiveness with acculturative stress are mediated by coping strategies and acculturation strategies. Specifically, NCC is associated positively with avoidance coping, which in turn is negatively associated with the host group relationships and positively with the original culture maintenance. The last two dimensions predicted lower levels of acculturative stress. Decisiveness was positively associated with the problem-oriented coping and, negatively, with emotional and avoidance coping. PMID:15281915

  13. Real-time hybrid joint transform correlator with parallel processing architecture

    NASA Astrophysics Data System (ADS)

    Qin, Yuwen; Ge, Bao-Zhen; Zhang, Yimo; Zhao, Xiao-Dong; Huang, Zhanhua

    1996-12-01

    A real-time hybrid joint transform correlator (JTC) with a parallel processing architecture is presented. Its key parts are two liquid crystal light valve spatial light modulators, two VP32 image boards, and two optical wavefront-division multiplexers. Using this hybrid JTC, real-time high-efficiency joint transform correlation, high-speed joint transform correlation, and four-channel joint transform correlation were realized. The hybrid JTC system has also been used in the domain of morphological complex-valued kernel scale-space image processing. In this paper, the principles of the above experiments are described, and experimental results are given and analyzed.

  14. Scheduling Jobs with Variable Job Processing Times on Unrelated Parallel Machines

    PubMed Central

    Zhang, Guang-Qian; Wang, Jian-Jun; Liu, Ya-Jing

    2014-01-01

    Scheduling problems on m unrelated parallel machines with variable job processing times are considered, where the processing time of a job is a function of its position in a sequence, its starting time, and its resource allocation. The objective is to determine the optimal resource allocation and the optimal schedule to minimize a total cost function that depends on the total completion (waiting) time, the total machine load, the total absolute differences in completion (waiting) times on all machines, and the total resource cost. If the number of machines is a given constant, we propose a polynomial-time algorithm to solve the problem. PMID:24982933

  15. Lamb wave propagation modelling and simulation using parallel processing architecture and graphical cards

    NASA Astrophysics Data System (ADS)

    Paćko, P.; Bielak, T.; Spencer, A. B.; Staszewski, W. J.; Uhl, T.; Worden, K.

    2012-07-01

    This paper demonstrates new parallel computation technology and an implementation for Lamb wave propagation modelling in complex structures. A graphics processing unit (GPU) and the Compute Unified Device Architecture (CUDA), available in low-cost graphics cards in standard PCs, are used for Lamb wave propagation numerical simulations. The local interaction simulation approach (LISA) wave propagation algorithm has been implemented as an example. Other algorithms suitable for parallel discretization can also be used in practice. The method is illustrated using examples related to damage detection. The results demonstrate good accuracy and effective computational performance of very large models. The wave propagation modelling presented in the paper can be used in many practical applications of science and engineering.

  16. Development of Three-Dimensional Integration Technology for Highly Parallel Image-Processing Chip

    NASA Astrophysics Data System (ADS)

    Lee, Kang Wook; Nakamura, Tomonori; Sakuma, Katsuyuki; Park, Ki Tae; Shimazutsu, Hiroaki; Miyakawa, Nobuaki; Kim, Ki Yoon; Kurino, Hiroyuki; Koyanagi, Mitsumasa

    2000-04-01

    A new three-dimensional (3D) integration technology for realizing a highly parallel image-processing chip has been developed. Several LSI wafers are vertically stacked and glued to each other after thinning them using this new technology. This technology can be considered as both 3D LSI technology and wafer-scale 3D chip-on-chip packaging technology. The effective packaging density can be significantly increased by stacking the chips in a vertical direction. Several key techniques for this 3D integration have been developed. In this paper, we demonstrate the highly parallel image sensor chip with a 3D structure. The 3D image sensor test chip was fabricated using this new 3D integration technology and its basic performance was evaluated.

  17. Parallel performance of the fine-grain pipeline FPGA image processing system

    NASA Astrophysics Data System (ADS)

    Gorgoń, M.

    2012-06-01

    The use of FPGA circuits in imaging systems is increasing, and they compete with other computing environments. The article describes the guidelines to follow when choosing the type of image processing computing system, taking into consideration the advantages and disadvantages of each technology: general-purpose processor, digital signal processor, graphics processing unit, application-specific integrated circuit, and field-programmable gate array. Attention is drawn to various video transmission standards. The state of research and development trends in the field of FPGA-based image processing are briefly presented. A method for defining processing performance in image processing is proposed. It is proven that for a pipeline architecture implemented in FPGA, a linear speedup is achieved and parallel efficiency is equal to one.
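
    The claim of linear speedup and unit efficiency can be illustrated with the standard textbook model of an ideal pipeline, which may differ from the performance definition the article proposes. The sketch assumes every stage takes one time unit and that there are no stalls; the function names are hypothetical.

```python
def pipeline_speedup(stages, items):
    """Speedup of an ideal pipeline over running all stages sequentially.

    Sequential time: stages * items time units.
    Pipelined time:  stages units to fill, then one result per unit.
    """
    sequential_time = stages * items
    pipelined_time = stages + items - 1
    return sequential_time / pipelined_time


def parallel_efficiency(stages, items):
    """Speedup divided by the number of stages working in parallel."""
    return pipeline_speedup(stages, items) / stages


# For a long stream of pixels the fill time becomes negligible.
speedup_long_stream = pipeline_speedup(stages=8, items=1_000_000)
efficiency_long_stream = parallel_efficiency(stages=8, items=1_000_000)
```

    In this model the speedup approaches the number of stages and the efficiency approaches one as the number of processed items grows, which matches the behavior the abstract reports for a pipeline streaming image data.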

  18. Parallel architecture for labeling, segmentation, and lexical processing in speech understanding

    SciTech Connect

    Bronson, E.C.; Siegel, L.J.

    1983-01-01

    Speech understanding is a complex task which requires extensive computation. To increase the processing speed, a speech understanding system is decomposed into tasks which can be performed by a series of distributed processing subsystems. An architecture to perform labeling, segmentation, and lexical processing is described. Using a parametric characterization of the speech signal, this system divides an utterance into labeled homogeneous regions. The system then performs dictionary lookups based on all probable labelings and segmentations in order to generate a complete set of word hypotheses. Using realistic assumptions from existing speech understanding systems, a statistical model of speech input, and simulations of the speech processing algorithms, the attributes of the parallel system to perform labeling, segmentation, and lexical processing for real-time speech understanding are derived. 36 references.

  19. A Parallel and Distributed Processing Model of Joint Attention, Social-Cognition and Autism

    PubMed Central

    Mundy, Peter; Sullivan, Lisa; Mastergeorge, Ann M.

    2009-01-01

    The impaired development of joint attention is a cardinal feature of autism. Therefore, understanding the nature of joint attention is central to research on this disorder. Joint attention may be best defined in terms of an information processing system that begins to develop by 4–6 months of age. This system integrates the parallel processing of internal information about one’s own visual attention with external information about the visual attention of other people. This type of joint encoding of information about self and other attention requires the activation of a distributed anterior and posterior cortical attention network. Genetic regulation, in conjunction with self-organizing behavioral activity, guides the development of functional connectivity in this network. With practice in infancy the joint processing of self-other attention becomes automatically engaged as an executive function. It can be argued that this executive joint-attention is fundamental to human learning, as well as the development of symbolic thought, social-cognition and social-competence throughout the life span. One advantage of this parallel and distributed processing model of joint attention (PDPM) is that it directly connects theory on social pathology to a range of phenomena in autism associated with neural connectivity, constructivist and connectionist models of cognitive development, early intervention, activity-dependent gene expression, and atypical ocular motor control. PMID:19358304

  20. An experimental research on the mixing process of supersonic oxygen-iodine parallel streams

    NASA Astrophysics Data System (ADS)

    Wang, Zengqiang; Sang, Fengting; Zhang, Yuelong; Hui, Xiaokang; Xu, Mingxiu; Zhang, Peng; Zhao, Weili; Fang, Benjie; Duo, Liping; Jin, Yuqi

    2014-12-01

    The O2(1Δ)/I2 mixing process is one of the most important steps in chemical oxygen-iodine laser (COIL). Based on the chemical fluorescence method (CFM), a diagnostic system was set up to image electronically excited fluorescent I2(B3Π0) by means of a high speed camera. An optimized data analysis approach was proposed to analyze the mixing process of supersonic oxygen-iodine parallel streams, employing a set of qualitative and quantitative parameters and a proper percentage boundary threshold of the fluorescence zone. A slit nozzle bank with supersonic parallel streams and a trip tab set for enhancing the mixing process were designed and fabricated. With the diagnostic system and the data analysis approach, the performance of the trip tab set was examined and is demonstrated in this work. With the mixing enhancement, the fluorescence zone area was enlarged 3.75 times. We have studied the mixing process under different flow conditions and demonstrated the mixing properties with different iodine buffer gases, including N2, Ar, He and CO2. It was found that, among the four tested gases, Ar had the best penetration ability, whilst He showed the best free diffusion ability, and both of them could be well used as the buffer gas in our experiments. These experimental results can be useful for designing and optimizing COIL systems.

  1. Locality-Aware Parallel Process Mapping for Multi-Core HPC Systems

    SciTech Connect

    Hursey, Joshua J; Squyres, Jeffrey M.; Dontje, Terry

    2011-01-01

    High Performance Computing (HPC) systems are composed of servers containing an ever-increasing number of cores. With such high processor core counts, non-uniform memory access (NUMA) architectures are almost universally used to reduce inter-processor and memory communication bottlenecks by distributing processors and memory throughout a server-internal networking topology. Application studies have shown that tuning process placement in a server's NUMA networking topology to the application can have a dramatic impact on performance. The performance implications are magnified when running a parallel job across multiple server nodes, especially with large scale HPC applications. This paper presents the Locality-Aware Mapping Algorithm (LAMA) for distributing the individual processes of a parallel application across processing resources in an HPC system, paying particular attention to the internal server NUMA topologies. The algorithm is able to support both homogeneous and heterogeneous hardware systems, and dynamically adapts to the available hardware and user-specified process layout at run-time. As implemented in Open MPI, the LAMA provides 362,880 mapping permutations and is able to naturally scale out to additional hardware resources as they become available in future architectures.
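
    The figure of 362,880 equals 9!, the number of ways to order nine items, which is what one would expect if a mapping is chosen by ordering nine hardware levels. The abstract does not list the levels, so the sketch below uses three hypothetical ones (node, socket, core) to show how the chosen order changes which slot each process rank receives. It is not the Open MPI implementation.

```python
import math
from itertools import product


def map_ranks(level_sizes, order):
    """List hardware slots in the order that ranks 0, 1, 2, ... would fill them.

    level_sizes: dict mapping a level name to how many units exist at that level.
    order: level names; the first name is the level that varies fastest.
    """
    if sorted(order) != sorted(level_sizes):
        raise ValueError("order must name every level exactly once")
    # itertools.product varies its last argument fastest, so reverse the order.
    slow_to_fast = list(reversed(order))
    ranges = [range(level_sizes[name]) for name in slow_to_fast]
    return [dict(zip(slow_to_fast, combo)) for combo in product(*ranges)]


sizes = {"node": 2, "socket": 2, "core": 2}  # hypothetical small cluster

# Fill cores first: neighboring ranks share a socket.
packed = map_ranks(sizes, ("core", "socket", "node"))

# Fill nodes first: neighboring ranks land on different nodes.
spread = map_ranks(sizes, ("node", "socket", "core"))

orderings_of_nine_levels = math.factorial(9)
```

    With three levels there are 3! = 6 possible orderings; the same counting with nine levels gives the 362,880 permutations quoted in the abstract.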

  2. Parallel Optical Control of Spatiotemporal Neuronal Spike Activity Using High-Speed Digital Light Processing

    PubMed Central

    Jerome, Jason; Foehring, Robert C.; Armstrong, William E.; Spain, William J.; Heck, Detlef H.

    2011-01-01

    Neurons in the mammalian neocortex receive inputs from and communicate back to thousands of other neurons, creating complex spatiotemporal activity patterns. The experimental investigation of these parallel dynamic interactions has been limited due to the technical challenges of monitoring or manipulating neuronal activity at that level of complexity. Here we describe a new massively parallel photostimulation system that can be used to control action potential firing in in vitro brain slices with high spatial and temporal resolution while performing extracellular or intracellular electrophysiological measurements. The system uses digital light processing technology to generate 2-dimensional (2D) stimulus patterns with >780,000 independently controlled photostimulation sites that operate at high spatial (5.4 μm) and temporal (>13 kHz) resolution. Light is projected through the quartz–glass bottom of the perfusion chamber providing access to a large area (2.76 mm × 2.07 mm) of the slice preparation. This system has the unique capability to induce temporally precise action potential firing in large groups of neurons distributed over a wide area covering several cortical columns. Parallel photostimulation opens up new opportunities for the in vitro experimental investigation of spatiotemporal neuronal interactions at a broad range of anatomical scales. PMID:21904526

  3. FAST Observations of Acceleration Processes in the Cusp--Evidence for Parallel Electric Fields

    NASA Technical Reports Server (NTRS)

    Pfaff, R. F., Jr.; Carlson, C.; McFadden, J.; Ergun, R.; Clemmons, J.; Klumpar, D.; Strangeway, R.

    1999-01-01

    The existence of precipitating keV ions in the Earth's cusp originating at the magnetosheath provides a unique means to test our understanding of particle acceleration and parallel electric fields in the lower altitude acceleration region. On numerous occasions, the FAST (Fast Auroral Snapshot) spacecraft has encountered the Earth's cusp regions near its apogee of 4175 km, which are characterized by their signatures of dispersed keV ion injections. The FAST instruments also reveal a complex microphysics inherent to many, but not all, of the cusp regions encountered by the spacecraft, including upgoing ion beams and conics, inverted-V electrons, upgoing electron beams, and spiky DC-coupled electric fields and plasma waves. Detailed inspection of the FAST data often shows clear modulation of the precipitating magnetosheath ions, which indicates that they are affected by local electric potentials. For example, the magnetosheath ion precipitation is sometimes abruptly shut off precisely in regions where downgoing localized inverted-V electrons are observed. Such observations support the existence of a localized process, such as parallel electric fields, above the spacecraft that accelerates the electrons downward and consequently impedes the ion precipitation. Other acceleration events in the cusp are sometimes organized with an apparent cellular structure that suggests Alfven waves or other large-scale phenomena are controlling the localized potentials. We examine several cusp encounters by the FAST satellite where the modulation of energetic particle populations reveals evidence of localized acceleration, most likely by parallel electric fields.

  4. Mobile Monitoring Data Processing & Analysis Strategies

    EPA Science Inventory

    The development of portable, high-time resolution instruments for measuring the concentrations of a variety of air pollutants has made it possible to collect data while in motion. This strategy, known as mobile monitoring, involves mounting air sensors on a variety of different pla...

  5. Mobile Monitoring Data Processing and Analysis Strategies

    EPA Science Inventory

    The development of portable, high-time resolution instruments for measuring the concentrations of a variety of air pollutants has made it possible to collect data while in motion. This strategy, known as mobile monitoring, involves mounting air sensors on a variety of different pla...

  6. A Pervasive Parallel Processing Framework for Data Visualization and Analysis at Extreme Scale

    SciTech Connect

    Moreland, Kenneth; Geveci, Berk

    2014-11-01

    The evolution of the computing world from teraflop to petaflop has been relatively effortless, with several of the existing programming models scaling effectively to the petascale. The migration to exascale, however, poses considerable challenges. All industry trends indicate that the exascale machine will be built using processors containing hundreds to thousands of cores per chip. It can be inferred that efficient concurrency on exascale machines requires a massive number of concurrent threads, each performing many operations on a localized piece of data. Currently, visualization libraries and applications are based on what is known as the visualization pipeline. In the pipeline model, algorithms are encapsulated as filters with inputs and outputs. These filters are connected by setting the output of one component to the input of another. Parallelism in the visualization pipeline is achieved by replicating the pipeline for each processing thread. This works well for today’s distributed memory parallel computers but cannot be sustained when operating on processors with thousands of cores. Our project investigates a new visualization framework designed to exhibit the pervasive parallelism necessary for extreme scale machines. Our framework achieves this by defining algorithms in terms of worklets, which are localized stateless operations. Worklets are atomic operations that execute when invoked, unlike filters, which execute when a pipeline request occurs. The worklet design allows execution on a massive number of lightweight threads with minimal overhead. Only with such fine-grained parallelism can we hope to fill the billions of threads we expect will be necessary for efficient computation on an exascale machine.
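
    The worklet idea, a small stateless operation applied independently to each piece of data, can be sketched in a few lines. The code below is only a toy illustration of that programming model and does not reproduce the framework's actual API; magnitude_worklet and dispatch are hypothetical names, and a thread pool stands in for the lightweight threads of a many-core device.

```python
from concurrent.futures import ThreadPoolExecutor


def magnitude_worklet(vector):
    """Stateless per-element operation: depends only on its own input."""
    x, y, z = vector
    return (x * x + y * y + z * z) ** 0.5


def dispatch(worklet, field, max_workers=4):
    """Apply a worklet to every element of a field, preserving order.

    Because the worklet keeps no state, elements can be processed in any
    order and on any thread without synchronization.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worklet, field))


vectors = [(3.0, 4.0, 0.0), (0.0, 0.0, 2.0), (1.0, 2.0, 2.0)]
magnitudes = dispatch(magnitude_worklet, vectors)
```

    In CPython the thread pool will not speed up this arithmetic, so the example shows only the structure: the scheduler, not the algorithm author, decides how the per-element work is spread across threads.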

  7. A Theory of Interactive Parallel Processing: New Capacity Measures and Predictions for a Response Time Inequality Series

    ERIC Educational Resources Information Center

    Townsend, James T.; Wenger, Michael J.

    2004-01-01

    The authors present a theory of stochastic interactive parallel processing with special emphasis on channel interactions and their relation to system capacity. The approach is based both on linear systems theory augmented with stochastic elements and decisional operators and on a metatheory of parallel channels' dependencies that incorporates…

  8. Parallel Demand-Withdraw Processes in Family Therapy for Adolescent Drug Abuse

    PubMed Central

    Rynes, Kristina N.; Rohrbaugh, Michael J.; Lebensohn-Chialvo, Florencia; Shoham, Varda

    2013-01-01

    Isomorphism, or parallel process, occurs in family therapy when patterns of therapist-client interaction replicate problematic interaction patterns within the family. This study investigated parallel demand-withdraw processes in Brief Strategic Family Therapy (BSFT) for adolescent drug abuse, hypothesizing that therapist-demand/adolescent-withdraw interaction (TD/AW) cycles observed early in treatment would predict poor adolescent outcomes at follow-up for families who exhibited entrenched parent-demand/adolescent-withdraw interaction (PD/AW) before treatment began. Participants were 91 families who received at least 4 sessions of BSFT in a multi-site clinical trial on adolescent drug abuse (Robbins et al., 2011). Prior to receiving therapy, families completed videotaped family interaction tasks from which trained observers coded PD/AW. Another team of raters coded TD/AW during two early BSFT sessions. The main dependent variable was the number of drug use days that adolescents reported in Timeline Follow-Back interviews 7 to 12 months after family therapy began. Zero-inflated Poisson (ZIP) regression analyses supported the main hypothesis, showing that PD/AW and TD/AW interacted to predict adolescent drug use at follow-up. For adolescents in high PD/AW families, higher levels of TD/AW predicted significant increases in drug use at follow-up, whereas for low PD/AW families, TD/AW and follow-up drug use were unrelated. Results suggest that attending to parallel demand-withdraw processes in parent/adolescent and therapist/adolescent dyads may be useful in family therapy for substance-using adolescents. PMID:23438248

  9. Parallel demand-withdraw processes in family therapy for adolescent drug abuse.

    PubMed

    Rynes, Kristina N; Rohrbaugh, Michael J; Lebensohn-Chialvo, Florencia; Shoham, Varda

    2014-06-01

    Isomorphism, or parallel process, occurs in family therapy when patterns of therapist-client interaction replicate problematic interaction patterns within the family. This study investigated parallel demand-withdraw processes in brief strategic family therapy (BSFT) for adolescent drug abuse, hypothesizing that therapist-demand/adolescent-withdraw interaction (TD/AW) cycles observed early in treatment would predict poor adolescent outcomes at follow-up for families who exhibited entrenched parent-demand/adolescent-withdraw interaction (PD/AW) before treatment began. Participants were 91 families who received at least four sessions of BSFT in a multisite clinical trial on adolescent drug abuse (Robbins et al., 2011). Prior to receiving therapy, families completed videotaped family interaction tasks from which trained observers coded PD/AW. Another team of raters coded TD/AW during two early BSFT sessions. The main dependent variable was the number of drug-use days that adolescents reported in timeline follow-back interviews 7 to 12 months after family therapy began. Zero-inflated Poisson regression analyses supported the main hypothesis, showing that PD/AW and TD/AW interacted to predict adolescent drug use at follow-up. For adolescents in high PD/AW families, higher levels of TD/AW predicted significant increases in drug use at follow-up, whereas for low PD/AW families, TD/AW and follow-up drug use were unrelated. Results suggest that attending to parallel demand-withdraw processes in parent-adolescent and therapist-adolescent dyads may be useful in family therapy for substance-using adolescents.

  10. On the Control of Automatic Processes: A Parallel Distributed Processing Account of the Stroop Effect.

    ERIC Educational Resources Information Center

    Cohen, Jonathan D.; And Others

    1990-01-01

    It is proposed that attributes of automatization depend on the strength of a processing pathway, and that strength increases with training. With the Stroop effect as an example, automatic processes are shown through simulation to be continuous and to emerge gradually with practice. (SLD)

  11. Strong Asymmetric Coupling of Two Parallel Exclusion Processes: Effect of Unequal Injection Rates

    NASA Astrophysics Data System (ADS)

    Xiao, Song; Dong, Peng; Zhang, Yingjie; Liu, Yanna

    2016-03-01

    This letter investigates the strong asymmetric coupling of two parallel exclusion processes, focusing on the effect of unequal injection rates. It generalizes the work of Xiao et al. (Phys. Lett. A 8, 374 (2009)), in which the particles only move on two lanes with rate 1 toward the right. Diverse phase diagrams and density profiles of the system are obtained. The vertical cluster mean-field approach and extensive Monte Carlo simulations are used to study the system, and theoretical predictions are in excellent agreement with simulation results.
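
    The abstract does not give the coupling rules between the two lanes, so the sketch below simulates only a single open-boundary exclusion lane with random-sequential Monte Carlo updates, the building block such two-lane models extend. Particles enter at the left with probability alpha, hop right with rate 1 when the next site is empty, and leave at the right with probability beta. The function name and parameter values are hypothetical.

```python
import random


def simulate_tasep(length, alpha, beta, sweeps, warmup, seed=0):
    """Return the time-averaged density profile of a single-lane TASEP.

    Simplified illustration: one lane only, no lane-changing rules.
    """
    rng = random.Random(seed)
    lattice = [0] * length
    density_sum = [0.0] * length
    for sweep in range(warmup + sweeps):
        # One sweep = length + 1 randomly chosen update attempts.
        for _ in range(length + 1):
            bond = rng.randrange(length + 1)  # 0 = entry, length = exit
            if bond == 0:
                if lattice[0] == 0 and rng.random() < alpha:
                    lattice[0] = 1  # inject a particle at the left end
            elif bond == length:
                if lattice[-1] == 1 and rng.random() < beta:
                    lattice[-1] = 0  # remove a particle at the right end
            elif lattice[bond - 1] == 1 and lattice[bond] == 0:
                # Hop one site to the right if the target is empty.
                lattice[bond - 1], lattice[bond] = 0, 1
        if sweep >= warmup:
            for i, occupied in enumerate(lattice):
                density_sum[i] += occupied
    return [total / sweeps for total in density_sum]


profile = simulate_tasep(length=100, alpha=0.2, beta=0.8, sweeps=2000, warmup=1000)
bulk_density = sum(profile[25:75]) / 50
```

    For a low injection rate and a high exit rate, the bulk density of a single lane is expected to sit close to alpha, which gives a quick sanity check before adding a second lane and coupling rates.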

  12. Percolation and anomalous transport as tools in analyzing parallel processing interconnection networks

    SciTech Connect

    McLeod, R.D.; Schellenberg, J.J.; Hortensius, P.D.

    1990-04-01

    It is quite apparent that many future advances in computation will be derived through the exploitation of parallel processing. Although a wide variety of topologies have been studied and proposed for both general-purpose and algorithm-specific applications, there is still considerable discussion over which architectures are better and why. In this paper the authors discuss the application of percolation theory and anomalous transport to the issues of defective computer arrays. Percolation theory is used to discuss the static properties of the defective arrays, and anomalous transport theory is used to discuss the dynamics of message passing on the defective array.
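
    The static question that percolation theory addresses for a defective array can be made concrete: given a grid in which some processors have failed, is there still a connected path of working processors from one side to the other? The sketch below is a plain breadth-first search on a rectangular mesh with nearest-neighbor links, which is an assumed topology and not necessarily one studied in the paper. The function name is hypothetical.

```python
from collections import deque


def spans_left_to_right(working):
    """Return True if working processors connect the left and right edges.

    working: list of rows, each a list of booleans (True = processor works).
    Links are assumed between horizontal and vertical neighbors only.
    """
    rows, cols = len(working), len(working[0])
    seen = {(r, 0) for r in range(rows) if working[r][0]}
    queue = deque(seen)
    while queue:
        r, c = queue.popleft()
        if c == cols - 1:
            return True  # reached the right edge
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and working[nr][nc] and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


T, F = True, False
connected_array = [
    [T, F, F, F],
    [T, T, T, F],
    [F, F, T, T],
]
blocked_array = [
    [T, T, F, T],
    [T, F, F, T],
    [T, T, F, T],
]
```

    Repeating this test over many randomly generated defect patterns at a fixed failure probability would estimate how often a message can still cross the array, which is the kind of static property the abstract attributes to percolation theory.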

  13. Adventures in Parallel Processing: Entry, Descent and Landing Simulation for the Genesis and Stardust Missions

    NASA Technical Reports Server (NTRS)

    Lyons, Daniel T.; Desai, Prasun N.

    2005-01-01

    This paper will describe the Entry, Descent and Landing simulation tradeoffs and techniques that were used to provide the Monte Carlo data required to approve entry during a critical period just before entry of the Genesis Sample Return Capsule. The same techniques will be used again when Stardust returns on January 15, 2006. Only one hour was available for the simulation which propagated 2000 dispersed entry states to the ground. Creative simulation tradeoffs combined with parallel processing were needed to provide the landing footprint statistics that were an essential part of the Go/NoGo decision that authorized release of the Sample Return Capsule a few hours before entry.

  14. Comprehensive massive parallel DNA sequencing strategy for the genetic diagnosis of the neuro-cardio-facio-cutaneous syndromes.

    PubMed

    Justino, Ana; Dias, Patrícia; João Pina, Maria; Sousa, Sónia; Cirnes, Luís; Berta Sousa, Ana; Carlos Machado, José; Costa, José Luis

    2015-03-01

    Variants in 11 genes of the RAS/MAPK signaling pathway have been causally linked to the neuro-cardio-facio-cutaneous syndromes group (NCFCS). Recently, A2ML1 and RIT1 were also associated with these syndromes. Because of the genetic and clinical heterogeneity of NCFCS, it is challenging to define strategies for their molecular diagnosis. The aim of this study was to develop and validate a massive parallel sequencing (MPS)-based strategy for the molecular diagnosis of NCFCS. A multiplex PCR-based strategy for the enrichment of the 13 genes and a variant prioritization pipeline were established. Two sets of genomic DNA samples were studied using the Ion PGM System: (1) a training set (n = 15) to optimize the strategy and (2) a validation set (n = 20) to validate and evaluate the power of the new methodology. Sanger sequencing was performed to confirm all variants and low-coverage regions. All variants identified by Sanger sequencing were detected with our MPS approach. The methodology resulted in an experimental approach with a specificity of 99.0% and a maximum analytical sensitivity of ≥ 98.2% with a confidence of 99%. Importantly, two patients (out of 20) harbored described disease-causing variants in genes that are not routinely tested (RIT1 and SHOC2). The addition of less frequently altered genes increased the diagnostic yield by ≈ 10% relative to the strategy currently used. The presented workflow provides a comprehensive genetic screening strategy for patients with NCFCS in a fast and cost-efficient manner. This approach demonstrates the potential of a combined MPS-Sanger sequencing-based strategy as an effective diagnostic tool for heterogeneous diseases.

  15. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics.

    PubMed

    Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter

    2015-01-01

    While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/. PMID:25600152
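
    An illustrative sketch of what a balanced, deterministic regional split could look like. This is not Churchill's actual algorithm, which the abstract does not detail. It assumes hypothetical chromosome names and lengths and simply cuts the concatenated genome into regions of near-equal size, which may span chromosome boundaries, so that each worker receives a similar amount of work and the same input always yields the same split.

```python
def balanced_regions(chrom_lengths, n_regions):
    """Split a genome into n_regions of near-equal total length.

    chrom_lengths: ordered list of (name, length) pairs (hypothetical values).
    Returns a list of regions; each region is a list of half-open
    (name, start, end) intervals, since a region may span a chromosome boundary.
    """
    total = sum(length for _, length in chrom_lengths)
    # Deterministic cut points on the concatenated genome coordinate.
    cuts = [total * i // n_regions for i in range(n_regions + 1)]
    regions = []
    for lo, hi in zip(cuts, cuts[1:]):
        pieces, offset = [], 0
        for name, length in chrom_lengths:
            start = max(lo, offset)
            end = min(hi, offset + length)
            if start < end:  # this chromosome overlaps the region
                pieces.append((name, start - offset, end - offset))
            offset += length
        regions.append(pieces)
    return regions


genome = [("chrA", 1000), ("chrB", 600), ("chrC", 400)]  # toy lengths
regions = balanced_regions(genome, 4)
for index, region in enumerate(regions):
    print(index, region)
```

    Each region could then be handed to an independent worker, with the per-region results merged in region order so the final output does not depend on which worker finishes first.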

  16. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics.

    PubMed

    Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter

    2015-01-20

    While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.

  17. Leveraging human oversight and intervention in large-scale parallel processing of open-source data

    NASA Astrophysics Data System (ADS)

    Casini, Enrico; Suri, Niranjan; Bradshaw, Jeffrey M.

    2015-05-01

    The popularity of cloud computing, along with the increased availability of cheap storage, has made it necessary to process and transform large volumes of open-source data in parallel. One way to handle such extensive volumes of information properly is to take advantage of distributed computing frameworks like Map-Reduce. Unfortunately, an entirely automated approach that excludes human intervention is often unpredictable and error prone. Highly accurate data processing and decision-making can be achieved by supporting an automatic process through human collaboration, in a variety of environments such as warfare, cyber security and threat monitoring. Although this mutual participation seems easily exploitable, human-machine collaboration in the field of data analysis presents several challenges. First, due to the asynchronous nature of human intervention, it is necessary to verify that once a correction is made, all the necessary reprocessing is carried out down the processing chain. Second, it is often necessary to minimize the amount of reprocessing in order to optimize the use of limited resources. To address these strict requirements, this paper introduces improvements to an innovative approach for human-machine collaboration in the parallel processing of large amounts of open-source data.
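
    The two requirements above (reprocess everything that depends on a correction, but nothing more) can be sketched as a walk over a dependency graph. This is a hedged illustration, not the system described in the paper: the stage names and the `CONSUMERS` graph are hypothetical, and the ordering step is a standard topological sort (Kahn's algorithm).

```python
from collections import deque

# Hypothetical pipeline: each stage maps to the stages that consume its output.
CONSUMERS = {
    "ingest": ["translate", "extract_entities"],
    "translate": ["summarize"],
    "extract_entities": ["link_entities"],
    "link_entities": ["summarize"],
    "summarize": [],
}


def stages_to_reprocess(consumers, corrected):
    """Return only the stages downstream of `corrected`, in a valid run order."""
    # 1. Collect every stage reachable from the corrected one.
    affected, queue = set(), deque(consumers[corrected])
    while queue:
        stage = queue.popleft()
        if stage not in affected:
            affected.add(stage)
            queue.extend(consumers[stage])
    # 2. Order them topologically, counting only edges between affected stages.
    indegree = {stage: 0 for stage in affected}
    for stage in affected:
        for consumer in consumers[stage]:
            indegree[consumer] += 1
    ready = deque(sorted(s for s in affected if indegree[s] == 0))
    order = []
    while ready:
        stage = ready.popleft()
        order.append(stage)
        for consumer in consumers[stage]:
            indegree[consumer] -= 1
            if indegree[consumer] == 0:
                ready.append(consumer)
    return order


# A human corrects the output of the entity extraction stage.
print(stages_to_reprocess(CONSUMERS, "extract_entities"))
```

    In this toy graph, correcting `extract_entities` leaves `translate` untouched, which is the kind of saving the second requirement asks for.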

  18. Target-specific IPSC kinetics promote temporal processing in auditory parallel pathways

    PubMed Central

    Xie, Ruili; Manis, Paul B.

    2013-01-01

    The acoustic environment contains biologically relevant information on time scales from microseconds to tens of seconds. The auditory brainstem nuclei process this temporal information through parallel pathways that originate in the cochlear nucleus from different classes of cells. While the roles of ion channels and excitatory synapses in temporal processing have been well studied, the contribution of inhibition is less well understood. Here, we show in CBA/CaJ mice that the two major projection neurons of the ventral cochlear nucleus, the bushy and T-stellate cells, receive glycinergic inhibition with different synaptic conductance time courses. Bushy cells, which provide precisely timed spike trains used in sound localization and pitch identification, receive slow inhibitory inputs. In contrast, T-stellate cells, which encode slower envelope information, receive inhibition that is eight-fold faster. Both types of inhibition improved the precision of spike timing, but engage different cellular mechanisms and operate on different time scales. Computer models reveal that slow IPSCs in bushy cells can improve spike timing on the scale of tens of microseconds. While fast and slow IPSCs in T-stellate cells improve spike timing on the scale of milliseconds, only fast IPSCs can enhance the detection of narrowband acoustic signals in a complex background. Our results suggest that target-specific IPSC kinetics are critical for the segregated parallel processing of temporal information from the sensory environment. PMID:23345233

  19. Recurrent modification of floral morphology in heterantherous Solanum reveals a parallel shift in reproductive strategy.

    PubMed

    Vallejo-Marín, Mario; Walker, Catriona; Friston-Reilly, Philip; Solís-Montero, Lislie; Igic, Boris

    2014-08-19

    Floral morphology determines the pattern of pollen transfer within and between individuals. In hermaphroditic species, the spatial arrangement of sexual organs influences the rate of self-pollination as well as the placement of pollen in different areas of the pollinator's body. Studying the evolutionary modification of floral morphology in closely related species offers an opportunity to investigate the causes and consequences of floral variation. Here, we investigate the recurrent modification of flower morphology in three closely related pairs of taxa in Solanum section Androceras (Solanaceae), a group characterized by the presence of two morphologically distinct types of anthers in the same flower (heteranthery). We use morphometric analyses of plants grown in a common garden to characterize and compare the changes in floral morphology observed in parallel evolutionary transitions from relatively larger to smaller flowers. Our results indicate that the transition to smaller flowers is associated with a reduction in the spatial separation of anthers and stigma, changes in the allometric relationships among floral traits, shifts in pollen allocation to the two anther morphs and reduced pollen : ovule ratios. We suggest that floral modification in this group reflects parallel evolution towards increased self-fertilization and discuss potential selective scenarios that may favour this recurrent shift in floral morphology and function.

  20. Recurrent modification of floral morphology in heterantherous Solanum reveals a parallel shift in reproductive strategy

    PubMed Central

    Vallejo-Marín, Mario; Walker, Catriona; Friston-Reilly, Philip; Solís-Montero, Lislie; Igic, Boris

    2014-01-01

    Floral morphology determines the pattern of pollen transfer within and between individuals. In hermaphroditic species, the spatial arrangement of sexual organs influences the rate of self-pollination as well as the placement of pollen in different areas of the pollinator's body. Studying the evolutionary modification of floral morphology in closely related species offers an opportunity to investigate the causes and consequences of floral variation. Here, we investigate the recurrent modification of flower morphology in three closely related pairs of taxa in Solanum section Androceras (Solanaceae), a group characterized by the presence of two morphologically distinct types of anthers in the same flower (heteranthery). We use morphometric analyses of plants grown in a common garden to characterize and compare the changes in floral morphology observed in parallel evolutionary transitions from relatively larger to smaller flowers. Our results indicate that the transition to smaller flowers is associated with a reduction in the spatial separation of anthers and stigma, changes in the allometric relationships among floral traits, shifts in pollen allocation to the two anther morphs and reduced pollen : ovule ratios. We suggest that floral modification in this group reflects parallel evolution towards increased self-fertilization and discuss potential selective scenarios that may favour this recurrent shift in floral morphology and function. PMID:25002701

  1. Parenting and the parallel processes in parents' counseling supervision for eating-related problems.

    PubMed

    Golan, Moria

    2014-04-01

    This paper presents an integrative model for supervising counselors of parents who face eating-related problems in their families. The model is grounded in the theory of parallel processes which occur during the supervision of health-care professionals as well as the counseling of parents and patients. The aim of this model is to conceptualize components and processes in the supervision space, in order to: (a) create a nurturing environment for health-care facilitators, parents and children, (b) better understand the complex and difficult nature of parenting, the challenge counselors face, and the skills and practices used in parenting and in counseling, and (c) better own practices and oppose the judgment that often dominates in counseling and supervision. This paper reflects upon the tradition of supervision and offers a comprehensive view of this process, including its challenges, skills and practices.

  2. Towards a Standard Mixed-Signal Parallel Processing Architecture for Miniature and Microrobotics

    PubMed Central

    Sadler, Brian M; Hoyos, Sebastian

    2014-01-01

    The conventional analog-to-digital conversion (ADC) and digital signal processing (DSP) architecture has led to major advances in miniature and micro-systems technology over the past several decades. The outlook for these systems is significantly enhanced by advances in sensing, signal processing, communications and control, and the combination of these technologies enables autonomous robotics on the miniature to micro scales. In this article we look at trends in the combination of analog and digital (mixed-signal) processing, and consider a generalized sampling architecture. Employing a parallel analog basis expansion of the input signal, this scalable approach is adaptable and reconfigurable, and is suitable for a large variety of current and future applications in networking, perception, cognition, and control. PMID:26601042

  3. Parallel photonic information processing at gigabyte per second data rates using transient states

    PubMed Central

    Brunner, Daniel; Soriano, Miguel C.; Mirasso, Claudio R.; Fischer, Ingo

    2013-01-01

    The increasing demands on information processing require novel computational concepts and true parallelism. Nevertheless, hardware realizations of unconventional computing approaches never exceeded a marginal existence. While the application of optics in super-computing receives reawakened interest, new concepts, partly neuro-inspired, are being considered and developed. Here we experimentally demonstrate the potential of a simple photonic architecture to process information at unprecedented data rates, implementing a learning-based approach. A semiconductor laser subject to delayed self-feedback and optical data injection is employed to solve computationally hard tasks. We demonstrate simultaneous spoken digit and speaker recognition and chaotic time-series prediction at data rates beyond 1 Gbyte/s. We identify all digits with very low classification errors and perform chaotic time-series prediction with 10% error. Our approach bridges the areas of photonic information processing, cognitive and information science. PMID:23322052

  4. MC64-ClustalWP2: A Highly-Parallel Hybrid Strategy to Align Multiple Sequences in Many-Core Architectures

    PubMed Central

    Díaz, David; Esteban, Francisco J.; Hernández, Pilar; Caballero, Juan Antonio; Guevara, Antonio

    2014-01-01

    We have developed MC64-ClustalWP2 as a new implementation of the Clustal W algorithm, integrating a novel parallelization strategy and significantly increasing the performance when aligning long sequences in architectures with many cores. It must be stressed that in such a process, the detailed analysis of both the software and hardware features and peculiarities is of paramount importance to reveal key points to exploit and optimize the full potential of parallelism in many-core CPU systems. The new parallelization approach has focused on the most time-consuming stages of this algorithm. In particular, the performance of the so-called progressive alignment has improved drastically, due to a fine-grained approach in which the forward and backward loops were unrolled and parallelized. Another key approach has been the implementation of the new algorithm in a hybrid-computing system, integrating both an Intel Xeon multi-core CPU and a Tilera Tile64 many-core card. A comparison with other Clustal W implementations reveals the high performance of the new algorithm and strategy in many-core CPU architectures, in a scenario where the sequences to align are relatively long (more than 10 kb) and, hence, many-core GPU hardware cannot be used. Thus, MC64-ClustalWP2 runs multiple alignments more than 18x faster than the original Clustal W algorithm, and more than 7x faster than the best x86 parallel implementation to date, and is publicly available through a web service. Besides, these developments have been deployed in cost-effective personal computers and should be useful for life-science researchers, including the identification of identities and differences for mutation/polymorphism analyses, biodiversity and evolutionary studies and for the development of molecular markers for paternity testing, germplasm management and protection, to assist breeding, illegal traffic control, fraud prevention and for the protection of the intellectual property (identification

  5. Parallelized multi-graphics processing unit framework for high-speed Gabor-domain optical coherence microscopy.

    PubMed

    Tankam, Patrice; Santhanam, Anand P; Lee, Kye-Sung; Won, Jungeun; Canavesi, Cristina; Rolland, Jannick P

    2014-07-01

    Gabor-domain optical coherence microscopy (GD-OCM) is a volumetric high-resolution technique capable of acquiring three-dimensional (3-D) skin images with histological resolution. Real-time image processing is needed to enable GD-OCM imaging in a clinical setting. We present a parallelized and scalable multi-graphics processing unit (GPU) computing framework for real-time GD-OCM image processing. A parallelized control mechanism was developed to individually assign computation tasks to each of the GPUs. For each GPU, the optimal number of amplitude-scans (A-scans) to be processed in parallel was selected to maximize GPU memory usage and core throughput. We investigated five computing architectures for computational speed-up in processing 1000×1000 A-scans. The proposed parallelized multi-GPU computing framework enables processing at a computational speed faster than the GD-OCM image acquisition, thereby facilitating high-speed GD-OCM imaging in a clinical setting. Using two parallelized GPUs, the image processing of a 1×1×0.6 mm³ skin sample was performed in about 13 s, and the performance was benchmarked at 6.5 s with four GPUs. This work thus demonstrates that 3-D GD-OCM data may be displayed in real-time to the examiner using parallelized GPU processing.

  6. Parallelized multi–graphics processing unit framework for high-speed Gabor-domain optical coherence microscopy

    PubMed Central

    Tankam, Patrice; Santhanam, Anand P.; Lee, Kye-Sung; Won, Jungeun; Canavesi, Cristina; Rolland, Jannick P.

    2014-01-01

    Gabor-domain optical coherence microscopy (GD-OCM) is a volumetric high-resolution technique capable of acquiring three-dimensional (3-D) skin images with histological resolution. Real-time image processing is needed to enable GD-OCM imaging in a clinical setting. We present a parallelized and scalable multi-graphics processing unit (GPU) computing framework for real-time GD-OCM image processing. A parallelized control mechanism was developed to individually assign computation tasks to each of the GPUs. For each GPU, the optimal number of amplitude-scans (A-scans) to be processed in parallel was selected to maximize GPU memory usage and core throughput. We investigated five computing architectures for computational speed-up in processing 1000×1000 A-scans. The proposed parallelized multi-GPU computing framework enables processing at a computational speed faster than the GD-OCM image acquisition, thereby facilitating high-speed GD-OCM imaging in a clinical setting. Using two parallelized GPUs, the image processing of a 1×1×0.6 mm³ skin sample was performed in about 13 s, and the performance was benchmarked at 6.5 s with four GPUs. This work thus demonstrates that 3-D GD-OCM data may be displayed in real-time to the examiner using parallelized GPU processing. PMID:24695868

  7. Development of Parallel Learning Strategies Curricula Using Videodisc and Standard Off-Line Formats. Final Report.

    ERIC Educational Resources Information Center

    Ramsberger, Peter F.; And Others

    One of a series of developmental projects that have produced and evaluated applications of an advanced multimedia, computer-based technology for basic skills education for Army enlisted personnel, this project focused on the development of an integrated curriculum to teach learning strategies and problem-solving skills. The first task was to…

  8. A cortical architecture on parallel hardware for motion processing in real time.

    PubMed

    Pauwels, Karl; Krüger, Norbert; Lappe, Markus; Wörgötter, Florentin; Van Hulle, Marc M

    2010-01-01

    Walking through a crowd or driving on a busy street requires monitoring your own movement and that of others. The segmentation of these other, independently moving, objects is one of the most challenging tasks in vision as it requires fast and accurate computations for the disentangling of independent motion from egomotion, often in cluttered scenes. This is accomplished in our brain by the dorsal visual stream relying on heavy parallel-hierarchical processing across many areas. This study is the first to utilize the potential of such design in an artificial vision system. We emulate large parts of the dorsal stream in an abstract way and implement an architecture with six interdependent feature extraction stages (e.g., edges, stereo, optical flow, etc.). The computationally highly demanding combination of these features is used to reliably extract moving objects in real time. In this way, by utilizing the advantages of parallel-hierarchical design, we arrive at a novel and powerful artificial vision system that approaches the richness, speed, and accuracy of visual processing in biological systems.

  10. Distributed representation of social odors indicates parallel processing in the antennal lobe of ants.

    PubMed

    Brandstaetter, Andreas Simon; Kleineidam, Christoph Johannes

    2011-11-01

    In colonies of eusocial Hymenoptera cooperation is organized through social odors, and particularly ants rely on a sophisticated odor communication system. Neuronal information about odors is represented in spatial activity patterns in the primary olfactory neuropile of the insect brain, the antennal lobe (AL), which is analogous to the vertebrate olfactory bulb. The olfactory system is characterized by neuroanatomical compartmentalization, yet the functional significance of this organization is unclear. Using two-photon calcium imaging, we investigated the neuronal representation of multicomponent colony odors, which the ants assess to discriminate friends (nestmates) from foes (nonnestmates). In the carpenter ant Camponotus floridanus, colony odors elicited spatial activity patterns distributed across different AL compartments. Activity patterns in response to nestmate and nonnestmate colony odors were overlapping. This was expected since both consist of the same components at differing ratios. Colony odors change over time and the nervous system has to constantly adjust for this (template reformation). Measured activity patterns were variable, and variability was higher in response to repeated nestmate than to repeated nonnestmate colony odor stimulation. Variable activity patterns may indicate neuronal plasticity within the olfactory system, which is necessary for template reformation. Our results indicate that information about colony odors is processed in parallel in different neuroanatomical compartments, using the computational power of the whole AL network. Parallel processing might be advantageous, allowing reliable discrimination of highly complex social odors. PMID:21849606

  11. Parallel processing and image analysis in the eyes of mantis shrimps.

    PubMed

    Cronin, T W; Marshall, J

    2001-04-01

    The compound eyes of mantis shrimps, a group of tropical marine crustaceans, incorporate principles of serial and parallel processing of visual information that may be applicable to artificial imaging systems. Their eyes include numerous specializations for analysis of the spectral and polarizational properties of light, and include more photoreceptor classes for analysis of ultraviolet light, color, and polarization than occur in any other known visual system. This is possible because receptors in different regions of the eye are anatomically diverse and incorporate unusual structural features, such as spectral filters, not seen in other compound eyes. Unlike eyes of most other animals, eyes of mantis shrimps must move to acquire some types of visual information and to integrate color and polarization with spatial vision. Information leaving the retina appears to be processed into numerous parallel data streams leading into the central nervous system, greatly reducing the analytical requirements at higher levels. Many of these unusual features of mantis shrimp vision may inspire new sensor designs for machine vision. PMID:11341580

  12. A New Parallel Processing Scheme Enabling Full Monte Carlo EAS Simulation in the GZK Energy Region

    NASA Astrophysics Data System (ADS)

    Kasahara, K.; Cohen, F.

    We developed a new parallel processing method enabling full Monte Carlo (M.C.) EAS simulation (say, with a minimum energy of 500 keV) without using thin sampling, even at 10^19 eV. Normally, distributed-parallel processing needs specific software, and programs must be organized to match such a system. During the computation, such a scheme also requires complex communications among many computer hosts. Our scheme first creates a skeleton of a shower, smashes it into n pieces, and distributes the pieces to n CPUs to flesh them out. After each piece is completely fleshed out, the pieces are assembled to make a complete picture of the shower. Thus, no communication is needed during the computation. With n = 50, a 10^19 eV shower can be simulated in ~10 days. For a 10^20 eV shower, we may randomly sample a fraction of the n pieces (say, 100 for n = 1000) and safely reconstruct the whole picture of the shower. The scheme does not use any weight on each particle and is very stable. The scheme has been implemented in the Cosmos code. To produce a number of showers with full fluctuations, we have also developed a new method which utilizes the present result. The latter is used for the TA experiment and is described in an accompanying paper.
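
    A toy sketch of the skeleton-and-flesh pattern described above, under loudly stated assumptions: the "skeleton" is just a list of hypothetical seed energies, "fleshing" is a placeholder cascade that halves each energy down to a cutoff and counts the final particles, and the sampled variant rescales by n divided by the number of sampled pieces, which is an assumption of this sketch rather than something the abstract specifies. None of this reproduces the Cosmos code.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def flesh(seed_energies, e_min=1.0):
    """Toy fleshing: halve each seed energy until below 2*e_min; count final particles."""
    count, stack = 0, list(seed_energies)
    while stack:
        energy = stack.pop()
        if energy < 2 * e_min:
            count += 1
        else:
            stack.extend((energy / 2, energy / 2))
    return count


def split_skeleton(seeds, n):
    """Smash the skeleton into n pieces (round-robin assignment)."""
    return [seeds[i::n] for i in range(n)]


def simulate(seeds, n, sample=None, rng_seed=0):
    """Flesh pieces independently (no communication), then assemble the result.

    If `sample` is given, only that many randomly chosen pieces are fleshed
    and the total is rescaled by n / sample (assumed estimator).
    """
    pieces = split_skeleton(seeds, n)
    if sample is not None:
        pieces = random.Random(rng_seed).sample(pieces, sample)
    with ThreadPoolExecutor() as pool:
        counts = list(pool.map(flesh, pieces))
    return sum(counts) * (n / len(pieces))


rng = random.Random(1)
skeleton = [2.0 ** rng.randint(0, 6) for _ in range(1000)]  # hypothetical seeds
full = simulate(skeleton, n=50)
estimate = simulate(skeleton, n=50, sample=10)
print(full, estimate)
```

    Because the pieces never exchange data, the workers could equally be separate hosts that each write a result file to be assembled later.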

  13. Research on B Cell Algorithm for Learning to Rank Method Based on Parallel Strategy

    PubMed Central

    Tian, Yuling; Zhang, Hongxian

    2016-01-01

    For the purposes of information retrieval, users must find highly relevant documents from within a system (often a quite large one comprising many individual documents) based on an input query. Ranking the documents according to their relevance within the system to meet user needs is a challenging endeavor and a hot research topic; several rank-learning methods based on machine learning techniques, which can generate ranking functions automatically, already exist. This paper proposes a parallel B cell algorithm, RankBCA, for rank learning which utilizes a clonal selection mechanism based on biological immunity. The novel algorithm is compared with traditional rank-learning algorithms through experimentation and shown to outperform the others with respect to accuracy, learning time, and convergence rate; taken together, the experimental results show that the proposed algorithm indeed effectively and rapidly identifies optimal ranking functions. PMID:27487242

  14. Research on B Cell Algorithm for Learning to Rank Method Based on Parallel Strategy.

    PubMed

    Tian, Yuling; Zhang, Hongxian

    2016-01-01

    For the purposes of information retrieval, users must find highly relevant documents from within a system (often a quite large one comprising many individual documents) based on an input query. Ranking the documents according to their relevance within the system to meet user needs is a challenging endeavor and a hot research topic; several rank-learning methods based on machine learning techniques, which can generate ranking functions automatically, already exist. This paper proposes a parallel B cell algorithm, RankBCA, for rank learning which utilizes a clonal selection mechanism based on biological immunity. The novel algorithm is compared with traditional rank-learning algorithms through experimentation and shown to outperform the others with respect to accuracy, learning time, and convergence rate; taken together, the experimental results show that the proposed algorithm indeed effectively and rapidly identifies optimal ranking functions. PMID:27487242

  16. FPGA implementation of current-sharing strategy for parallel-connected SEPICs

    NASA Astrophysics Data System (ADS)

    Ezhilarasi, A.; Ramaswamy, M.

    2016-01-01

    This work develops an equal current-sharing algorithm for a number of single-ended primary inductance converters (SEPICs) connected in parallel. The methodology involves the development of a state-space model to predict the condition for the existence of a stable equilibrium portrait. The algorithm takes the role of a variable structure controller to guide the trajectory, with a view to circumventing the circuit non-linearities and arriving at stable performance over a preferred operating range. The design yields acceptable servo and regulatory characteristics and the desired time response, and ensures regulation of the load voltage. The simulation results, validated through a field programmable gate array-based prototype, serve to illustrate its suitability for present-day applications.

  17. Improving Learning Processes: Principles, Strategies and Techniques.

    ERIC Educational Resources Information Center

    Cox, Philip

    This guide, which examines the relationship between learning processes and learning outcomes, is aimed at senior managers, quality managers, and others at colleges and other post-16 learning providers in the United Kingdom. It is intended to help them define the key processes undertaken by learning providers, understand the critical relationships…

  18. Efficient Process Migration for Parallel Processing on Non-Dedicated Networks of Workstations

    NASA Technical Reports Server (NTRS)

    Chanchio, Kasidit; Sun, Xian-He

    1996-01-01

    This paper presents the design and preliminary implementation of MpPVM, a software system that supports process migration for PVM application programs in a non-dedicated heterogeneous computing environment. New concepts of migration point as well as migration point analysis and necessary data analysis are introduced. In MpPVM, process migrations occur only at previously inserted migration points. Migration point analysis determines appropriate locations to insert migration points, whereas necessary data analysis provides a minimum set of variables to be transferred at each migration point. A new methodology to perform reliable point-to-point data communications in a migration environment is also discussed. Finally, a preliminary implementation of MpPVM and its experimental results are presented, showing the correctness and promising performance of our process migration mechanism in a scalable non-dedicated heterogeneous computing environment. While MpPVM is developed on top of PVM, the process migration methodology introduced in this study is general and can be applied to any distributed software environment.
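
    The two concepts above can be illustrated with a small sketch that does not use PVM or any MpPVM interface; all names are hypothetical. The loop may only be interrupted at a migration point (here, every `migration_interval` iterations), and at that point only the variables needed to resume (`i` and `acc`) are serialized, standing in for the "necessary data" set.

```python
import pickle


def run(state, n_steps, migrate_requested, migration_interval=100):
    """Sum of squares with migration points every `migration_interval` steps.

    Returns ("done", result) or ("migrate", serialized_minimal_state).
    """
    i, acc = state["i"], state["acc"]
    while i < n_steps:
        acc += i * i
        i += 1
        # Migration point: the only place where the process may be moved.
        if i % migration_interval == 0 and migrate_requested(i):
            # Necessary data: just the live variables needed to resume.
            return "migrate", pickle.dumps({"i": i, "acc": acc})
    return "done", acc


# "Host A" starts the work and is asked to give it up at step 300.
status, payload = run({"i": 0, "acc": 0}, 1000, lambda step: step == 300)
print(status, len(payload), "bytes transferred")

# "Host B" resumes from the transferred state and finishes.
status, result = run(pickle.loads(payload), 1000, lambda step: False)
print(status, result)
```

    Restricting migration to known points is what keeps the transferred state this small: temporaries that are dead at the migration point never need to be shipped.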

  19. Investigation of Mediational Processes Using Parallel Process Latent Growth Curve Modeling.

    ERIC Educational Resources Information Center

    Cheong, JeeWon; MacKinnon, David P.; Khoo, Siek Toon

    2003-01-01

    Investigated a method to evaluate mediational processes using latent growth curve modeling and tested it with empirical data from a longitudinal steroid use prevention program focusing on 1,506 high school football players over 4 years. Findings suggest the usefulness of the approach. (SLD)

  20. Parallel distributed processing and neuropsychology: a neural network model of Wisconsin Card Sorting and verbal fluency.

    PubMed

    Parks, R W; Levine, D S; Long, D L; Crockett, D J; Dalton, I E; Weingartner, H; Fedio, P; Coburn, K L; Siler, G; Matthews, J R

    1992-06-01

    Neural networks can be used as a tool in the explanation of neuropsychological data. Using the Hebbian Learning Rule and other such principles as competition and modifiable interlevel feedback, researchers have successfully modeled a widely used neuropsychological test, the Wisconsin Card Sorting Test. One of these models is reviewed here and extended to a qualitative analysis of how verbal fluency might be modeled, which demonstrates the importance of accounting for the attentional components of both tests. Difficulties remain in programming sequential cognitive processes within a parallel distributed processing (PDP) framework and integrating exceedingly complex neuropsychological tests such as Proverbs. PDP neural network methodology offers neuropsychologists co-validation procedures within narrowly defined areas of reliability and validity.

  1. MiniGhost : a miniapp for exploring boundary exchange strategies using stencil computations in scientific parallel computing.

    SciTech Connect

    Barrett, Richard Frederick; Heroux, Michael Allen; Vaughan, Courtenay Thomas

    2012-04-01

    A broad range of scientific computation involves the use of difference stencils. In a parallel computing environment, this computation is typically implemented by decomposing the spatial domain, inducing a 'halo exchange' of process-owned boundary data. This approach adheres to the Bulk Synchronous Parallel (BSP) model. Because commonly available architectures provide strong inter-node bandwidth relative to latency costs, many codes 'bulk up' these messages by aggregating data into a message as a means of reducing the number of messages. A renewed focus on non-traditional architectures and architecture features provides new opportunities for exploring alternatives to this programming approach. In this report we describe miniGhost, a 'miniapp' designed for exploration of the capabilities of current as well as emerging and future architectures within the context of these sorts of applications. MiniGhost joins the suite of miniapps developed as part of the Mantevo project.
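
    The halo-exchange pattern described in this abstract can be illustrated with a small sketch. It is not miniGhost code: the function names, the 3-point averaging stencil and the in-process "exchange" (plain array copies standing in for messages) are all assumptions made for illustration. It only shows that subdomains padded with one ghost cell per side reproduce the serial result once neighbouring boundary values are copied in before each step.

```python
import numpy as np

def stencil_serial(u, steps):
    """Reference: 3-point averaging stencil on the whole array, end values held fixed."""
    u = u.copy()
    for _ in range(steps):
        new = u.copy()
        new[1:-1] = (u[:-2] + u[1:-1] + u[2:]) / 3.0
        u = new
    return u

def stencil_with_halo(u, steps, nparts):
    """Same stencil on `nparts` subdomains that trade one-cell halos every step."""
    chunks = np.array_split(u, nparts)
    # Pad each subdomain with one ghost cell on each side.
    local = [np.concatenate(([0.0], c, [0.0])) for c in chunks]
    for _ in range(steps):
        # Halo exchange: copy each neighbour's boundary cell into our ghost cell.
        for p in range(nparts):
            if p > 0:
                local[p][0] = local[p - 1][-2]
            if p < nparts - 1:
                local[p][-1] = local[p + 1][1]
        # Local update; the two global end cells stay fixed, as in the serial code.
        updated = []
        for p, a in enumerate(local):
            new = a.copy()
            lo = 2 if p == 0 else 1
            hi = len(a) - 2 if p == nparts - 1 else len(a) - 1
            new[lo:hi] = (a[lo - 1:hi - 1] + a[lo:hi] + a[lo + 1:hi + 1]) / 3.0
            updated.append(new)
        local = updated
    # Drop the ghost cells and stitch the subdomains back together.
    return np.concatenate([a[1:-1] for a in local])

u0 = np.sin(np.linspace(0.0, 3.0, 37))
serial = stencil_serial(u0, steps=10)
decomposed = stencil_with_halo(u0, steps=10, nparts=4)
```

    In a real distributed code the two copies in the exchange loop would be messages between processes; aggregating several variables into one such message is the 'bulking up' the abstract refers to.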

  2. Teaching ethics to engineers: ethical decision making parallels the engineering design process.

    PubMed

    Bero, Bridget; Kuhlman, Alana

    2011-09-01

    In order to fulfill ABET requirements, Northern Arizona University's Civil and Environmental engineering programs incorporate professional ethics in several of their engineering courses. This paper discusses an ethics module in a 3rd year engineering design course that focuses on the design process and technical writing. Engineering students early in their student careers generally possess good black/white critical thinking skills on technical issues. Engineering design is the first time students are exposed to "grey" or multiple possible solution technical problems. To identify and solve these problems, the engineering design process is used. Ethical problems are also "grey" problems and present similar challenges to students. Students need a practical tool for solving these ethical problems. The step-wise engineering design process was used as a model to demonstrate a similar process for ethical situations. The ethical decision making process of Martin and Schinzinger was adapted for parallelism to the design process and presented to students as a step-wise technique for identification of the pertinent ethical issues, relevant moral theories, possible outcomes and a final decision. Students had the greatest difficulty identifying the broader, global issues presented in an ethical situation, but by the end of the module, were better able to not only identify the broader issues, but also to more comprehensively assess specific issues, generate solutions and a desired response to the issue.

  3. Integration of optoelectronic technologies for chip-to-chip interconnections and parallel pipeline processing

    NASA Astrophysics Data System (ADS)

    Wu, Jenming

    Digital information services such as multimedia systems and data communications require the processing and transfer of tremendous amounts of data. These data need to be stored, accessed and delivered efficiently and reliably at high speed for various user applications. This represents a great challenge for current electronic systems. Electronics is effective in providing high performance processing and computation, but its input/output (I/O) bandwidth is unable to scale with its processing power. The signal I/Os or interconnections are needed between processors and input devices, between processors for multiprocessor systems, and between processors and storage devices. Novel chip-to-chip interconnect technologies are needed to meet this challenge. This work integrates optoelectronic technologies for chip-to-chip interconnects and parallel pipeline processing. Photonic and electronic technologies are complementary to each other in the sense that electronics is more suitable for high-speed, low cost computation, and photonics is more suitable for high-bandwidth information transmission. Smart pixel technology uses electronics for logic switching and optics for chip-to-chip interconnects, thus combining the abilities of photonics and electronics nicely. This work describes both vertical and horizontal integration of smart pixel technologies for chip-to-chip optical interconnects and its applications. We present smart pixel VLSI designs in both hybrid CMOS/MQW smart pixel and monolithic GaAs smart pixel technologies. We use the CMOS/MQW technology for smart pixel array cellular logic (SPARCL) processors for SIMD parallel pipeline processing. We have tested the chip and constructed a prototype system for device characterization and system demonstration. We have verified the functionality of the system and characterized the electrical functions of the chip and the optoelectronic properties of the MQW devices. We have developed algorithms that utilize SPARCL for various

  4. Molecular tailoring approach for geometry optimization of large molecules: Energy evaluation and parallelization strategies

    NASA Astrophysics Data System (ADS)

    Ganesh, V.; Dongare, Rameshwar K.; Balanarayan, P.; Gadre, Shridhar R.

    2006-09-01

    A linear-scaling scheme for estimating the electronic energy, gradients, and Hessian of a large molecule at ab initio level of theory based on fragment set cardinality is presented. With this proposition, a general, cardinality-guided molecular tailoring approach (CG-MTA) for ab initio geometry optimization of large molecules is implemented. The method employs energy gradients extracted from fragment wave functions, enabling computations otherwise impractical on PC hardware. Further, the method is readily amenable to large scale coarse-grain parallelization with minimal communication among nodes, resulting in a near-linear speedup. CG-MTA is applied for density-functional-theory-based geometry optimization of a variety of molecules including α-tocopherol, taxol, γ-cyclodextrin, and two conformations of polyglycine. In the tests performed, energy and gradient estimates obtained from CG-MTA during optimization runs show an excellent agreement with those obtained from actual computation. Accuracy of the Hessian obtained employing CG-MTA provides good hope for the application of Hessian-based geometry optimization to large molecules.

  5. Calculating Floquet states of large quantum systems: A parallelization strategy and its cluster implementation

    NASA Astrophysics Data System (ADS)

    Laptyeva, T. V.; Kozinov, E. A.; Meyerov, I. B.; Ivanchenko, M. V.; Denisov, S. V.; Hänggi, P.

    2016-04-01

    We present a numerical approach to calculate non-equilibrium eigenstates of a periodically time-modulated quantum system. The approach is based on the use of a chain of single-step propagating operators. Each operator is time-specific and constructed by combining the Magnus expansion of the time-dependent system Hamiltonian with the Chebyshev expansion of an operator exponent. The construction of the unitary Floquet operator, which evolves a system state over the full modulation period, is performed by propagating the identity matrix over the period. The independence of the evolution of basis vectors makes the propagation stage suitable for realization on a parallel cluster. Once the propagation stage is completed, a routine diagonalization of the Floquet matrix is performed. Finally, an additional propagation round, now involving the eigenvectors as the initial states, allows one to resolve the time-dependence of the Floquet states and calculate their characteristics. We demonstrate the accuracy and scalability of the algorithm by applying it to calculate the Floquet states of two quantum models, namely (i) a synthesized random-matrix Hamiltonian and (ii) a many-body Bose-Hubbard dimer, both of the size up to 10^4 states.
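
    The propagation stage summarized above (building the Floquet operator by pushing the identity matrix through a chain of single-step propagators, then diagonalizing) can be sketched in a few lines. This is only an illustration under stated assumptions: the two-level Hamiltonian is hypothetical, units with hbar = 1 are assumed, and each step uses a midpoint Hamiltonian exponentiated by eigendecomposition as a simple stand-in for the Magnus and Chebyshev expansions named in the abstract.

```python
import numpy as np

def hamiltonian(t, period):
    """Hypothetical driven two-level system: sigma_z + 0.5*cos(2*pi*t/T)*sigma_x."""
    sz = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)
    sx = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)
    return sz + 0.5 * np.cos(2.0 * np.pi * t / period) * sx

def step_propagator(h, dt):
    """exp(-i*h*dt) for Hermitian h (stand-in for the Magnus/Chebyshev step)."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

def floquet_operator(period, nsteps):
    """Propagate the identity over one period; every column evolves independently."""
    dim = hamiltonian(0.0, period).shape[0]
    dt = period / nsteps
    u = np.eye(dim, dtype=complex)
    for k in range(nsteps):
        t_mid = (k + 0.5) * dt
        u = step_propagator(hamiltonian(t_mid, period), dt) @ u
    return u

period = 1.0
U = floquet_operator(period, nsteps=200)
eigvals, eigvecs = np.linalg.eig(U)           # "routine diagonalization"
quasienergies = -np.angle(eigvals) / period   # eigenvalue = exp(-i * eps * T)
```

    Because the columns of the identity never interact during propagation, they could be distributed over cluster nodes and gathered only for the diagonalization, which is the independence the abstract exploits.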

  6. Mars sampling strategy and aeolian processes

    NASA Technical Reports Server (NTRS)

    Greeley, Ronald

    1988-01-01

    It is critical that the geological context of planetary samples (both in situ analyses and return samples) be well known and documented. Apollo experience showed that this goal is often difficult to achieve even for a planet on which surficial processes are relatively restricted. On Mars, the variety of present and past surface processes is much greater than on the Moon and establishing the geological context of samples will be much more difficult. In addition to impact hardening, Mars has been modified by running water, periglacial activity, wind, and other processes, all of which have the potential for profoundly affecting the geological integrity of potential samples. Aeolian, or wind, processes are ubiquitous on Mars. In the absence of liquid water on the surface, aeolian activity dominates the present surface as documented by frequent dust storms (both local and global), landforms such as dunes, and variable features, i.e., albedo patterns which change their size, shape, and position with time in response to the wind.

  7. Neural processes in symmetry perception: a parallel spatio-temporal model.

    PubMed

    Zhu, Tao

    2014-04-01

    Symmetry is usually computationally expensive to detect reliably, while it is relatively easy to perceive. In spite of many attempts to understand the neurofunctional properties of symmetry processing, no symmetry-specific activation was found in earlier cortical areas. Psychophysical evidence relating to the processing mechanisms suggests that the basic processes of symmetry perception would not perform a serial, point-by-point comparison of structural features but rather operate in parallel. Here, modeling of neural processes in psychophysical detection of bilateral texture symmetry is considered. A simple fine-grained algorithm that is capable of performing symmetry estimation without explicit comparison of remote elements is introduced. A computational model of symmetry perception is then described to characterize the underlying mechanisms as one-dimensional spatio-temporal neural processes, each of which is mediated by intracellular horizontal connections in primary visual cortex and adopts the proposed algorithm for the neural computation. Simulated experiments have been performed to show the efficiency and the dynamics of the model. Model and human performances are comparable for symmetry perception of intensity images. Interestingly, the responses of V1 neurons to propagation activities reflecting higher-order perceptual computations have been reported in neurophysiologic experiments.

  8. Massively Parallel Signal Processing using the Graphics Processing Unit for Real-Time Brain-Computer Interface Feature Extraction.

    PubMed

    Wilson, J Adam; Williams, Justin C

    2009-01-01

    The clock speeds of modern computer processors have nearly plateaued in the past 5 years. Consequently, neural prosthetic systems that rely on processing large quantities of data in a short period of time face a bottleneck, in that it may not be possible to process all of the data recorded from an electrode array with high channel counts and bandwidth, such as electrocorticographic grids or other implantable systems. Therefore, in this study a method of using the processing capabilities of a graphics card [graphics processing unit (GPU)] was developed for real-time neural signal processing of a brain-computer interface (BCI). The NVIDIA CUDA system was used to offload processing to the GPU, which is capable of running many operations in parallel, potentially greatly increasing the speed of existing algorithms. The BCI system records many channels of data, which are processed and translated into a control signal, such as the movement of a computer cursor. This signal processing chain involves computing a matrix-matrix multiplication (i.e., a spatial filter), followed by calculating the power spectral density on every channel using an auto-regressive method, and finally classifying appropriate features for control. In this study, the first two computationally intensive steps were implemented on the GPU, and the speed was compared to both the current implementation and a central processing unit-based implementation that uses multi-threading. Significant performance gains were obtained with GPU processing: the current implementation processed 1000 channels of 250 ms in 933 ms, while the new GPU method took only 27 ms, an improvement of nearly 35 times.
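
    The first stage of the chain described above, a spatial filter applied as one matrix-matrix product, and the quoted timing figures can be worked through briefly. The sketch below is not the authors' CUDA code: the common-average-reference filter, the array sizes and all names are assumptions chosen for illustration, and it runs on the CPU with NumPy. Only the three timing values are taken from the abstract.

```python
import numpy as np

def car_filter(n_channels):
    """Common average reference: subtract the across-channel mean from every channel."""
    return np.eye(n_channels) - np.full((n_channels, n_channels), 1.0 / n_channels)

rng = np.random.default_rng(0)
n_channels, n_samples = 8, 250           # illustrative sizes, not the study's 1000 channels
block = rng.standard_normal((n_channels, n_samples))

W = car_filter(n_channels)
filtered = W @ block                     # spatial filter = one matrix-matrix product

# Timing figures quoted in the abstract for a 250 ms block of 1000 channels.
block_ms, old_ms, gpu_ms = 250.0, 933.0, 27.0
speedup = old_ms / gpu_ms                # about 34.6, i.e. "nearly 35 times"
old_keeps_up = old_ms <= block_ms        # False: slower than the data arrive
gpu_keeps_up = gpu_ms <= block_ms        # True: finishes well inside one block
```

    The comparison against the 250 ms block length is what makes the result matter for real-time use: a method needing 933 ms per block falls behind the incoming data, whereas 27 ms leaves most of each block period free.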

  9. Distinct cerebellar lobules process arousal, valence and their interaction in parallel following a temporal hierarchy.

    PubMed

    Styliadis, Charis; Ioannides, Andreas A; Bamidis, Panagiotis D; Papadelis, Christos

    2015-04-15

    The cerebellum participates in emotion-related neural circuits formed by different cortical and subcortical areas, which sub-serve arousal and valence. Recent neuroimaging studies have shown a functional specificity of cerebellar lobules in the processing of emotional stimuli. However, little is known about the temporal component of this process. The goal of the current study is to assess the spatiotemporal profile of neural responses within the cerebellum during the processing of arousal and valence. We hypothesized that the excitation and timing of distinct cerebellar lobules is influenced by the emotional content of the stimuli. By using magnetoencephalography, we recorded magnetic fields from twelve healthy human individuals while they passively viewed affective pictures rated along arousal and valence. By using a beamformer, we localized gamma-band activity in the cerebellum across time and we related the foci of activity to the anatomical organization of the cerebellum. Successive cerebellar activations were observed within distinct lobules starting ~160 ms after the stimuli onset. Arousal was processed within both vermal (VI and VIIIa) and hemispheric (left Crus II) lobules. Valence (left VI) and its interaction (left V and left Crus I) with arousal were processed only within hemispheric lobules. Arousal processing was identified first at early latencies (160 ms) and was long-lived (until 980 ms). In contrast, the processing of valence and its interaction with arousal was short-lived at later stages (420-530 ms and 570-640 ms respectively). Our findings provide for the first time evidence that distinct cerebellar lobules process arousal, valence, and their interaction in a parallel yet temporally hierarchical manner determined by the emotional content of the stimuli. PMID:25665964

  10. Extended parallel process model and H5N1 influenza virus.

    PubMed

    Siu, Wanda

    2008-04-01

    This study integrated the Extended Parallel Process Model and forewarning cues to assess the promotion of preventive measures against the H5N1 influenza virus, a significant health threat that affects Asia, Europe, and the USA. There are two types of forewarning, (1) telling the audience that they will hear messages intended to persuade them and (2) telling the audience the topic and stance of the impending persuasive message. Analysis of ratings by 265 undergraduates indicated that forewarnings of the topic and stance of a promotional message on the H5N1 virus facilitated elaboration of coping-related thoughts which enhance perceived self-efficacy and a stronger behavioral intention to combat H5N1. Conversely, the elaboration of danger-related thoughts evoked some fear but enhanced source perception.

  11. Creation of the BMA ensemble for SST using a parallel processing technique

    NASA Astrophysics Data System (ADS)

    Kim, Kwangjin; Lee, Yang Won

    2013-10-01

    Satellite products that serve the same purpose nevertheless differ in value, because each carries its own unavoidable uncertainty. In addition, such products have been generated over a long period, and they are numerous and varied. Effort is therefore needed both to reduce the uncertainty and to handle the very large data volumes. In this paper, we create an ensemble Sea Surface Temperature (SST) product using MODIS Aqua, MODIS Terra and COMS (Communication, Ocean and Meteorological Satellite). We used Bayesian Model Averaging (BMA) as the ensemble method. The principle of BMA is to synthesize the conditional probability density functions (PDFs) using posterior probabilities as weights. The posterior probabilities are estimated using the EM algorithm, and the BMA PDF is obtained as the weighted average. The ensemble SST showed the lowest RMSE and MAE, which demonstrates the applicability of BMA to satellite data ensembles. As future work, parallel processing techniques using the Hadoop framework will be adopted for more efficient computation of very large satellite data sets.
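
    The weighting step described above (posterior probabilities estimated by EM, then a weighted average of the member PDFs) can be sketched as follows. The abstract gives no implementation details, so everything here is an assumption: Gaussian member PDFs centred on each product's value, one shared variance, no bias correction, and synthetic data in place of real SST products. The function name `bma_em` is hypothetical.

```python
import numpy as np

def bma_em(members, obs, n_iter=200):
    """Estimate BMA weights and a shared variance by EM.

    members: (K, T) array, one row per ensemble member (e.g. per satellite product).
    obs:     (T,) array of reference observations used for training.
    """
    k, t = members.shape
    err2 = (obs[None, :] - members) ** 2
    w = np.full(k, 1.0 / k)                  # start from equal weights
    sigma2 = err2.mean()
    for _ in range(n_iter):
        # E-step: responsibility of member k for observation t (log-space for stability;
        # the Gaussian normalising constant cancels because sigma2 is shared).
        logz = np.log(w)[:, None] - 0.5 * err2 / sigma2
        z = np.exp(logz - logz.max(axis=0, keepdims=True))
        z /= z.sum(axis=0, keepdims=True)
        # M-step: weights are mean responsibilities; variance is the weighted squared error.
        w = z.mean(axis=1)
        sigma2 = (z * err2).sum() / t
    return w, sigma2

# Synthetic stand-in data: three "products" with increasing noise around a common truth.
rng = np.random.default_rng(1)
truth = 20.0 + rng.standard_normal(500)
members = np.stack([truth + s * rng.standard_normal(500) for s in (0.1, 1.0, 2.0)])

weights, sigma2 = bma_em(members, truth)
ensemble_mean = weights @ members            # BMA point estimate: weighted average
```

    On data like these the least noisy member should receive the largest weight; with real products a per-member bias correction is commonly fitted first, which this sketch leaves out.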

  12. Biological dosimetry by chromosome aberration scoring with parallel image processing with the Heidelberg POLYP Polyprocessor system

    SciTech Connect

    Bille, J.; Scharfenberg, H.; Maenner, R.

    1983-01-01

    Chromosome aberrations in human peripheral blood are recognized parameters of cellular damage and are used as indicators of exposure to ionizing radiation. In order to reach the low dose range, up to 10,000 metaphase cells each consisting of 46 chromosomes have to be analysed for each radiation exposed person. In order to perform this task within reasonable time limits the application of the Heidelberg POLYP Polyprocessor is considered. The POLYP consists of a number of processor modules and several global memory modules which are interconnected by a multi-common-bus for parallel data transfers and a multiple synchronization bus for processor/task-scheduling. The system is designed for handling large amounts of data in real time as is typical for image processing applications.

  13. When parallel processing in visual word recognition is not enough: new evidence from naming.

    PubMed

    Roberts, Martha Anne; Rastle, Kathleen; Coltheart, Max; Besner, Derek

    2003-06-01

    Low-frequency irregular words are named more slowly and are more error prone than low-frequency regular words (the regularity effect). Rastle and Coltheart (1999) reported that this irregularity cost is modulated by the serial position of the irregular grapheme-phoneme correspondence, such that words with early irregularities exhibit a larger cost than words with late ones. They argued that these data implicate rule-based serial processing, and they also reported a successful simulation with a model that has a rule-based serial component--the DRC model of reading aloud (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001). However, Zorzi (2000) also simulated these data with a model that operates solely in parallel. Furthermore, Kwantes and Mewhort (1999) simulated these data with a serial processing model that has no rules for converting orthography to phonology. The human data reported by Rastle and Coltheart therefore neither require a serial processing account, nor successfully discriminate among a number of computational models of reading aloud. New data are presented wherein an interaction between the effects of regularity and serial position of irregularity is again reported for human readers. The DRC model simulated this interaction; no other implemented computational model does so. The present results are thus consistent with rule-based serial processing in reading aloud.

  14. MultiScheme: a parallel-processing system based on MIT (Massachusetts Institute of Technology) scheme. Doctoral thesis

    SciTech Connect

    Miller, J.S.

    1987-09-01

    MultiScheme is a fully operational parallel-programming system based upon the Scheme dialect of Lisp. Like its Lisp ancestors, MultiScheme provides a conducive environment for prototyping and testing new linguistic structures and programming methodologies. MultiScheme supports a diverse community of users who have a wide range of interests in parallel programming. MultiScheme's flexible support for system-based experiments in parallel processing has enabled it to serve as a development vehicle for university and industrial research. At the same time, MultiScheme is sufficiently robust, and supports a sufficiently wide range of parallel-processing applications, that it has become the base for a commercial product, the Butterfly Lisp System produced by BBN Advanced Computers, Inc.

  15. In-Database Raster Analytics: Map Algebra and Parallel Processing in Oracle Spatial Georaster

    NASA Astrophysics Data System (ADS)

    Xie, Q. J.; Zhang, Z. Z.; Ravada, S.

    2012-07-01

    Over the past decade several products have been using enterprise database technology to store and manage geospatial imagery and raster data inside RDBMS, which in turn provides the best manageability and security. With the data volume growing exponentially, real-time or near real-time processing and analysis of such big data becomes more challenging. Oracle Spatial GeoRaster, different from most other products, takes the enterprise database-centric approach for both data management and data processing. This paper describes one of the central components of this database-centric approach: the processing engine built completely inside the database. Part of this processing engine is raster algebra, which we call the In-database Raster Analytics. This paper discusses the three key characteristics of this in-database analytics engine and the benefits. First, it moves the data processing closer to the data instead of moving the data to the processing, which helps achieve greater performance by overcoming the bottleneck of computer networks. Second, we designed and implemented a new raster algebra expression language. This language is based on PL/SQL and is currently focused on the "local" function type of map algebra. This language includes general arithmetic, logical and relational operators and any combination of them, which dramatically improves the analytical capability of the GeoRaster database. The third feature is the implementation of parallel processing of such operations to further improve performance. This paper also presents some sample use cases. The testing results demonstrate that this in-database approach for raster analytics can effectively help solve the biggest performance challenges we are facing today with big raster and image data.
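
    The abstract combines "local" map algebra (each output cell depends only on the same cell of the inputs) with parallel execution. The sketch below illustrates why that combination is straightforward; it is written in Python with NumPy rather than the PL/SQL-based language the paper describes, and the band names, the normalised-difference expression, the tile size and the function names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def local_op(red, nir):
    """Hypothetical 'local' expression: a normalised difference of two bands."""
    return (nir - red) / (nir + red)

def local_op_tiled(red, nir, tile_rows=64, workers=4):
    """Apply the same expression tile by tile; tiles are independent for local functions."""
    starts = range(0, red.shape[0], tile_rows)

    def run(start):
        stop = start + tile_rows
        return local_op(red[start:stop], nir[start:stop])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        tiles = list(pool.map(run, starts))   # map preserves tile order
    return np.vstack(tiles)

rng = np.random.default_rng(2)
red = rng.uniform(1.0, 2.0, size=(300, 200))  # strictly positive, so no division by zero
nir = rng.uniform(1.0, 2.0, size=(300, 200))

whole = local_op(red, nir)
tiled = local_op_tiled(red, nir)
```

    Because no cell needs its neighbours, the tiles can be processed in any order and on any number of workers without changing the result; focal or zonal functions would need overlapping tiles or a reduction step instead.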

  16. NMR Spectroscopy: Processing Strategies (by Peter Bigler)

    NASA Astrophysics Data System (ADS)

    Mills, Nancy S.

    1998-06-01

    Peter Bigler. VCH: New York, 1997. 249 pp. ISBN 3-527-28812-0. $99.00. This book, part of a four-volume series planned to deal with all aspects of a standard NMR experiment, is almost the exact book I have been hoping to find. My department has acquired, as have hundreds of other undergraduate institutions, high-field NMR instrumentation and the capability of doing extremely sophisticated experiments. However, the training is often a one- or two-day experience in which the material retained by the faculty trained is garbled and filled with holes, not unlike the information our students seem to retain. This text, and the accompanying exercises based on data contained on a CD-ROM, goes a long way to fill in the gaps and clarify misunderstandings about NMR processing.

  17. A corporate strategy for the control of information processing.

    PubMed

    Lucas, H C; Turner, J A

    1982-01-01

    Although the use of information processing has become widespread, many organizations have developed systems that are basically independent of the firm's strategy. However, the authors in this article argue that the greatest benefits come when information technology is merged with strategy formulation. The article includes examples of how this has been done and presents a framework for top management direction and control of information processing.

  18. Comparing Binaural Pre-processing Strategies II

    PubMed Central

    Hu, Hongmei; Krawczyk-Becker, Martin; Marquardt, Daniel; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Bomke, Katrin; Plotz, Karsten; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-01-01

    Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users. PMID:26721921

  19. Neural decoding using a parallel sequential Monte Carlo method on point processes with ensemble effect.

    PubMed

    Xu, Kai; Wang, Yiwen; Wang, Fang; Liao, Yuxi; Zhang, Qiaosheng; Li, Hongbao; Zheng, Xiaoxiang

    2014-01-01

    Sequential Monte Carlo estimation on point processes has been successfully applied to predict the movement from neural activity. However, there exist some issues along with this method such as the simplified tuning model and the high computational complexity, which may degrade the decoding performance of motor brain machine interfaces. In this paper, we adopt a general tuning model which takes recent ensemble activity into account. The goodness-of-fit analysis demonstrates that the proposed model can predict the neuronal response more accurately than the one only depending on kinematics. A new sequential Monte Carlo algorithm based on the proposed model is constructed. The algorithm can significantly reduce the root mean square error of decoding results, which decreases 23.6% in position estimation. In addition, we accelerate the decoding speed by implementing the proposed algorithm in a massively parallel manner on a GPU. The results demonstrate that the spike trains can be decoded as point process in real time even with 8000 particles or 300 neurons, which is over 10 times faster than the serial implementation. The main contribution of our work is to enable the sequential Monte Carlo algorithm with point process observation to output the movement estimation much faster and more accurately.

  20. Architecture and design of a 500-MHz gallium-arsenide processing element for a parallel supercomputer

    NASA Technical Reports Server (NTRS)

    Fouts, Douglas J.; Butner, Steven E.

    1991-01-01

    The design of the processing element of GASP, a GaAs supercomputer with a 500-MHz instruction issue rate and 1-GHz subsystem clocks, is presented. The novel, functionally modular, block data flow architecture of GASP is described. The architecture and design of a GASP processing element is then presented. The processing element (PE) is implemented in a hybrid semiconductor module with 152 custom GaAs ICs of eight different types. The effects of the implementation technology on both the system-level architecture and the PE design are discussed. SPICE simulations indicate that parts of the PE are capable of being clocked at 1 GHz, while the rest of the PE uses a 500-MHz clock. The architecture utilizes data flow techniques at a program block level, which allows efficient execution of parallel programs while maintaining reasonably good performance on sequential programs. A simulation study of the architecture indicates that an instruction execution rate of over 30,000 MIPS can be attained with 65 PEs.

  1. Neural Decoding Using a Parallel Sequential Monte Carlo Method on Point Processes with Ensemble Effect

    PubMed Central

    Wang, Fang; Liao, Yuxi; Zheng, Xiaoxiang

    2014-01-01

    Sequential Monte Carlo estimation on point processes has been successfully applied to predict the movement from neural activity. However, there exist some issues along with this method such as the simplified tuning model and the high computational complexity, which may degrade the decoding performance of motor brain machine interfaces. In this paper, we adopt a general tuning model which takes recent ensemble activity into account. The goodness-of-fit analysis demonstrates that the proposed model can predict the neuronal response more accurately than the one only depending on kinematics. A new sequential Monte Carlo algorithm based on the proposed model is constructed. The algorithm can significantly reduce the root mean square error of decoding results, which decreases 23.6% in position estimation. In addition, we accelerate the decoding speed by implementing the proposed algorithm in a massively parallel manner on a GPU. The results demonstrate that the spike trains can be decoded as point process in real time even with 8000 particles or 300 neurons, which is over 10 times faster than the serial implementation. The main contribution of our work is to enable the sequential Monte Carlo algorithm with point process observation to output the movement estimation much faster and more accurately. PMID:24949462

  2. Parallel flow accumulation algorithms for graphical processing units with application to RUSLE model

    NASA Astrophysics Data System (ADS)

    Sten, Johan; Lilja, Harri; Hyväluoma, Jari; Westerholm, Jan; Aspnäs, Mats

    2016-04-01

    Digital elevation models (DEMs) are widely used in the modeling of surface hydrology, which typically includes the determination of flow directions and flow accumulation. The use of high-resolution DEMs increases the accuracy of flow accumulation computation, but as a drawback, the computational time may become excessively long if large areas are analyzed. In this paper we investigate the use of graphical processing units (GPUs) for efficient flow accumulation calculations. We present two new parallel flow accumulation algorithms based on dependency transfer and topological sorting and compare them to previously published flow transfer and indegree-based algorithms. We benchmark the GPU implementations against industry standards, ArcGIS and SAGA. With the flow-transfer D8 flow routing model and binary input data, a speed up of 19 is achieved compared to ArcGIS and 15 compared to SAGA. We show that on GPUs the topological sort-based flow accumulation algorithm leads on average to a speedup by a factor of 7 over the flow-transfer algorithm. Thus a total speed up of the order of 100 is achieved. We test the algorithms by applying them to the Revised Universal Soil Loss Equation (RUSLE) erosion model. For this purpose we present parallel versions of the slope, LS factor and RUSLE algorithms and show that the RUSLE erosion results for an area of 12 km x 24 km containing 72 million cells can be calculated in less than a second. Since flow accumulation is needed in many hydrological models, the developed algorithms may find use in many other applications than RUSLE modeling. The algorithm based on topological sorting is particularly promising for dynamic hydrological models where flow accumulations are repeatedly computed over an unchanged DEM.
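
    As a rough illustration of flow accumulation over a D8 flow-direction grid, the sketch below processes cells in topological order (upstream cells before the cells they drain into), using indegree counting to find that order. It is a serial Python sketch, not the GPU algorithms of the paper; the 0-7 direction encoding, the use of -1 for a cell with no outflow, and the function name are assumptions made for this example.

```python
from collections import deque

# D8 direction codes -> (row offset, column offset). This 0-7 encoding is an
# assumption for the sketch; -1 marks a cell with no outflow (pit or outlet).
D8_OFFSETS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def flow_accumulation(direction):
    """Return, for every cell, the number of cells draining through it (itself included)."""
    rows, cols = len(direction), len(direction[0])

    def downstream(r, c):
        code = direction[r][c]
        if code < 0:
            return None
        dr, dc = D8_OFFSETS[code]
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            return nr, nc
        return None  # flow leaves the grid

    # Count how many neighbours drain into each cell.
    indegree = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            target = downstream(r, c)
            if target is not None:
                indegree[target[0]][target[1]] += 1

    # Start from cells that receive no inflow and push totals downstream.
    accumulation = [[1] * cols for _ in range(rows)]
    ready = deque((r, c) for r in range(rows) for c in range(cols)
                  if indegree[r][c] == 0)
    while ready:
        r, c = ready.popleft()
        target = downstream(r, c)
        if target is None:
            continue
        tr, tc = target
        accumulation[tr][tc] += accumulation[r][c]
        indegree[tr][tc] -= 1
        if indegree[tr][tc] == 0:  # all upstream cells have been processed
            ready.append(target)
    return accumulation


# A 3 x 3 grid in which every cell eventually drains to the bottom-right corner.
example_directions = [
    [0, 0, 2],
    [0, 0, 2],
    [0, 0, -1],
]
example_accumulation = flow_accumulation(example_directions)
```

    In this toy grid the outlet cell accumulates all nine cells. Because the ordering depends only on the flow directions, it could in principle be computed once and reused when accumulation is repeated over an unchanged DEM.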

  3. Membrane Transport Processes Analyzed by a Highly Parallel Nanopore Chip System at Single Protein Resolution.

    PubMed

    Urban, Michael; Vor der Brüggen, Marc; Tampé, Robert

    2016-01-01

    Membrane protein transport at the single protein level still evades detailed analysis if the translocated substrate is non-electrogenic. Considerable efforts have been made in this field, but techniques enabling automated high-throughput transport analysis in combination with the solvent-free lipid bilayer techniques required for the analysis of membrane transporters are rare. This class of transporters, however, is crucial in cell homeostasis and is therefore a key target in drug development, and methodologies to gain new insights are desperately needed. This manuscript describes the establishment and handling of a novel biochip for the analysis of membrane protein mediated transport processes at single transporter resolution. The biochip is composed of microcavities enclosed by nanopores, is highly parallel in its design, and can be produced in industrial grade and quantity. Protein-harboring liposomes can be applied directly to the chip surface, forming self-assembled pore-spanning lipid bilayers using SSM techniques (solid supported lipid membranes). Pore-spanning parts of the membrane are freestanding, providing the interface for substrate translocation into or out of the cavity space, which can be followed by multi-spectral fluorescent readout in real time. The establishment of standard operating procedures (SOPs) allows the straightforward formation of protein-harboring lipid bilayers on the chip surface for virtually every membrane protein that can be reconstituted functionally. The sole prerequisite is the establishment of a fluorescent read-out system for non-electrogenic transport substrates. High-content screening applications can be accomplished with automated inverted fluorescent microscopes recording multiple chips in parallel. Large data sets can be analyzed using the freely available custom-designed analysis software. Three-color multi-spectral fluorescent read-out furthermore allows for unbiased data discrimination into different

  4. SIAM Conference on Parallel Processing for Scientific Computing - March 12-14, 2008

    SciTech Connect

    2008-09-08

    The themes of the 2008 conference included, but were not limited to: Programming languages, models, and compilation techniques; The transition to ubiquitous multicore/manycore processors; Scientific computing on special-purpose processors (Cell, GPUs, etc.); Architecture-aware algorithms; From scalable algorithms to scalable software; Tools for software development and performance evaluation; Global perspectives on HPC; Parallel computing in industry; Distributed/grid computing; Fault tolerance; Parallel visualization and large scale data management; and The future of parallel architectures.

  5. Parallelizing serial code for a distributed processing environment with an application to high frequency electromagnetic scattering

    NASA Astrophysics Data System (ADS)

    Work, Paul R.

    1991-12-01

    This thesis investigates the parallelization of existing serial programs in computational electromagnetics for use in a parallel environment. Existing algorithms for calculating the radar cross section of an object are covered, and a ray-tracing code is chosen for implementation on a parallel machine. Current parallel architectures are introduced and a suitable parallel machine is selected for the implementation of the chosen ray-tracing algorithm. The standard techniques for the parallelization of serial codes are discussed, including load balancing and decomposition considerations, and appropriate methods for the parallelization effort are selected. A load balancing algorithm is modified to increase the efficiency of the application, and a high level design of the structure of the serial program is presented. A detailed design of the modifications for the parallel implementation is also included, with both the high level and the detailed design specified in a high level design language called UNITY. The correctness of the design is proven using UNITY and standard logic operations. The theoretical and empirical results show that it is possible to achieve an efficient parallel application for a serial computational electromagnetic program where the characteristics of the algorithm and the target architecture critically influence the development of such an implementation.

  6. Parallel processing approach for radiative heat transfer prediction in participating media

    NASA Astrophysics Data System (ADS)

    Saltiel, C.; Naraghi, M. H. N.

    1993-10-01

    Numerical analysis of radiative transfer in participating media can be very complex. Computer simulations of practical situations often require both large computer memory and long calculation times. The use of massively parallel machines has proven very effective in simulating large complex systems. This technical note presents a unified matrix formulation for node-to-node-based radiative exchange in isotropically scattering homogeneous media using the discrete exchange factor (DEF) method. Computational implementation is compared between serial and parallel computing machines. The results demonstrate that parallel computing has the potential for changing the nature of radiative transfer calculations. Parallel computing allows for faster, more manageable calculations; it is especially effective for nonlinear problems.

  7. Parallel neural network-fuzzy expert system strategy for short-term load forecasting: System implementation and performance evaluation

    SciTech Connect

    Srinivasan, D.; Tan, S.S.; Chang, C.S.; Chan, E.K.

    1999-08-01

    The on-line implementation and results from a hybrid short-term electrical load forecaster that is being evaluated by a power utility are documented in this paper. This forecaster employs a new approach involving a parallel neural-fuzzy expert system, whereby Kohonen's self-organizing feature map with unsupervised learning is used to classify daily load patterns. Post-processing of the neural network outputs is performed with a fuzzy expert system which successfully corrects the load deviations caused by the effects of weather and holiday activity. Being highly automated, the system requires little human intervention during the process of load forecasting. A comparison made between this model and a regression-based model currently being used in the Control Centre has shown a marked improvement in load forecasting results.

  8. Rapid LC-MS drug metabolite profiling using microsomal enzyme bioreactors in a parallel processing format.

    PubMed

    Bajrami, Besnik; Zhao, Linlin; Schenkman, John B; Rusling, James F

    2009-12-15

    Silica nanoparticle bioreactors featuring thin films of enzymes and polyions were utilized in a novel high-throughput 96-well plate format for drug metabolism profiling. The utility of the approach was illustrated by investigating the metabolism of the drugs diclofenac (DCF), troglitazone (TGZ), and raloxifene, for which we observed known metabolic oxidation and bioconjugation pathways and turnover rates. A broad range of enzymes was included by utilizing human liver (HLM), rat liver (RLM) and bicistronic human-cyt P450 3A4 (bicis.-3A4) microsomes as enzyme sources. This parallel approach significantly shortens sample preparation steps compared to an earlier manual processing with nanoparticle bioreactors, allowing a range of significant enzyme reactions to be processed simultaneously. Enzyme turnover rates using the microsomal bioreactors were 2-3 fold larger compared to using conventional microsomal dispersions, most likely because of better accessibility of the enzymes. Ketoconazole (KET) and quinidine (QIN), substrates specific to cyt P450 3A enzymes, were used to demonstrate applicability to establish potentially toxic drug-drug interactions involving enzyme inhibition and acceleration. PMID:19904994

  9. T3PS v1.0: Tool for Parallel Processing in Parameter Scans

    NASA Astrophysics Data System (ADS)

    Maurer, Vinzenz

    2016-01-01

    T3PS is a program that can be used to quickly design and perform parameter scans while easily taking advantage of the multi-core architecture of current processors. It takes an easy to read and write parameter scan definition file format as input. Based on the parameter ranges and other options contained therein, it distributes the calculation of the parameter space over multiple processes and possibly computers. The derived data is saved in a plain text file format readable by most plotting software. The supported scanning strategies include: grid scan, random scan, Markov Chain Monte Carlo, numerical optimization. Several example parameter scans are shown and compared with results in the literature.
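
    To make the idea of a grid scan spread over a pool of workers concrete, here is a minimal Python sketch. It does not use T3PS or its scan definition file format; `model`, `grid_scan`, and the output layout are hypothetical names chosen for illustration. A thread pool keeps the example self-contained; for CPU-bound calculations a process pool (`concurrent.futures.ProcessPoolExecutor`) would be the closer analogue of distributing work over multiple processes.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def model(x, y):
    """Hypothetical stand-in for the expensive calculation at one parameter point."""
    return x * x + y


def grid_scan(ranges, workers=4):
    """Evaluate `model` on the Cartesian product of the given parameter ranges."""
    points = list(product(*ranges))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the order of `points`, so results line up with inputs.
        values = list(pool.map(lambda p: model(*p), points))
    return list(zip(points, values))


scan = grid_scan([[0, 1, 2], [10, 20]])

# One whitespace-separated row per point: a plain-text layout that most
# plotting tools can read.
table = "\n".join(f"{x} {y} {value}" for (x, y), value in scan)
```

    A random scan would differ only in how `points` is generated; the distribution of work over the pool stays the same.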

  10. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamically adjusting local routing strategies

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-03-16

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Each node implements a respective routing strategy for routing data through the network, the routing strategies not necessarily being the same in every node. The routing strategies implemented in the nodes are dynamically adjusted during application execution to shift network workload as required. Preferably, adjustment of routing policies in selective nodes is performed at synchronization points. The network may be dynamically monitored, and routing strategies adjusted according to detected network conditions.

  11. Change Processes and Strategies at the Local Level.

    ERIC Educational Resources Information Center

    Fullan, Michael

    Change processes at the school building level are considered in order to formulate a number of locally based strategies, derived from research, for significantly improving schools and classrooms. Part I of the three-part analysis examines, through illustration, what is known about successful change processes at the school and classroom levels.…

  12. The Myriad Strategies for Seeking Control in the Dying Process

    ERIC Educational Resources Information Center

    Schroepfer, Tracy A.; Noh, Hyunjin; Kavanaugh, Melinda

    2009-01-01

    Purpose: This study explored the role control plays in the dying process of terminally ill elders by investigating the aspects of the dying process over which they seek to exercise control, the strategies they use, and whether they desire to exercise more control. Design and Methods: In-depth face-to-face interviews were conducted with 84…

  13. Design of a massively parallel computer using bit serial processing elements

    NASA Technical Reports Server (NTRS)

    Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing

    1995-01-01

    A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
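
    The following Python sketch illustrates what bit-serial arithmetic in a SIMD setting might look like: each processing element adds two operands one bit per step with a 1-bit full adder, and every element performs the same step on its own data. The abstract does not describe the processor's instruction set, so the function names and the choice of addition as the operation are assumptions for illustration. The list comprehension only models the lockstep behaviour; it does not run in parallel.

```python
def bit_serial_add(a_bits, b_bits):
    """Add two little-endian bit lists one bit per step, as a 1-bit full adder would."""
    carry = 0
    total = []
    for a, b in zip(a_bits, b_bits):
        total.append(a ^ b ^ carry)           # sum bit for this step
        carry = (a & b) | (carry & (a ^ b))   # carry into the next step
    total.append(carry)                       # final carry-out
    return total


def to_bits(value, width):
    """Little-endian bit list of a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def simd_add(xs, ys, width=8):
    """Apply the same bit-serial addition to every operand pair (one pair per PE)."""
    return [from_bits(bit_serial_add(to_bits(x, width), to_bits(y, width)))
            for x, y in zip(xs, ys)]


sums = simd_add([3, 200, 255], [5, 100, 255])
```

    An addition of `width`-bit operands takes `width` steps regardless of how many processing elements there are, which is the usual trade-off of bit-serial designs: simple elements, with throughput recovered by using many of them.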

  14. Information-Limited Parallel Processing in Difficult Heterogeneous Covert Visual Search

    ERIC Educational Resources Information Center

    Dosher, Barbara Anne; Han, Songmei; Lu, Zhong-Lin

    2010-01-01

    Difficult visual search is often attributed to time-limited serial attention operations, although neural computations in the early visual system are parallel. Using probabilistic search models (Dosher, Han, & Lu, 2004) and a full time-course analysis of the dynamics of covert visual search, we distinguish unlimited capacity parallel versus serial…

  15. Parallel processing of Eulerian-Lagrangian, cell-based adaptive method for moving boundary problems

    NASA Astrophysics Data System (ADS)

    Kuan, Chih-Kuang

    In this study, issues and techniques related to the parallel processing of the Eulerian-Lagrangian method for multi-scale moving boundary computation are investigated. The scope of the study consists of the Eulerian approach for field equations, explicit interface-tracking, Lagrangian interface modification and reconstruction algorithms, and a cell-based unstructured adaptive mesh refinement (AMR) in a distributed-memory computation framework. We decomposed the Eulerian domain spatially along with AMR to balance the computational load of solving field equations, which is a primary cost of the entire solver. The Lagrangian domain is partitioned based on marker vicinities with respect to the Eulerian partitions to minimize inter-processor communication. Overall, the performance of an Eulerian task peaks at 10,000-20,000 cells per processor, and it is the upper bound of the performance of the Eulerian-Lagrangian method. Moreover, the load imbalance of the Lagrangian task is not as influential as the communication overhead of the Eulerian-Lagrangian tasks on the overall performance. To assess the parallel processing capabilities, a high Weber number drop collision is simulated. The high convective to viscous length scale ratios result in disparate length scale distributions; together with the moving and topologically irregular interfaces, the computational tasks require temporally and spatially resolved treatment adaptively. The techniques presented enable us to perform original studies to meet such computational requirements. Coalescence, stretch, and break-up of satellite droplets due to the interfacial instability are observed in the current study, and the history of interface evolution is in good agreement with the experimental data. The competing mechanisms of the primary and secondary droplet break up, along with the gas-liquid interfacial dynamics are systematically investigated. This study shows that Rayleigh-Taylor instability on the edge of an extruding sheet

  16. Automatic analysis (aa): efficient neuroimaging workflows and parallel processing using Matlab and XML.

    PubMed

    Cusack, Rhodri; Vicente-Grabovetsky, Alejandro; Mitchell, Daniel J; Wild, Conor J; Auer, Tibor; Linke, Annika C; Peelle, Jonathan E

    2014-01-01

    Recent years have seen neuroimaging data sets becoming richer, with larger cohorts of participants, a greater variety of acquisition techniques, and increasingly complex analyses. These advances have made data analysis pipelines complicated to set up and run (increasing the risk of human error) and time consuming to execute (restricting what analyses are attempted). Here we present an open-source framework, automatic analysis (aa), to address these concerns. Human efficiency is increased by making code modular and reusable, and managing its execution with a processing engine that tracks what has been completed and what needs to be (re)done. Analysis is accelerated by optional parallel processing of independent tasks on cluster or cloud computing resources. A pipeline comprises a series of modules that each perform a specific task. The processing engine keeps track of the data, calculating a map of upstream and downstream dependencies for each module. Existing modules are available for many analysis tasks, such as SPM-based fMRI preprocessing, individual and group level statistics, voxel-based morphometry, tractography, and multi-voxel pattern analyses (MVPA). However, aa also allows for full customization, and encourages efficient management of code: new modules may be written with only a small code overhead. aa has been used by more than 50 researchers in hundreds of neuroimaging studies comprising thousands of subjects. It has been found to be robust, fast, and efficient, for simple-single subject studies up to multimodal pipelines on hundreds of subjects. It is attractive to both novice and experienced users. aa can reduce the amount of time neuroimaging laboratories spend performing analyses and reduce errors, expanding the range of scientific questions it is practical to address.

  17. Automatic analysis (aa): efficient neuroimaging workflows and parallel processing using Matlab and XML

    PubMed Central

    Cusack, Rhodri; Vicente-Grabovetsky, Alejandro; Mitchell, Daniel J.; Wild, Conor J.; Auer, Tibor; Linke, Annika C.; Peelle, Jonathan E.

    2015-01-01

    Recent years have seen neuroimaging data sets becoming richer, with larger cohorts of participants, a greater variety of acquisition techniques, and increasingly complex analyses. These advances have made data analysis pipelines complicated to set up and run (increasing the risk of human error) and time consuming to execute (restricting what analyses are attempted). Here we present an open-source framework, automatic analysis (aa), to address these concerns. Human efficiency is increased by making code modular and reusable, and managing its execution with a processing engine that tracks what has been completed and what needs to be (re)done. Analysis is accelerated by optional parallel processing of independent tasks on cluster or cloud computing resources. A pipeline comprises a series of modules that each perform a specific task. The processing engine keeps track of the data, calculating a map of upstream and downstream dependencies for each module. Existing modules are available for many analysis tasks, such as SPM-based fMRI preprocessing, individual and group level statistics, voxel-based morphometry, tractography, and multi-voxel pattern analyses (MVPA). However, aa also allows for full customization, and encourages efficient management of code: new modules may be written with only a small code overhead. aa has been used by more than 50 researchers in hundreds of neuroimaging studies comprising thousands of subjects. It has been found to be robust, fast, and efficient, for simple-single subject studies up to multimodal pipelines on hundreds of subjects. It is attractive to both novice and experienced users. aa can reduce the amount of time neuroimaging laboratories spend performing analyses and reduce errors, expanding the range of scientific questions it is practical to address. PMID:25642185

  19. Processes in arithmetic strategy selection: a fMRI study.

    PubMed

    Taillan, Julien; Ardiale, Eléonore; Anton, Jean-Luc; Nazarian, Bruno; Félician, Olivier; Lemaire, Patrick

    2015-01-01

    This neuroimaging (functional magnetic resonance imaging) study investigated neural correlates of strategy selection. Young adults performed an arithmetic task in two different conditions. In both conditions, participants had to provide estimates of two-digit multiplication problems like 54 × 78. In the choice condition, participants had to select the better of two available rounding strategies, the rounding-up (RU) strategy (i.e., doing 60 × 80 = 4,800) or the rounding-down (RD) strategy (i.e., doing 50 × 70 = 3,500 to estimate the product of 54 × 78). In the no-choice condition, participants did not have to select a strategy on each problem but were told which strategy to use; they executed RU and RD strategies each on a series of problems. Participants also had a control task (i.e., providing correct products of multiplication problems like 40 × 50). Brain activations and performance were analyzed as a function of these conditions. Participants were able to frequently choose the better strategy in the choice condition; they were also slower when they executed the difficult RU than the easier RD. Neuroimaging data showed greater brain activations in right anterior cingulate cortex (ACC), dorso-lateral prefrontal cortex (DLPFC), and angular gyrus (ANG), when selecting (relative to executing) the better strategy on each problem. Moreover, RU was associated with more parietal cortex activation than RD. These results suggest an important role of the fronto-parietal network in strategy selection and have important implications for our further understanding and modeling of the cognitive processes underlying strategy selection.
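
    The two rounding strategies can be worked through for the abstract's own example, 54 × 78. The short Python sketch below assumes that the "better" strategy is the one whose estimate lies closer to the exact product; the abstract does not spell out this criterion, and the function names are illustrative only.

```python
def round_down(n):
    """Round a positive integer down to the nearest multiple of ten."""
    return (n // 10) * 10


def round_up(n):
    """Round a positive integer up to the nearest multiple of ten."""
    return -(-n // 10) * 10


def estimates(a, b):
    """Compare the rounding-down (RD) and rounding-up (RU) estimates of a * b."""
    exact = a * b
    rd = round_down(a) * round_down(b)
    ru = round_up(a) * round_up(b)
    # Assumption: "better" means the smaller absolute error.
    better = "RU" if abs(ru - exact) < abs(rd - exact) else "RD"
    return {"exact": exact, "RD": rd, "RU": ru, "better": better}


result = estimates(54, 78)
```

    For 54 × 78 the exact product is 4,212, so RD (3,500) is off by 712 and RU (4,800) by 588, which makes RU the closer estimate under this criterion.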

  20. Optimization of Parallel Legendre Transform using Graphics Processing Unit (GPU) for a Geodynamo Code

    NASA Astrophysics Data System (ADS)

    Lokavarapu, H. V.; Matsui, H.

    2015-12-01

    Convection and the magnetic field of the Earth's outer core are expected to have vast length scales. To resolve these flows, high performance computing is required for geodynamo simulations using the spherical harmonics transform (SHT), in which a significant portion of the execution time is spent on the Legendre transform. Calypso is a geodynamo code designed to model magnetohydrodynamics of a Boussinesq fluid in a rotating spherical shell, such as the outer core of the Earth. The code has been shown to scale well on computer clusters capable of computing at the order of 10⁵ cores using Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) parallelization for CPUs. To further optimize, we investigate three different algorithms of the SHT using GPUs. One is to preemptively compute the Legendre polynomials on the CPU before executing SHT on the GPU within the time integration loop. In the second approach, both the Legendre polynomials and the SHT are computed on the GPU simultaneously. In the third approach, we initially partition the radial grid for the forward transform and the harmonic order for the backward transform between the CPU and GPU. Thereafter, the partitioned work is computed simultaneously in the time integration loop. We examine the trade-offs between space and time, memory bandwidth and GPU computations on Maverick, a Texas Advanced Computing Center (TACC) supercomputer. We have observed improved performance using a GPU-enabled Legendre transform. Furthermore, we will compare and contrast the different algorithms in the context of GPUs.

  1. Medical ultrasound digital beamforming on a massively parallel processing array platform

    NASA Astrophysics Data System (ADS)

    Chen, Paul; Butts, Mike; Budlong, Brad

    2008-03-01

    Digital beamforming has been widely used in modern medical ultrasound instruments. Flexibility is the key advantage of a digital beamformer over the traditional analog approach. Unlike analog delay lines, digital delay can be programmed to implement new ways of beam shaping and beam steering without hardware modification. Digital beamformers can also be focused dynamically by tracking the depth and focusing the receive beam as the depth increases. By constantly updating an element weight table, a digital beamformer can dynamically increase aperture size with depth to maintain constant lateral resolution and reduce sidelobe noise. Because ultrasound digital beamformers have high I/O bandwidth and processing requirements, traditionally they have been implemented using ASICs or FPGAs that are costly both in time and in money. This paper introduces a sample implementation of a digital beamformer that is programmed in software on a Massively Parallel Processor Array (MPPA). The system consists of a host PC and a PCI Express-based beamformer accelerator with an Ambric Am2045 MPPA chip and 512 Mbytes of external memory. The Am2045 has 336 asynchronous RISCDSP processors that communicate through a configurable structure of channels, using a self-synchronizing communication protocol.
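
    The receive-side focusing described above is commonly implemented as delay-and-sum: each element's signal is shifted by its extra travel time from the focal point, weighted, and summed. The Python sketch below illustrates that general idea only; it is not the Am2045 implementation, and the function names, the on-axis focal geometry, and the 1540 m/s speed of sound (a typical soft-tissue value) are assumptions for this example.

```python
import math

SPEED_OF_SOUND = 1540.0  # m/s, a typical soft-tissue value (assumed here)


def receive_delays(element_x, depth, sample_rate):
    """Per-element delay, in samples, for a focal point on the array axis at `depth`."""
    return [round((math.hypot(x, depth) - depth) / SPEED_OF_SOUND * sample_rate)
            for x in element_x]


def delay_and_sum(channels, delays, weights, sample_index):
    """Weighted sum of the channel samples after applying each element's delay."""
    total = 0.0
    for signal, delay, weight in zip(channels, delays, weights):
        idx = sample_index + delay
        if 0 <= idx < len(signal):
            total += weight * signal[idx]
    return total


# Three elements at -2 mm, 0 mm and +2 mm, focused at 10 mm depth, 40 MHz sampling.
element_x = [-0.002, 0.0, 0.002]
delays = receive_delays(element_x, depth=0.01, sample_rate=40e6)

# Synthetic echo: an impulse that reaches each element after its own delay.
channels = [[0.0] * 200 for _ in element_x]
for signal, delay in zip(channels, delays):
    signal[100 + delay] = 1.0

focused = delay_and_sum(channels, delays, [1.0, 1.0, 1.0], 100)
unfocused = delay_and_sum(channels, [0, 0, 0], [1.0, 1.0, 1.0], 100)
```

    With the delays applied, the three impulses align and add coherently; without them only the centre element contributes. Recomputing `delays` for each depth gives dynamic focusing, and setting the outer entries of `weights` to zero at shallow depths models an aperture that grows with depth.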

  2. Early parallel processing in reading: a connectionist approach. Technical report, April-November 1986

    SciTech Connect

    Mozer, M.C.

    1986-12-01

    To what extent can information distributed across the visual field be processed in parallel? A connectionist model capable of recognizing multiple words appearing simultaneously on its retina is described that addresses this question. The model relies on the notion of a hierarchy of detectors, starting at the lowest level with position-specific primitive-feature detectors, and progressing to a level composed of position-independent letter cluster detectors. Intervening levels register successively higher-order features and also collapse over local spatial regions of the level below, resulting in less positional specificity of the detectors. Using an associative learning rule, the model has been taught to recognize a large sample of words in arbitrary retinal locations. Following this training, it is also able to recognize several words simultaneously, although under certain conditions crosstalk among words can become unmanageable. The model includes an attentional mechanism, which can limit crosstalk, and a serial readout mechanism, which is necessary for a word to reach awareness. While exhaustive simulation experiments have yet to be carried out, there are a variety of phenomena, both experimental and anecdotal, that the model appears well-equipped to account for, including: translation and scale invariant recognition, positional uncertainty at the letter and word levels, the recognition of misspelled words, the integration of information across fixations, similarity-based interference effects, and the role of focal attention in localization.

  3. A Parallel Process Growth Mixture Model of Conduct Problems and Substance Use with Risky Sexual Behavior

    PubMed Central

    Wu, Johnny; Witkiewitz, Katie; McMahon, Robert J.; Dodge, Kenneth A.

    2010-01-01

    Conduct problems, substance use, and risky sexual behavior have been shown to coexist among adolescents, which may lead to significant health problems. The current study was designed to examine relations among these problem behaviors in a community sample of children at high risk for conduct disorder. A latent growth model of childhood conduct problems showed a decreasing trend from grades K to 5. During adolescence, four concurrent conduct problem and substance use trajectory classes were identified (high conduct problems and high substance use, increasing conduct problems and increasing substance use, minimal conduct problems and increasing substance use, and minimal conduct problems and minimal substance use) using a parallel process growth mixture model. Across all substances (tobacco, binge drinking, and marijuana use), higher levels of childhood conduct problems during kindergarten predicted a greater probability of classification into more problematic adolescent trajectory classes relative to less problematic classes. For tobacco and binge drinking models, increases in childhood conduct problems over time also predicted a greater probability of classification into more problematic classes. For all models, individuals classified into more problematic classes showed higher proportions of early sexual intercourse, infrequent condom use, receiving money for sexual services, and ever contracting an STD. Specifically, tobacco use and binge drinking during early adolescence predicted higher levels of sexual risk taking into late adolescence. Results highlight the importance of studying the conjoint relations among conduct problems, substance use, and risky sexual behavior in a unified model. PMID:20558013

  4. Parallel Distributed Processing at 25: further explorations in the microstructure of cognition.

    PubMed

    Rogers, Timothy T; McClelland, James L

    2014-08-01

    This paper introduces a special issue of Cognitive Science initiated on the 25th anniversary of the publication of Parallel Distributed Processing (PDP), a two-volume work that introduced the use of neural network models as vehicles for understanding cognition. The collection surveys the core commitments of the PDP framework, the key issues the framework has addressed, and the debates the framework has spawned, and presents viewpoints on the current status of these issues. The articles focus on both historical roots and contemporary developments in learning, optimality theory, perception, memory, language, conceptual knowledge, cognitive control, and consciousness. Here we consider the approach more generally, reviewing the original motivations, the resulting framework, and the central tenets of the underlying theory. We then evaluate the impact of PDP both on the field at large and within specific subdomains of cognitive science and consider the current role of PDP models within the broader landscape of contemporary theoretical frameworks in cognitive science. Looking to the future, we consider the implications for cognitive science of the recent success of machine learning systems called "deep networks," systems that build on key ideas presented in the PDP volumes.

  5. Extensive separations (CLEAN) processing strategy compared to TRUEX strategy and sludge wash ion exchange

    SciTech Connect

    Knutson, B.J.; Jansen, G.; Zimmerman, B.D.; Seeman, S.E.; Lauerhass, L.; Hoza, M.

    1994-08-01

    Numerous pretreatment flowsheets have been proposed for processing the radioactive wastes in Hanford's 177 underground storage tanks. The CLEAN Option is examined along with two other flowsheet alternatives to quantify the trade-off of greater capital equipment and operating costs for aggressive separations with the reduced waste disposal costs and decreased environmental/health risks. The effect on the volume of HLW glass product and radiotoxicity of the LLW glass or grout product is predicted with current assumptions about waste characteristics and separations processes using a mass balance model. The prediction is made on three principal processing options: washing of tank wastes with removal of cesium and technetium from the supernatant, with washed solids routed directly to the glass (referred to as the Sludge Wash C processing strategy); the previous steps plus dissolution of the solids and removal of transuranic (TRU) elements, uranium, and strontium using solvent extraction processes (referred to as the Transuranic Extraction Option C (TRUEX-C) processing strategy); and an aggressive yet feasible processing strategy for separating the waste components to meet several main goals or objectives (referred to as the CLEAN Option processing strategy), namely that the LLW is required to meet the US Nuclear Regulatory Commission Class A limits; concentrations of technetium, iodine, and uranium are reduced as low as reasonably achievable; and HLW will be contained within 1,000 borosilicate glass canisters that meet current Hanford Waste Vitrification Plant glass specifications.

  6. Stochastic dynamics of small ensembles of non-processive molecular motors: The parallel cluster model

    NASA Astrophysics Data System (ADS)

    Erdmann, Thorsten; Albert, Philipp J.; Schwarz, Ulrich S.

    2013-11-01

    Non-processive molecular motors have to work together in ensembles in order to generate appreciable levels of force or movement. In skeletal muscle, for example, hundreds of myosin II molecules cooperate in thick filaments. In non-muscle cells, by contrast, small groups with few tens of non-muscle myosin II motors contribute to essential cellular processes such as transport, shape changes, or mechanosensing. Here we introduce a detailed and analytically tractable model for this important situation. Using a three-state crossbridge model for the myosin II motor cycle and exploiting the assumptions of fast power stroke kinetics and equal load sharing between motors in equivalent states, we reduce the stochastic reaction network to a one-step master equation for the binding and unbinding dynamics (parallel cluster model) and derive the rules for ensemble movement. We find that for constant external load, ensemble dynamics is strongly shaped by the catch bond character of myosin II, which leads to an increase of the fraction of bound motors under load and thus to firm attachment even for small ensembles. This adaptation to load results in a concave force-velocity relation described by a Hill relation. For external load provided by a linear spring, myosin II ensembles dynamically adjust themselves towards an isometric state with constant average position and load. The dynamics of the ensembles is now determined mainly by the distribution of motors over the different kinds of bound states. For increasing stiffness of the external spring, there is a sharp transition beyond which myosin II can no longer perform the power stroke. Slow unbinding from the pre-power-stroke state protects the ensembles against detachment.
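
    For orientation, a one-step master equation of the kind named in the abstract has the generic form sketched below. The symbols are assumptions introduced here, not taken from the record: i is the number of bound motors in an ensemble of size N, p_i(t) is the probability of that state, g(i) is the rate for binding one more motor (i to i+1), and r(i) is the rate for unbinding one motor (i to i-1). The load-dependent rate expressions of the parallel cluster model itself are not reproduced.

```latex
\frac{\mathrm{d}p_i}{\mathrm{d}t}
  = r(i+1)\,p_{i+1} + g(i-1)\,p_{i-1} - \bigl[r(i) + g(i)\bigr]\,p_i ,
  \qquad 0 \le i \le N ,
\qquad\text{with}\quad r(0) = 0,\; g(N) = 0,\; p_{-1} = p_{N+1} = 0 .
```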

  7. Stochastic dynamics of small ensembles of non-processive molecular motors: The parallel cluster model

    SciTech Connect

    Erdmann, Thorsten; Albert, Philipp J.; Schwarz, Ulrich S.

    2013-11-07

    Non-processive molecular motors have to work together in ensembles in order to generate appreciable levels of force or movement. In skeletal muscle, for example, hundreds of myosin II molecules cooperate in thick filaments. In non-muscle cells, by contrast, small groups with few tens of non-muscle myosin II motors contribute to essential cellular processes such as transport, shape changes, or mechanosensing. Here we introduce a detailed and analytically tractable model for this important situation. Using a three-state crossbridge model for the myosin II motor cycle and exploiting the assumptions of fast power stroke kinetics and equal load sharing between motors in equivalent states, we reduce the stochastic reaction network to a one-step master equation for the binding and unbinding dynamics (parallel cluster model) and derive the rules for ensemble movement. We find that for constant external load, ensemble dynamics is strongly shaped by the catch bond character of myosin II, which leads to an increase of the fraction of bound motors under load and thus to firm attachment even for small ensembles. This adaptation to load results in a concave force-velocity relation described by a Hill relation. For external load provided by a linear spring, myosin II ensembles dynamically adjust themselves towards an isometric state with constant average position and load. The dynamics of the ensembles is now determined mainly by the distribution of motors over the different kinds of bound states. For increasing stiffness of the external spring, there is a sharp transition beyond which myosin II can no longer perform the power stroke. Slow unbinding from the pre-power-stroke state protects the ensembles against detachment.

  8. Reconstruction for time-domain in vivo EPR 3D multigradient oximetric imaging--a parallel processing perspective.

    PubMed

    Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C

    2009-01-01

    Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 x 23 x 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.

  9. The fast multipole method on parallel clusters, multicore processors, and graphics processing units

    NASA Astrophysics Data System (ADS)

    Darve, Eric; Cecka, Cris; Takahashi, Toru

    2011-02-01

    In this article, we discuss how the fast multipole method (FMM) can be implemented on modern parallel computers, ranging from computer clusters to multicore processors and graphics cards (GPU). The FMM is a somewhat difficult application for parallel computing because of its tree structure and the fact that it requires many complex operations which are not regularly structured. Computational linear algebra with dense matrices for example allows many optimizations that leverage the regular computation pattern. FMM can be similarly optimized but we will see that the complexity of the optimization steps is greater. The discussion will start with a general presentation of FMMs. We briefly discuss parallel methods for the FMM, such as building the FMM tree in parallel, and reducing communication during the FMM procedure. Finally, we will focus on porting and optimizing the FMM on GPUs.

  10. Applications of Parallel Process HiMAP for Large Scale Multidisciplinary Problems

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru P.; Potsdam, Mark; Rodriguez, David; Kwak, Dochay (Technical Monitor)

    2000-01-01

    HiMAP is a three level parallel middleware that can be interfaced to a large scale global design environment for code independent, multidisciplinary analysis using high fidelity equations. Aerospace technology needs are rapidly changing. Computational tools compatible with the requirements of national programs such as space transportation are needed. Conventional computation tools are inadequate for modern aerospace design needs. Advanced, modular computational tools are needed, such as those that incorporate the technology of massively parallel processors (MPP).

  11. Parallel versus Serial Processing Dependencies in the Perisylvian Speech Network: A Granger Analysis of Intracranial EEG Data

    ERIC Educational Resources Information Center

    Gow, David W., Jr.; Keller, Corey J.; Eskandar, Emad; Meng, Nate; Cash, Sydney S.

    2009-01-01

    In this work, we apply Granger causality analysis to high spatiotemporal resolution intracranial EEG (iEEG) data to examine how different components of the left perisylvian language network interact during spoken language perception. The specific focus is on the characterization of serial versus parallel processing dependencies in the dominant…

  12. Testing the Theoretical Design of a Health Risk Message: Reexamining the Major Tenets of the Extended Parallel Process Model

    ERIC Educational Resources Information Center

    Gore, Thomas D.; Bracken, Cheryl Campanella

    2005-01-01

    This study examined the fear control/danger control responses that are predicted by the Extended Parallel Process Model (EPPM). In a campaign designed to inform college students about the symptoms and dangers of meningitis, participants were given either a high-threat/no-efficacy or high-efficacy/no-threat health risk message, thus testing the…

  13. Parallel Processing of Big Point Clouds Using Z-Order Partitioning

    NASA Astrophysics Data System (ADS)

    Alis, C.; Boehm, J.; Liu, K.

    2016-06-01

    As laser scanning technology improves and costs are coming down, the amount of point cloud data being generated can be prohibitively difficult and expensive to process on a single machine. This data explosion is not only limited to point cloud data. Voluminous amounts of high-dimensionality and quickly accumulating data, collectively known as Big Data, such as those generated by social media, Internet of Things devices and commercial transactions, are becoming more prevalent as well. New computing paradigms and frameworks are being developed to efficiently handle the processing of Big Data, many of which utilize a compute cluster composed of several commodity grade machines to process chunks of data in parallel. A central concept in many of these frameworks is data locality. By its nature, Big Data is large enough that the entire dataset would not fit on the memory and hard drives of a single node hence replicating the entire dataset to each worker node is impractical. The data must then be partitioned across worker nodes in a manner that minimises data transfer across the network. This is a challenge for point cloud data because there exist different ways to partition data and they may require data transfer. We propose a partitioning based on Z-order which is a form of locality-sensitive hashing. The Z-order or Morton code is computed by dividing each dimension to form a grid then interleaving the binary representation of each dimension. For example, the Z-order code for the grid square with coordinates (x = 1 = 01_2, y = 3 = 11_2) is 1011_2 = 11. The number of points in each partition is controlled by the number of bits per dimension: the more bits, the fewer the points. The number of bits per dimension also controls the level of detail with more bits yielding finer partitioning. We present this partitioning method by implementing it on Apache Spark and investigating how different parameters affect the accuracy and running time of the k nearest neighbour algorithm.
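
    The bit interleaving described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Spark implementation: the function names, the `cell_size` parameter, and the convention that y supplies the higher bit of each pair (chosen so the abstract's example yields 11) are assumptions, and coordinates are assumed non-negative.

```python
def morton_code(x, y, bits):
    """Interleave the low `bits` bits of grid indices x and y.

    Assumed convention: in each bit pair, y supplies the higher bit
    and x the lower one.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bit goes to even position
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bit goes to odd position
    return code


def partition_points(points, cell_size, bits):
    """Group 2D points by the Morton code of the grid cell they fall in.

    `cell_size` (hypothetical parameter) is the edge length of a grid cell;
    indices are clamped to the range representable with `bits` bits.
    """
    max_index = (1 << bits) - 1
    partitions = {}
    for px, py in points:
        gx = min(int(px // cell_size), max_index)
        gy = min(int(py // cell_size), max_index)
        partitions.setdefault(morton_code(gx, gy, bits), []).append((px, py))
    return partitions


if __name__ == "__main__":
    print(morton_code(1, 3, bits=2))
    print(partition_points([(0.5, 0.5), (1.5, 3.2), (1.9, 3.9)], 1.0, 2))
```

    Using more bits per dimension produces more, smaller cells, which is the trade-off between partition size and level of detail that the abstract describes.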

  14. Using Monte Carlo techniques and parallel processing for fragmentation analysis of explosive payloads

    SciTech Connect

    LaFarge, R.A.

    1992-01-01

    Sandia National Laboratories (SNL) launched the Los Alamos National Laboratory (LANL) sponsored ZEST flight test program from the SNL Kauai Test Facility (KTF) in the summer of 1991. The ZEST program had about 255 pounds of high explosive (HE) aboard a Talos-Castor launch vehicle. Naturally, such undertakings raise questions about the safety of personnel and the environment in the event of a premature detonation of the HE. These questions pertain not only to KTF and the island of Kauai but to the neighboring islands as well. The ability to determine realistically P_I, the probability of an explosively generated fragment impacting a given exclusion area, is an important factor in the safety analysis of any flight test involving explosive payloads. Once P_I is known, the casualty expectations C_E can be computed based on local demographics. A set of two computer codes was developed to determine P_I based on computed fragment impacts. One of these codes, SAFETIE1 (Sandia Analysis of FragmEnt TrajectorIEs), computes files of trajectory initial conditions generated in a Monte Carlo sense for a set of n explosions each containing m fragments. These initial condition files are then used to compute trajectories in a parallel processing environment using a local area network (LAN) of 40 Sun workstations. This approach saves the equivalent of 40 hours of Cray YMP time. The other code, SAFETIE2, is a postprocessor that uses an AMEER output file generated by SAFETIE1, to determine how many explosions have at least one fragment in a user defined exclusion area. The average number of fragments per explosion in the exclusion area (N-bar) is also computed (for C_E considerations). 12 refs.
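
    The counting step attributed to the postprocessor can be illustrated with a short Python sketch. It is not SAFETIE2 and does not read AMEER output: the function name, the in-memory list of impact points, and the rectangular exclusion area are all hypothetical. It estimates P_I as the fraction of simulated explosions with at least one fragment inside the area, and N-bar as the mean number of such fragments per explosion.

```python
def exclusion_stats(impacts_by_explosion, in_exclusion_area):
    """Return (p_i, n_bar) from Monte Carlo fragment impact points.

    impacts_by_explosion: list of explosions, each a list of (x, y) impacts.
    in_exclusion_area: predicate taking an (x, y) point (hypothetical helper).
    """
    n_explosions = len(impacts_by_explosion)
    hits = [
        sum(1 for point in impacts if in_exclusion_area(point))
        for impacts in impacts_by_explosion
    ]
    # Fraction of explosions with at least one fragment in the area.
    p_i = sum(1 for h in hits if h > 0) / n_explosions
    # Average number of fragments per explosion that land in the area.
    n_bar = sum(hits) / n_explosions
    return p_i, n_bar


def in_square(point):
    """Illustrative exclusion area: the square 0 <= x, y <= 2."""
    x, y = point
    return 0 <= x <= 2 and 0 <= y <= 2


if __name__ == "__main__":
    sample = [[(1, 1), (5, 5)], [(10, 10)], [(0.5, 0.5), (1.5, 1.5), (9, 9)]]
    print(exclusion_stats(sample, in_square))
```

    Because each explosion is processed independently, the per-explosion work could be split across machines in the same spirit as the workstation network mentioned in the abstract.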

  15. Parallel processing streams for motor output and sensory prediction during action preparation

    PubMed Central

    Bauer, Markus; Heinze, Hans-Jochen; Haggard, Patrick; Dolan, Raymond J.

    2014-01-01

    Sensory consequences of one's own actions are perceived as less intense than identical, externally generated stimuli. This is generally taken as evidence for sensory prediction of action consequences. Accordingly, recent theoretical models explain this attenuation by an anticipatory modulation of sensory processing prior to stimulus onset (Roussel et al. 2013) or even action execution (Brown et al. 2013). Experimentally, prestimulus changes that occur in anticipation of self-generated sensations are difficult to disentangle from more general effects of stimulus expectation, attention and task load (performing an action). Here, we show that an established manipulation of subjective agency over a stimulus leads to a predictive modulation in sensory cortex that is independent of these factors. We recorded magnetoencephalography while subjects performed a simple action with either hand and judged the loudness of a tone caused by the action. Effector selection was manipulated by subliminal motor priming. Compatible priming is known to enhance a subjective experience of agency over a consequent stimulus (Chambon and Haggard 2012). In line with this effect on subjective agency, we found stronger sensory attenuation when the action that caused the tone was compatibly primed. This perceptual effect was reflected in a transient phase-locked signal in auditory cortex before stimulus onset and motor execution. Interestingly, this sensory signal emerged at a time when the hemispheric lateralization of motor signals in M1 indicated ongoing effector selection. Our findings confirm theoretical predictions of a sensory modulation prior to self-generated sensations and support the idea that a sensory prediction is generated in parallel to motor output (Walsh and Haggard 2010), before an efference copy becomes available. PMID:25540223

  16. Parallel processing streams for motor output and sensory prediction during action preparation.

    PubMed

    Stenner, Max-Philipp; Bauer, Markus; Heinze, Hans-Jochen; Haggard, Patrick; Dolan, Raymond J

    2015-03-15

    Sensory consequences of one's own actions are perceived as less intense than identical, externally generated stimuli. This is generally taken as evidence for sensory prediction of action consequences. Accordingly, recent theoretical models explain this attenuation by an anticipatory modulation of sensory processing prior to stimulus onset (Roussel et al. 2013) or even action execution (Brown et al. 2013). Experimentally, prestimulus changes that occur in anticipation of self-generated sensations are difficult to disentangle from more general effects of stimulus expectation, attention and task load (performing an action). Here, we show that an established manipulation of subjective agency over a stimulus leads to a predictive modulation in sensory cortex that is independent of these factors. We recorded magnetoencephalography while subjects performed a simple action with either hand and judged the loudness of a tone caused by the action. Effector selection was manipulated by subliminal motor priming. Compatible priming is known to enhance a subjective experience of agency over a consequent stimulus (Chambon and Haggard 2012). In line with this effect on subjective agency, we found stronger sensory attenuation when the action that caused the tone was compatibly primed. This perceptual effect was reflected in a transient phase-locked signal in auditory cortex before stimulus onset and motor execution. Interestingly, this sensory signal emerged at a time when the hemispheric lateralization of motor signals in M1 indicated ongoing effector selection. Our findings confirm theoretical predictions of a sensory modulation prior to self-generated sensations and support the idea that a sensory prediction is generated in parallel to motor output (Walsh and Haggard 2010), before an efference copy becomes available.

  17. Effects of Organizational Signals on Text-Processing Strategies.

    ERIC Educational Resources Information Center

    Lorch, Robert F., Jr.; Lorch, Elizabeth Pugzles

    1995-01-01

    Two hypotheses about how organizational signals influence text recall were tested with 274 college students who read and recalled a text with or without signals. Results are consistent with the hypothesis that organizational signals induce readers to change their text-processing strategies. (Author/SLD)

  18. Modeling Cognitive Strategies during Complex Task Performing Process

    ERIC Educational Resources Information Center

    Mazman, Sacide Guzin; Altun, Arif

    2012-01-01

    The purpose of this study is to examine individuals' computer based complex task performing processes and strategies in order to determine the reasons of failure by cognitive task analysis method and cued retrospective think aloud with eye movement data. Study group was five senior students from Computer Education and Instructional Technologies…

  19. Parallel processing of information about location in the amygdala, entorhinal cortex and hippocampus.

    PubMed

    Gaskin, Stephane; White, Norman M

    2013-11-01

    The conditioned cue preference paradigm was used to study how rats use extra-maze cues to discriminate between 2 adjacent arms on an 8-arm radial maze, a situation in which most of the same cues can be seen from both arms but only one arm contains food. Since the food-restricted rats eat while passively confined on the food-paired arm, no responses are reinforced, so the discrimination is due to Pavlovian stimulus-reward (or outcome) learning. Consistent with other evidence that rats must move around in an environment to acquire a spatial map, we found that learning the adjacent arms CCP (ACCP) required a minimum amount of active exploration of the maze with no reinforcers present prior to passive pairing of the extra-maze cues with the food reinforcer, an instance of latent learning. Temporary inactivation of the hippocampus during the pre-exposure sessions had no effect on ACCP learning, confirming other evidence that the hippocampus is not involved in latent learning. A series of experiments identified a circuit involving fimbria-fornix and dorsal entorhinal cortex as the neural basis of latent learning in this situation. In contrast, temporary inactivation of the entorhinal cortex or hippocampus during passive training or during testing blocked ACCP learning and expression, respectively, suggesting that these two structures co-operate in using spatial information to learn the location of food on the maze during passive pairing and to express this combined information during testing. In parallel with these processes we found that the amygdala processes information leading to an equal tendency to enter both adjacent arms (even though only one was paired with food) suggesting that the stimulus information available to this structure is not sufficiently precise to discriminate between the ambiguous cues visible from the adjacent arms. Expression of the ACCP in normal rats depends on hippocampus-based learning to avoid the unpaired arm which competes with the

  20. A parallel real-time computing-cluster implementation of spotlight SAR processing

    NASA Astrophysics Data System (ADS)

    Mathew, Bipin; Rabinkin, Daniel

    2005-05-01

    The high resolution imaging capability of Synthetic Aperture Radar (SAR) is largely unaffected by atmospheric conditions and has proven to be an indispensable asset in a variety of military and civilian applications. Application of SAR methodology for real-time imaging, however, carries with it the large computational complexity and storage requirements of the image-forming algorithms. Recently, however, the rapidly diminishing cost of computing hardware and the related ascent of cluster-based computing have made parallelization of these algorithms an appealing area of investigation. This paper describes a parallel SAR processor developed at MIT Lincoln Laboratory. Several novel technologies were employed in its implementation, including pMatlab, which is a parallel extension of standard Matlab that is also being developed at MIT Lincoln Laboratory. These technologies will be described later in the document. We begin with a brief description of the basic SAR algorithm.

  1. Information-limited parallel processing in difficult heterogeneous covert visual search.

    PubMed

    Dosher, Barbara Anne; Han, Songmei; Lu, Zhong-Lin

    2010-10-01

    Difficult visual search is often attributed to time-limited serial attention operations, although neural computations in the early visual system are parallel. Using probabilistic search models (Dosher, Han, & Lu, 2004) and a full time-course analysis of the dynamics of covert visual search, we distinguish unlimited capacity parallel versus serial search mechanisms. Performance is measured for difficult and error-prone searches among heterogeneous background elements and for easy and accurate searches among homogeneous background elements. Contrary to the claims of time-limited serial attention, searches in heterogeneous backgrounds instead exhibited nearly identical search dynamics for display sizes up to 12 items. A review and new analyses indicate that most difficult as well as easy visual searches operate as an unlimited-capacity parallel analysis over the visual field within a single eye fixation, which suggests limitations in the availability of information, not temporal bottlenecks in analysis or comparison. Serial properties likely reflect overt attention expressed in eye movements.

  2. A Strategy to Support Design Processes for Fibre Reinforced Thermoset Composite Materials

    NASA Astrophysics Data System (ADS)

    Gascons, Marc; Blanco, Norbert; Mayugo, Joan Andreu; Matthys, Koen

    2012-06-01

    The concept stage in the design for a new composite part is a time when several fundamental decisions must be taken and a considerable amount of the budget is spent. Specialized commercial software packages can be used to support the decision making process in particular aspects of the project (e.g. material selection, numerical analysis, cost prediction,...). However, a complete and integrated virtual environment that covers all the steps in the process is not yet available for the composite design and manufacturing industry. This paper does not target the creation of such an overarching virtual tool, but instead presents a strategy that handles the information generated in each step of the design process, independently of the commercial packages used. Having identified a suitable design parameter shared in common with all design steps, the proposed strategy is able to evaluate the effects of design variations throughout all the design steps in parallel. A case study illustrating the strategy on an industrial part is presented.

  3. Strategy for a flexible and noncontact measuring process for freeforms

    NASA Astrophysics Data System (ADS)

    Beutler, Andreas

    2016-07-01

    The cylindrical coordinate measuring machine MarForm MFU200 can measure not only rotationally symmetric aspheric samples but also nonrotationally symmetric freeform surfaces. Applying both an optical and a tactile probe system, the measuring processes of the optical freeform surface and fiducials can be combined in a very flexible way. A strategy to measure freeforms including the determination of reference coordinate systems, the measuring process, and the analysis are discussed. In this process, fiducials defining a reference coordinate system are of fundamental importance. It is shown how different positions of fiducials can be measured.

  4. Parallel Processing of Numerical Tsunami Simulations on a High Performance Cluster based on the GDAL Library

    NASA Astrophysics Data System (ADS)

    Schroeder, Matthias; Jankowski, Cedric; Hammitzsch, Martin; Wächter, Joachim

    2014-05-01

    Thousands of numerical tsunami simulations allow the computation of inundation and run-up along the coast for vulnerable areas over time. A so-called Matching Scenario Database (MSDB) [1] contains this large number of simulations in text file format. In order to visualize these wave propagations, the scenarios have to be reprocessed automatically. In the TRIDEC project, funded by the Seventh Framework Programme of the European Union, a Virtual Scenario Database (VSDB) and a Matching Scenario Database (MSDB) were established, amongst others, by the working group of the University of Bologna (UniBo) [1]. One part of TRIDEC was the development of a new generation of Decision Support System (DSS) for Tsunami Early Warning Systems (TEWS) [2]. A working group of the GFZ German Research Centre for Geosciences was responsible for developing the Command and Control User Interface (CCUI) as the central software application, which supports operator activities, incident management and message dissemination. For integration and visualization in the CCUI, the numerical tsunami simulations from the MSDB must be converted into the shapefile format. The use of shapefiles enables much easier integration into standard Geographic Information Systems (GIS). The CCUI is also based on two widely used open source products (the GeoTools library and uDig), which provide shapefile integration a priori. In this case, for an example area around the Western Iberian margin, several thousand tsunami variations were processed. Due to the mass of data, only a program-controlled process was conceivable. In order to optimize the computing effort and operating time, an existing GFZ High Performance Computing Cluster (HPC) was used. Thus, geospatial software capable of parallel processing was sought. The FOSS tool Geospatial Data Abstraction Library (GDAL/OGR) was used to match the coordinates with the wave heights and generates the

  5. Parallel processing of real-time dynamic systems simulation on OSCAR (Optimally SCheduled Advanced multiprocessoR)

    NASA Technical Reports Server (NTRS)

    Kasahara, Hironori; Honda, Hiroki; Narita, Seinosuke

    1989-01-01

    Parallel processing of real-time dynamic systems simulation on a multiprocessor system named OSCAR is presented. In the simulation of dynamic systems, generally, the same calculations are repeated at every time step. However, the Do-all and Do-across techniques cannot be applied to parallel processing of the simulation, since there exist data dependencies from the end of an iteration to the beginning of the next iteration and, furthermore, data input and data output are required every sampling time period. Therefore, parallelism inside the calculation required for a single time step, or a large basic block which consists of arithmetic assignment statements, must be used. In the proposed method, near fine grain tasks, each of which consists of one or more floating point operations, are generated to extract the parallelism from the calculation and are assigned to processors by using optimal static scheduling at compile time, in order to reduce the large run time overhead caused by the use of near fine grain tasks. The practicality of the scheme is demonstrated on OSCAR (Optimally SCheduled Advanced multiprocessoR), which has been developed to extract advantageous features of static scheduling algorithms to the maximum extent.
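
    The idea of assigning dependent fine grain tasks to processors before run time can be sketched with a simple greedy list scheduler in Python. This is a generic heuristic swapped in for illustration, not the scheduling algorithm used on OSCAR; the function name, the task costs, and the dependency format are assumptions, and the task graph is assumed to be acyclic.

```python
from collections import defaultdict


def list_schedule(costs, deps, n_procs):
    """Greedy static list scheduling of a task graph (illustrative only).

    costs: {task: execution time}
    deps:  {task: iterable of predecessor tasks}
    Returns {task: (processor, start, finish)}.
    """
    preds = {t: set(deps.get(t, ())) for t in costs}
    succs = defaultdict(set)
    for task, ps in preds.items():
        for p in ps:
            succs[p].add(task)

    # Priority of a task: length of the longest path from it to a sink.
    level = {}

    def longest_path(task):
        if task not in level:
            tail = max((longest_path(s) for s in succs[task]), default=0)
            level[task] = costs[task] + tail
        return level[task]

    for task in costs:
        longest_path(task)

    proc_free = [0.0] * n_procs   # time at which each processor becomes idle
    schedule = {}
    remaining = set(costs)
    while remaining:
        # Tasks whose predecessors have all been scheduled.
        ready = [t for t in remaining if all(p in schedule for p in preds[t])]
        task = max(ready, key=lambda t: (level[t], t))
        data_ready = max((schedule[p][2] for p in preds[task]), default=0.0)
        # Pick the processor on which the task can start earliest.
        proc = min(range(n_procs), key=lambda i: max(proc_free[i], data_ready))
        start = max(proc_free[proc], data_ready)
        finish = start + costs[task]
        schedule[task] = (proc, start, finish)
        proc_free[proc] = finish
        remaining.remove(task)
    return schedule


if __name__ == "__main__":
    plan = list_schedule({"a": 2, "b": 3, "c": 1}, {"c": ["a", "b"]}, n_procs=2)
    for name in sorted(plan):
        print(name, plan[name])
```

    Because the whole plan is computed in advance, no scheduling decisions are left for run time, which is the overhead the abstract aims to avoid for very small tasks.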

  6. Parallel and serial processing of haptic information in man: effects of parietal lesions on sensorimotor hand function.

    PubMed

    Knecht, S; Kunesch, E; Schnitzler, A

    1996-07-01

    Recent animal studies have shown that there is an evolutionary shift within the order of primates from parallel to serial processing of haptic information. In an attempt to determine whether there is also evidence of serial processing in humans, 10 patients with parietal cortical lesions, three patients with subcortical lesions and one patient after hemispherectomy were examined. Case-by-case and across subject analysis of lesion type, sensorimotor profile and electrophysiological findings showed that in unihemispheric lesions: (a) there is little impairment of thermesthesia, nociception and vibration sense; (b) two-point discrimination and integrity of the N20 somatosensory component are highly correlated; (c) a loss of the N20 component is accompanied by a severe impairment of stereognosis; (d) conversely, in more posterior lesions astereognosis can occur with an intact N20 component; and (e) if the lesion is in the right hemisphere there is frequently impairment of graphesthesia in both hands. These data are taken to indicate serial processing from SI (as evidenced by an intact N20 component) to posterior parietal cortex allowing progressive spatial and temporal integration. In graphesthesia our data suggest an integrative function of the right parietal cortex for both sides of the body. Other sensory qualities like vibration, nociception and thermesthesia are apparently processed in a non-serial, probably parallel way involving both hemispheres. The effects of cerebral lesions in our series suggest parallel as well as serial processing of somesthetic information in man underlying the perception of different haptic features.

  7. Parallel processing approach for radiative heat transfer prediction in participating media

    SciTech Connect

    Saltiel, C.; Naraghi, M.H.N. (Manhattan College, Riverdale, NY)

    1993-12-01

    A unified matrix formulation for node-to-node-based radiative exchange in isotropically scattering inhomogeneous media is developed using the discrete exchange factor method. Computational implementations of the unified matrix formulation on serial and parallel computers are compared. 15 refs.

  8. Parallel Process Issues for Lesbian and Gay Adoptive Parents and Their Adopted Children

    ERIC Educational Resources Information Center

    Matthews, John D.; Cramer, Elizabeth P.

    2005-01-01

    Gays and lesbians, both single and coupled, are increasingly turning to adoption to create or expand their families. This manuscript specifically addresses the continuing needs of adoptees and adoptive parents by exploring key issues in the life course of gays and lesbians and their adopted children, and identifying potential parallel development…

  9. Multi-Zone Liquid Thrust Chamber Performance Code with Domain Decomposition for Parallel Processing

    NASA Technical Reports Server (NTRS)

    Navaz, Homayun K.

    2002-01-01

    -equation turbulence model, and two-phase flow. To overcome these limitations, the LTCP code is rewritten to include the multi-zone capability with domain decomposition that makes it suitable for parallel processing, i.e., enabling the code to run every zone or sub-domain on a separate processor. This can reduce the run time by a factor of 6 to 8, depending on the problem.

  10. Parallel algorithm development

    SciTech Connect

    Adams, T.F.

    1996-06-01

    Rapid changes in parallel computing technology are causing significant changes in the strategies being used for parallel algorithm development. One approach is simply to write computer code in a standard language like FORTRAN 77 or with the expectation that the compiler will produce executable code that will run in parallel. The alternatives are: (1) to build explicit message passing directly into the source code; or (2) to write source code without explicit reference to message passing or parallelism, but use a general communications library to provide efficient parallel execution. Application of these strategies is illustrated with examples of codes currently under development.

  11. [A new strategy for Chinese medicine processing technologies: coupled with individuation processed and cybernetics].

    PubMed

    Zhang, Ding-kun; Yang, Ming; Han, Xue; Lin, Jun-zhi; Wang, Jia-bo; Xiao, Xiao-he

    2015-08-01

    Stable and controllable quality of decoction pieces is an important factor in ensuring clinical efficacy. Considering the dilemma that the existing standardized processing mode cannot effectively eliminate the quality variability of raw ingredients or ensure stability between batches, we propose, for the first time, a new strategy for Chinese medicine processing technologies that couples individualized processing with cybernetics. To explain this thinking, a case study of different grades of aconite is provided. We hope this strategy will better serve clinical medicine and promote the inheritance and innovation of Chinese medicine processing skills and theories. PMID:26790315

  13. Comparisons of elastic and rigid blade-element rotor models using parallel processing technology for piloted simulations

    NASA Technical Reports Server (NTRS)

    Hill, Gary; Duval, Ronald W.; Green, John A.; Huynh, Loc C.

    1991-01-01

    A piloted comparison of rigid and aeroelastic blade-element rotor models was conducted at the Crew Station Research and Development Facility (CSRDF) at Ames Research Center. A simulation development and analysis tool, FLIGHTLAB, was used to implement these models in real time using parallel processing technology. Pilot comments and quantitative analysis performed both on-line and off-line confirmed that elastic degrees of freedom significantly affect perceived handling qualities. Trim comparisons show improved correlation with flight test data when elastic modes are modeled. The results demonstrate the efficiency with which the mathematical modeling sophistication of existing simulation facilities can be upgraded using parallel processing, and the importance of these upgrades to simulation fidelity.

  14. Combining message-passing and inter-process communication in SMP-hybrid cluster for efficient parallel medical image analysis

    NASA Astrophysics Data System (ADS)

    Tan, Sean C. S.; Schmidt, Bertil

    2004-11-01

    Efficient analysis of medical images to assist physician's decision making is an important task. However, the analysis of such images often requires sophisticated segmentation and classification algorithms. An approach to speed up these time-consuming operations is to use parallel processing. In this paper a new parallel system for medical image analysis is presented. The system combines distributed and shared memory architectures using MPI and the inter-process communication switching mechanism (IPC). MPI is used to communicate between nodes and shared-memory IPC is used to perform shared memory operations among processors within a node. We show how to map a clinical endoscopic image analysis algorithm efficiently onto this architecture. This results in an implementation with significant runtime savings.

  16. Nonlinear Memory Capacity of Parallel Time-Delay Reservoir Computers in the Processing of Multidimensional Signals.

    PubMed

    Grigoryeva, Lyudmila; Henriques, Julie; Larger, Laurent; Ortega, Juan-Pablo

    2016-07-01

    This letter addresses the reservoir design problem in the context of delay-based reservoir computers for multidimensional input signals, parallel architectures, and real-time multitasking. First, an approximating reservoir model is presented in those frameworks that provides an explicit functional link between the reservoir architecture and its performance in the execution of a specific task. Second, the inference properties of the ridge regression estimator in the multivariate context are used to assess the impact of finite sample training on the decrease of the reservoir capacity. Finally, an empirical study is conducted that shows the adequacy of the theoretical results with the empirical performances exhibited by various reservoir architectures in the execution of several nonlinear tasks with multidimensional inputs. Our results confirm the robustness properties of the parallel reservoir architecture with respect to task misspecification and parameter choice already documented in the literature. PMID:27172266

  18. On the Optimality of Serial and Parallel Processing in the Psychological Refractory Period Paradigm: Effects of the Distribution of Stimulus Onset Asynchronies

    ERIC Educational Resources Information Center

    Miller, Jeff; Ulrich, Rolf; Rolke, Bettina

    2009-01-01

    Within the context of the psychological refractory period (PRP) paradigm, we developed a general theoretical framework for deciding when it is more efficient to process two tasks in serial and when it is more efficient to process them in parallel. This analysis suggests that a serial mode is more efficient than a parallel mode under a wide variety…

  19. Diagrammatic many-body perturbation expansion for atoms and molecules: VI Experiments in vector processing and parallel processing for second-order energy calculations

    NASA Astrophysics Data System (ADS)

    Moncrieff, David; Baker, David J.; Wilson, Stephen

    1989-08-01

    The efficient evaluation of the second-order expression in the many-body perturbation theory expansion for the correlation energy on vector processing and parallel processing computers is discussed. It is argued that the linked diagram theorem not only leads to the well known theoretical advantages of the many-body perturbation theory approach which allows the calculation of correlation energies for large (i.e. extended molecules or species containing heavy atoms) systems but also decouples the many-electron problem allowing efficient implementation on parallel processing machines. Furthermore, the computation associated with each of the resulting subproblems is very well suited to vector processing machines. Timing tests are reported for the CRAY 1 and CDC Cyber 205 vector processors, for a 1 processor implementation on the CRAY X-MP/48 and the ETA-10E, and for a 4 processor implementation on the Cray X-MP/48.

  20. Data quality and processing for decision making: divergence between corporate strategy and manufacturing processes

    NASA Astrophysics Data System (ADS)

    McNeil, Ronald D.; Miele, Renato; Shaul, Dennis

    2000-10-01

    Information technology is driving improvements in manufacturing systems. The results are higher productivity and quality. However, corporate strategy is driven by a number of factors and includes data and pressure from multiple stakeholders, including employees, managers, executives, stockholders, boards, suppliers and customers. It is also driven by information about competitors and emerging technology. Much information is based on processing of data and the resulting biases of the processors. Thus, stakeholders can base inputs on faulty perceptions, which are not reality based. Prior to processing, data used may be inaccurate. Sources of data and information may include demographic reports, statistical analyses, intelligence reports (e.g., marketing data), technology and primary data collection. The reliability and validity of data, as well as the management of sources and information, are critical elements of strategy formulation. The paper explores data collection, processing and analyses from secondary and primary sources, information generation and report presentation for strategy formulation, and contrasts this with the data and information used to drive internal processes such as manufacturing. The hypothesis is that internal processes, such as manufacturing, are subordinate to corporate strategies. The impact of possible divergence in quality of decisions at the corporate level on IT driven, quality-manufacturing processes based on measurable outcomes is significant. Recommendations for IT improvements at the corporate strategy level are given.

  1. Process Simulation of Complex Biological Pathways in Physical Reactive Space and Reformulated for Massively Parallel Computing Platforms.

    PubMed

    Ganesan, Narayan; Li, Jie; Sharma, Vishakha; Jiang, Hanyu; Compagnoni, Adriana

    2016-01-01

    Biological systems encompass complexity that far surpasses many artificial systems. Modeling and simulation of large and complex biochemical pathways is a computationally intensive challenge. Traditional tools, such as ordinary differential equations, partial differential equations, stochastic master equations, and Gillespie type methods, are all limited either by their modeling fidelity or computational efficiency or both. In this work, we present a scalable computational framework based on modeling biochemical reactions in explicit 3D space, that is suitable for studying the behavior of large and complex biological pathways. The framework is designed to exploit parallelism and scalability offered by commodity massively parallel processors such as graphics processing units (GPUs) and other parallel computing platforms. The reaction modeling in 3D space is aimed at enhancing the realism of the model compared to traditional modeling tools and frameworks. We introduce the Parallel Select algorithm that is key to breaking the sequential bottleneck limiting the performance of most other tools designed to study biochemical interactions. The algorithm is designed to be computationally tractable and to handle hundreds of interacting chemical species and millions of independent agents by considering all-particle interactions within the system. We also present an implementation of the framework on the popular graphics processing units and apply it to the simulation study of the JAK-STAT Signal Transduction Pathway. The computational framework will offer a deeper insight into various biological processes within the cell and help us observe key events as they unfold in space and time. This will advance the current state-of-the-art in simulation study of large scale biological systems and also enable the realistic simulation study of macro-biological cultures, where inter-cellular interactions are prevalent.

  2. Queueing Network Models for Parallel Processing of Task Systems: an Operational Approach

    NASA Technical Reports Server (NTRS)

    Mak, Victor W. K.

    1986-01-01

    Computer performance modeling of possibly complex computations running on highly concurrent systems is considered. Earlier works in this area either dealt with a very simple program structure or resulted in methods with exponential complexity. An efficient procedure is developed to compute the performance measures for series-parallel-reducible task systems using queueing network models. The procedure is based on the concept of hierarchical decomposition and a new operational approach. Numerical results for three test cases are presented and compared to those of simulations.

  3. Nonlinear Optical Microscopy Signal Processing Strategies in Cancer

    PubMed Central

    Adur, Javier; Carvalho, Hernandes F; Cesar, Carlos L; Casco, Víctor H

    2014-01-01

    This work reviews the most relevant present-day processing methods used to improve the accuracy of multimodal nonlinear images in the detection of epithelial cancer and the supporting stroma. Special emphasis has been placed on methods of nonlinear optical (NLO) microscopy image processing such as: second harmonic to autofluorescence ageing index of dermis (SAAID), tumor-associated collagen signatures (TACS), fast Fourier transform (FFT) analysis, and gray level co-occurrence matrix (GLCM)-based methods. These strategies are presented as a set of potentially valuable diagnostic tools for early cancer detection. It may be proposed that the combination of NLO microscopy and the informatics-based image analysis approaches described in this review (all carried out on free software) may represent a powerful tool to investigate collagen organization and remodeling of extracellular matrix in carcinogenesis processes. PMID:24737930

  4. SEMICONDUCTOR DEVICES Parallel readout of two-element CdZnTe detectors with real-time digital signal processing

    NASA Astrophysics Data System (ADS)

    Zhubin, Shi; Linjun, Wang; Kaifeng, Qin; Jiahua, Min; Jijun, Zhang; Xiaoyan, Liang; Jian, Huang; Ke, Tang; Yiben, Xia

    2010-12-01

    Readout electronics, especially digital electronics, for two-element CdZnTe (CZT) detectors in parallel are developed. The preliminary results show that the detection efficiency of the two-element CZT detectors in parallel with analog electronics is as much as 1.8 and 2.1 times that of the single detectors, while the energy resolution (FWHM) is limited by that of the single detector when analog electronics are used. However, the digital method for signal processing performs considerably better than the analog method, especially in energy resolution. The energy resolution obtained with digital electronics is improved by about 26.67% compared with analog electronics alone, while the detection efficiency is almost the same. The cause of this difference is also discussed.

  5. Investigation of the applicability of a functional programming model to fault-tolerant parallel processing for knowledge-based systems

    NASA Technical Reports Server (NTRS)

    Harper, Richard

    1989-01-01

    In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper Fault-Tolerant Parallel Processor (FTPP). When used in conjunction with the FTPP's fault detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms have been implemented and are presented. A user interface has been implemented which requires minimal cognitive overhead by the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence and recovery. This user interface is described and its use demonstrated. The applicability of the functional programming style to the Activation Framework, a paradigm for intelligent systems, is then briefly described.

  6. Distributed Parallel Processing and Dynamic Load Balancing Techniques for Multidisciplinary High Speed Aircraft Design

    NASA Technical Reports Server (NTRS)

    Krasteva, Denitza T.

    1998-01-01

    Multidisciplinary design optimization (MDO) for large-scale engineering problems poses many challenges (e.g., the design of an efficient concurrent paradigm for global optimization based on disciplinary analyses, expensive computations over vast data sets, etc.). This work focuses on the application of distributed schemes for massively parallel architectures to MDO problems, as a tool for reducing computation time and solving larger problems. The specific problem considered here is configuration optimization of a high speed civil transport (HSCT), and the efficient parallelization of the embedded paradigm for reasonable design space identification. Two distributed dynamic load balancing techniques (random polling and global round robin with message combining) and two necessary termination detection schemes (global task count and token passing) were implemented and evaluated in terms of effectiveness and scalability to large problem sizes and a thousand processors. The effect of certain parameters on execution time was also inspected. Empirical results demonstrated stable performance and effectiveness for all schemes, and the parametric study showed that the selected algorithmic parameters have a negligible effect on performance.
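
    The two load balancing policies and the global-task-count termination test named in this abstract can be illustrated with a toy, single-process simulation. This is only a sketch under simplifying assumptions: time advances in lock-step, a poll costs nothing, message combining is omitted, and the function name `simulate` and its parameters are hypothetical rather than taken from the cited work.

```python
import random

def simulate(work, policy="round_robin", seed=0):
    """Step-based toy model of dynamic load balancing.

    work: initial number of tasks held by each worker.
    In every step each busy worker completes one task, and each idle worker
    polls one donor and takes half of its tasks if the donor holds at least two.
    Returns the number of steps until the global task count reaches zero.
    """
    rng = random.Random(seed)
    work = list(work)
    n = len(work)
    remaining = sum(work)   # global task count, used for termination detection
    next_donor = 0          # shared counter for global round robin
    steps = 0
    while remaining > 0:
        steps += 1
        for w in range(n):
            if work[w] > 0:
                work[w] -= 1
                remaining -= 1
                continue
            if policy == "round_robin":
                # Every idle worker reads and advances one global counter.
                donor = next_donor
                next_donor = (next_donor + 1) % n
            else:
                # Random polling: pick any worker uniformly at random.
                donor = rng.randrange(n)
            if donor != w and work[donor] >= 2:
                share = work[donor] // 2
                work[donor] -= share
                work[w] += share
    return steps
```

    For example, `simulate([16, 0, 0, 0])` finishes in fewer than the 16 steps a single worker would need, because idle workers pull work from the loaded one.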

  7. Optimized Parallelization for Nonlocal Means Based Low Dose CT Image Processing.

    PubMed

    Zhang, Libo; Yang, Benqiang; Zhuang, Zhikun; Hu, Yining; Chen, Yang; Luo, Limin; Shu, Huazhong

    2015-01-01

    Low dose CT (LDCT) images are often significantly degraded by severely increased mottled noise/artifacts, which can lead to lowered diagnostic accuracy in clinic. The nonlocal means (NLM) filtering can effectively remove mottled noise/artifacts by utilizing large-scale patch similarity information in LDCT images. But the NLM filtering application in LDCT imaging also requires high computation cost because intensive patch similarity calculation within a large searching window is often required to be used to include enough structure-similarity information for noise/artifact suppression. To improve its clinical feasibility, in this study we further optimize the parallelization of NLM filtering by avoiding the repeated computation with the row-wise intensity calculation and the symmetry weight calculation. The shared memory with fast I/O speed is also used in row-wise intensity calculation for the proposed method. Quantitative experiment demonstrates that significant acceleration can be achieved with respect to the traditional straight pixel-wise parallelization. PMID:26078781

  9. Modularisation of Vocational Training in Germany, Austria and Switzerland: Parallels and Disparities in a Modernisation Process

    ERIC Educational Resources Information Center

    Pilz, Matthias

    2012-01-01

    This article considers the modularisation of initial vocational training (including apprenticeships) as a modernisation strategy in Germany, Austria and Switzerland. Training systems are similarly structured in these three countries with the apprenticeship system at their heart, and the three national philosophies of education and training are…

  10. Massively parallel low-cost pick-and-place of optoelectronic devices by electrochemical fluidic processing.

    PubMed

    Ozkan, M; Kibar, O; Ozkan, C S; Esener, S C

    2000-09-01

    We describe a novel electrochemical technique for the nonlithographic, fluidic pick-and-place assembly of optoelectronic devices by electrical and optical addressing. An electrochemical cell was developed that consists of indium tin oxide (ITO) and n-type silicon substrates as the two electrode materials and deionized water (R = 18 MΩ) as the electrolytic medium between the two electrodes. Negatively charged polystyrene beads 0.8-20 µm in diameter, SiO2 pucks 50-100 µm in diameter, and 50-µm LEDs were successfully integrated upon a patterned silicon substrate by electrical addressing. In addition, 0.8-µm-diameter beads were integrated upon a homogeneous silicon substrate by optical addressing. This method can be applied to massively parallel assembly (>1000 x 1000 arrays) of multiple types of devices (of a wide size range) with very fast (a few seconds) and accurate positioning.

  11. Use of electron-trapping materials in optical signal processing. IV - Parallel incoherent image subtraction

    NASA Astrophysics Data System (ADS)

    Jutamulia, Suganda; Storti, George M.; Seiderman, William; Lindmayer, Joseph; Gregory, Don A.

    1993-02-01

    The application of electron trapping (ET) materials to parallel incoherent image subtraction over a wide dynamic range is examined in detail. A new incoherent image-subtraction technique based on ET materials is presented which can be applied to automation for microcircuit manufacture and inspection and potentially to data compression for videophones, teleconferencing, and high-definition TV. It is suggested that a high-quality ET thin-film could be coupled directly with a CCD chip to perform real-time image subtraction between two simultaneous scenes or subsequent frames. The advantages of the ET-based technique over the incoherent image-subtraction technique based on two liquid-crystal light valves include absence of coherent noise, high resolution, high space-bandwidth product, high speed, and cost effectiveness.

  12. Parallel processing demonstrator with plug-on-top free-space interconnect optics

    NASA Astrophysics Data System (ADS)

    Berger, Christoph; Wang, Xiaoqing; Ekman, Jeremy T.; Marchand, Philippe J.; Spaanenburg, Henk; Wang, Mark M.; Kiamilev, Fouad E.; Esener, Sadik C.

    2001-05-01

    We demonstrate a setup with 10 optically interconnected chips, which can perform a distributed radix-2 butterfly calculation for fast Fourier transformation. The setup consists of a motherboard, five multi-chip modules (MCMs, with processor/transceiver chips and laser/detector chips), four plug-on-top optics modules that provide the bidirectional optical links between the MCMs, and external control electronics. The design of the optics and optomechanics satisfies numerous real-world constraints, such as compact size (< 1 inch thick), suitability for mass production, suitability for large arrays (up to 10^3 parallel channels), compatibility with standard electronics fabrication and packaging technology, and potential for active misalignment compensation by integrating MEMS technology.
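
    For readers unfamiliar with the radix-2 butterfly mentioned in this abstract, the following sketch shows the textbook recursive decimation-in-time FFT in plain Python. It says nothing about how the demonstrator distributes the butterflies across its chips; the function names `fft_radix2` and `dft` are illustrative.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine one even and one odd term with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def dft(x):
    """Direct O(n^2) discrete Fourier transform, used only as a reference."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
```

    Each pass of the loop is independent of the others at the same recursion level, which is what makes the butterfly structure a natural candidate for distribution over several processing elements.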

  13. Parallel Algorithm for GPU Processing; for use in High Speed Machine Vision Sensing of Cotton Lint Trash

    PubMed Central

    Pelletier, Mathew G.

    2008-01-01

    One of the main hurdles standing in the way of optimal cleaning of cotton lint is the lack of sensing systems that can react fast enough to provide the control system with real-time information as to the level of trash contamination of the cotton lint. This research examines the use of programmable graphic processing units (GPU) as an alternative to the PC's traditional use of the central processing unit (CPU). The use of the GPU, as an alternative computation platform, allowed for the machine vision system to gain a significant improvement in processing time. By improving the processing time, this research seeks to address the lack of availability of rapid trash sensing systems and thus alleviate a situation in which the current systems view the cotton lint either well before, or after, the cotton is cleaned. This extended lag/lead time that is currently imposed on the cotton trash cleaning control systems, is what is responsible for system operators utilizing a very large dead-band safety buffer in order to ensure that the cotton lint is not under-cleaned. Unfortunately, the utilization of a large dead-band buffer results in the majority of the cotton lint being over-cleaned which in turn causes lint fiber-damage as well as significant losses of the valuable lint due to the excessive use of cleaning machinery. This research estimates that upwards of a 30% reduction in lint loss could be gained through the use of a tightly coupled trash sensor to the cleaning machinery control systems. This research seeks to improve processing times through the development of a new algorithm for cotton trash sensing that allows for implementation on a highly parallel architecture. Additionally, by moving the new parallel algorithm onto an alternative computing platform, the graphic processing unit “GPU”, for processing of the cotton trash images, a speed up of over 6.5 times, over optimized code running on the PC's central processing unit “CPU”, was gained. The new

  14. A review of advanced small-scale parallel bioreactor technology for accelerated process development: current state and future need.

    PubMed

    Bareither, Rachel; Pollard, David

    2011-01-01

    The pharmaceutical and biotech industries face continued pressure to reduce development costs and accelerate process development. This challenge occurs alongside the need for increased upstream experimentation to support quality by design initiatives and the pursuit of predictive models from systems biology. A small scale system enabling multiple reactions in parallel (n ≥ 20), with automated sampling and integrated to purification, would provide significant improvement (four to fivefold) to development timelines. State of the art attempts to pursue high throughput process development include shake flasks, microfluidic reactors, microtiter plates and small-scale stirred reactors. The limitations of these systems are compared to desired criteria to mimic large scale commercial processes. The comparison shows that significant technological improvement is still required to provide automated solutions that can speed upstream process development.

  15. Materials processing in space - A strategy for commercialization

    NASA Technical Reports Server (NTRS)

    Naumann, R. J.

    1978-01-01

    Major aerospace companies are talking about space factories manufacturing billions of dollars worth of high technology materials per year. On the other hand, a recent National Academy of Sciences study team saw little prospect for space manufacturing because, in their opinion, most of the disturbing effects of gravity in the processes they considered could be overcome on the ground for much less expenditure. This paper presents a current assessment of the problems and promises of the Materials Processing in Space Program and outlines a strategy for developing the first products of commercial value. These early products are expected to serve as paradigms of what can be accomplished by manufacturing in space and should stimulate industry to develop space manufacturing to whatever degree is economically justifiable.

  16. Parallel Representation of Value-Based and Finite State-Based Strategies in the Ventral and Dorsal Striatum

    PubMed Central

    Ito, Makoto; Doya, Kenji

    2015-01-01

    Previous theoretical studies of animal and human behavioral learning have focused on the dichotomy of the value-based strategy using action value functions to predict rewards and the model-based strategy using internal models to predict environmental states. However, animals and humans often take simple procedural behaviors, such as the “win-stay, lose-switch” strategy without explicit prediction of rewards or states. Here we consider another strategy, the finite state-based strategy, in which a subject selects an action depending on its discrete internal state and updates the state depending on the action chosen and the reward outcome. By analyzing choice behavior of rats in a free-choice task, we found that the finite state-based strategy fitted their behavioral choices more accurately than value-based and model-based strategies did. When fitted models were run autonomously with the same task, only the finite state-based strategy could reproduce the key feature of choice sequences. Analyses of neural activity recorded from the dorsolateral striatum (DLS), the dorsomedial striatum (DMS), and the ventral striatum (VS) identified significant fractions of neurons in all three subareas for which activities were correlated with individual states of the finite state-based strategy. The signal of internal states at the time of choice was found in DMS, and for clusters of states was found in VS. In addition, action values and state values of the value-based strategy were encoded in DMS and VS, respectively. These results suggest that both the value-based strategy and the finite state-based strategy are implemented in the striatum. PMID:26529522
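
    The finite state-based strategy described above can be illustrated with a small sketch. This is not the authors' fitted model: it is a minimal two-state machine for "win-stay, lose-switch", and the state names, action names, and transition table are hypothetical choices made only for illustration.

```python
# Illustrative sketch only: a two-state finite state-based agent that
# implements "win-stay, lose-switch". State and action names are
# hypothetical and are not taken from the paper.

# The action is selected from the discrete internal state alone.
ACTION_FOR_STATE = {"prefer_left": "left", "prefer_right": "right"}

# The state is updated from (state, chosen action, reward outcome).
TRANSITIONS = {
    ("prefer_left", "left", True): "prefer_left",     # win: stay
    ("prefer_left", "left", False): "prefer_right",   # lose: switch
    ("prefer_right", "right", True): "prefer_right",  # win: stay
    ("prefer_right", "right", False): "prefer_left",  # lose: switch
}


def run_agent(rewards_by_trial, state="prefer_left"):
    """Return the sequence of choices for a list of trials.

    Each trial is a dict mapping an action name to True (rewarded)
    or False (not rewarded).
    """
    choices = []
    for rewards in rewards_by_trial:
        action = ACTION_FOR_STATE[state]
        rewarded = rewards[action]
        choices.append(action)
        state = TRANSITIONS[(state, action, rewarded)]
    return choices
```

    Note that the agent never estimates reward values or predicts environmental states; its behavior is fully determined by the transition table, which is the distinction the abstract draws against value-based and model-based strategies.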

  17. Parallel Representation of Value-Based and Finite State-Based Strategies in the Ventral and Dorsal Striatum.

    PubMed

    Ito, Makoto; Doya, Kenji

    2015-11-01

    Previous theoretical studies of animal and human behavioral learning have focused on the dichotomy of the value-based strategy using action value functions to predict rewards and the model-based strategy using internal models to predict environmental states. However, animals and humans often take simple procedural behaviors, such as the "win-stay, lose-switch" strategy without explicit prediction of rewards or states. Here we consider another strategy, the finite state-based strategy, in which a subject selects an action depending on its discrete internal state and updates the state depending on the action chosen and the reward outcome. By analyzing choice behavior of rats in a free-choice task, we found that the finite state-based strategy fitted their behavioral choices more accurately than value-based and model-based strategies did. When fitted models were run autonomously with the same task, only the finite state-based strategy could reproduce the key feature of choice sequences. Analyses of neural activity recorded from the dorsolateral striatum (DLS), the dorsomedial striatum (DMS), and the ventral striatum (VS) identified significant fractions of neurons in all three subareas for which activities were correlated with individual states of the finite state-based strategy. The signal of internal states at the time of choice was found in DMS, and for clusters of states was found in VS. In addition, action values and state values of the value-based strategy were encoded in DMS and VS, respectively. These results suggest that both the value-based strategy and the finite state-based strategy are implemented in the striatum. PMID:26529522

  19. Utilization of parallel processing in solving the inviscid form of the average-passage equation system for multistage turbomachinery

    NASA Technical Reports Server (NTRS)

    Mulac, Richard A.; Celestina, Mark L.; Adamczyk, John J.; Misegades, Kent P.; Dawson, Jef M.

    1987-01-01

    A procedure is outlined which utilizes parallel processing to solve the inviscid form of the average-passage equation system for multistage turbomachinery along with a description of its implementation in a FORTRAN computer code, MSTAGE. A scheme to reduce the central memory requirements of the program is also detailed. Both the multitasking and I/O routines referred to are specific to the Cray X-MP line of computers and its associated SSD (Solid-State Disk). Results are presented for a simulation of a two-stage rocket engine fuel pump turbine.

  20. The utilization of parallel processing in solving the inviscid form of the average-passage equation system for multistage turbomachinery

    NASA Technical Reports Server (NTRS)

    Mulac, Richard A.; Celestina, Mark L.; Adamczyk, John J.; Misegades, Kent P.; Dawson, Jef M.

    1987-01-01

    A procedure is outlined which utilizes parallel processing to solve the inviscid form of the average-passage equation system for multistage turbomachinery along with a description of its implementation in a FORTRAN computer code, MSTAGE. A scheme to reduce the central memory requirements of the program is also detailed. Both the multitasking and I/O routines referred to in this paper are specific to the Cray X-MP line of computers and its associated SSD (Solid-state Storage Device). Results are presented for a simulation of a two-stage rocket engine fuel pump turbine.

  1. Parallel Olfactory Processing in the Honey Bee Brain: Odor Learning and Generalization under Selective Lesion of a Projection Neuron Tract

    PubMed Central

    Carcaud, Julie; Giurfa, Martin; Sandoz, Jean Christophe

    2016-01-01

    The function of parallel neural processing is a fundamental problem in Neuroscience, as it is found across sensory modalities and evolutionary lineages, from insects to humans. Recently, parallel processing has attracted increased attention in the olfactory domain, with the demonstration in both insects and mammals that different populations of second-order neurons encode and/or process odorant information differently. Among insects, Hymenoptera present a striking olfactory system with a clear neural dichotomy from the periphery to higher-order centers, based on two main tracts of second-order (projection) neurons: the medial and lateral antennal lobe tracts (m-ALT and l-ALT). To unravel the functional role of these two pathways, we combined specific lesions of the m-ALT tract with behavioral experiments, using the classical conditioning of the proboscis extension response (PER conditioning). Lesioned and intact bees had to learn to associate an odorant (1-nonanol) with sucrose. Then the bees were subjected to a generalization procedure with a range of odorants differing in terms of their carbon chain length or functional group. We show that m-ALT lesion strongly affects acquisition of an odor-sucrose association. However, lesioned bees that still learned the association showed a normal gradient of decreasing generalization responses to increasingly dissimilar odorants. Generalization responses could be predicted to some extent by in vivo calcium imaging recordings of l-ALT neurons. The m-ALT pathway therefore seems necessary for normal classical olfactory conditioning performance. PMID:26834589

  2. Parallel Olfactory Processing in the Honey Bee Brain: Odor Learning and Generalization under Selective Lesion of a Projection Neuron Tract.

    PubMed

    Carcaud, Julie; Giurfa, Martin; Sandoz, Jean Christophe

    2015-01-01

    The function of parallel neural processing is a fundamental problem in Neuroscience, as it is found across sensory modalities and evolutionary lineages, from insects to humans. Recently, parallel processing has attracted increased attention in the olfactory domain, with the demonstration in both insects and mammals that different populations of second-order neurons encode and/or process odorant information differently. Among insects, Hymenoptera present a striking olfactory system with a clear neural dichotomy from the periphery to higher-order centers, based on two main tracts of second-order (projection) neurons: the medial and lateral antennal lobe tracts (m-ALT and l-ALT). To unravel the functional role of these two pathways, we combined specific lesions of the m-ALT tract with behavioral experiments, using the classical conditioning of the proboscis extension response (PER conditioning). Lesioned and intact bees had to learn to associate an odorant (1-nonanol) with sucrose. Then the bees were subjected to a generalization procedure with a range of odorants differing in terms of their carbon chain length or functional group. We show that m-ALT lesion strongly affects acquisition of an odor-sucrose association. However, lesioned bees that still learned the association showed a normal gradient of decreasing generalization responses to increasingly dissimilar odorants. Generalization responses could be predicted to some extent by in vivo calcium imaging recordings of l-ALT neurons. The m-ALT pathway therefore seems necessary for normal classical olfactory conditioning performance. PMID:26834589

  3. Strategies for optimal operation of the tellurium electrowinning process

    SciTech Connect

    Broderick, G.; Handle, B.; Paschen, P.

    1999-02-01

    Empirical models predicting the purity of electrowon tellurium have been developed using data from 36 pilot-plant trials. Based on these models, a numerical optimization of the process was performed to identify conditions which minimize the total contamination in Pb and Se while reducing electrical consumption per kilogram of electrowon tellurium. Results indicate that product quality can be maintained and even improved while operating at the much higher electroplating production rates obtained at high current densities. Using these same process settings, the electrical consumption of the process can be reduced by up to 10% by operating at midrange temperatures of close to 50 °C. This is particularly attractive when waste heat is available at the plant to help preheat the electrolyte feed. When both Pb and Se are present as contaminants, the most energy-efficient strategy involves the use of a high current density, at a moderate temperature with high flow, for low concentrations of TeO2. If Pb is removed prior to the electrowinning process, the use of a low current density and low electrolyte feed concentration, while operating at a low temperature and moderate flow rates, provides the most significant reduction in Se codeposition.

  4. Functionally Approached Body (FAB) Strategies for Young Children Who Have Behavioral and Sensory Processing Challenges

    ERIC Educational Resources Information Center

    Pagano, John

    2005-01-01

    Functionally Approached Body (FAB) Strategies offer a clinical approach to help parents of young children with behavioral and sensory processing strategies. This article introduces the FAB Strategies, clinical strategies developed by the author for understanding and addressing young children's behavioral and sensory processing challenges. The FAB…

  5. Parallel processing and learning in simple systems. Annual report, 10 January 1987-9 January 1988

    SciTech Connect

    Mpitsos, G.J.

    1988-03-11

    To date it has been demonstrated that an experimental animal, the sea slug Pleurobranchaea, is capable of one-trial food-aversion learning, and that the muscarinic antagonist scopolamine in low doses causes an enhancement of learning. Pharmacologic binding studies using a new, ¹²⁵I form of quinuclidinyl benzilate, in addition to studies using the ³H form of this ligand, have uncovered not only the classical types of muscarinic receptors that are typical of vertebrate cortex, but also a new form that is not found in other invertebrates tested. Usually muscarinic receptors are found in low densities in invertebrate neural membranes, but the density of the new form in this animal's neural membranes is similar to the density of the classic receptors in mammalian cortex. Neurophysiological studies of individual neurons in small groups of identifiable neurons have shown that their activity is variable, as is the behavior that they take part in generating, and that the variability fits the definition of low-dimensional chaos. Findings show that such variability is an important feature of the emergence of adaptive responses arising from parallel, distributed neural networks in biological systems.

  6. Parallel Processing Performance on Multi-Core PC Cluster Distributing Communication Load to Multiple Paths

    NASA Astrophysics Data System (ADS)

    Fukunaga, Takafumi

    Due to the advent of powerful multi-core PC clusters, the computation performance of each node has increased dramatically, and this trend will continue in the future. On the other hand, the use of powerful network systems (Myrinet, InfiniBand, etc.) is expensive and tends to increase the difficulty of programming and degrade portability, because they need dedicated libraries and protocol stacks. This paper proposes a relatively simple method to improve bandwidth-oriented parallel applications by improving the communication performance without the above dedicated hardware, libraries, protocol stacks, or IEEE 802.3ad (LACP). Although this proposal resembles IEEE 802.3ad in using multiple Ethernet ports, it performs as well as or better than IEEE 802.3ad without requiring LACP switches and drivers. Moreover, whereas the performance of LACP is influenced by the environment (MAC addresses, IP addresses, etc.) because its distribution algorithm uses these parameters, the proposed method shows the same effect regardless of them.
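
    The dependence on addresses mentioned above can be illustrated with a short sketch. The abstract does not describe how the proposed method spreads traffic, so the round-robin striping below is only an assumed stand-in, and the CRC32 hash of a source/destination pair is a hypothetical substitute for the vendor-specific hash policies used by real link aggregation implementations.

```python
import zlib
from collections import Counter


def hash_port(src, dst, n_ports):
    """Address-hash distribution (hypothetical stand-in for an LACP-style
    policy): the port depends only on the source/destination pair."""
    return zlib.crc32(f"{src}->{dst}".encode()) % n_ports


def round_robin_port(frame_index, n_ports):
    """Assumed alternative: stripe successive frames across all ports,
    independent of any address."""
    return frame_index % n_ports


def port_loads(n_frames, n_ports, src, dst):
    """Count how many frames each port carries under both policies."""
    hashed = Counter(hash_port(src, dst, n_ports) for _ in range(n_frames))
    striped = Counter(round_robin_port(i, n_ports) for i in range(n_frames))
    return hashed, striped
```

    For a single pair of communicating nodes, the hash policy places every frame on one port, so the extra ports add no bandwidth, whereas striping divides the frames evenly. Which port the hash selects changes with the addresses, which is the environmental sensitivity the abstract refers to.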

  7. PANMIN: sequential and parallel global optimization procedures with a variety of options for the local search strategy

    NASA Astrophysics Data System (ADS)

    Theos, F. V.; Lagaris, I. E.; Papageorgiou, D. G.

    2004-05-01

    We present two sequential and one parallel global optimization codes, that belong to the stochastic class, and an interface routine that enables the use of the Merlin/MCL environment as a non-interactive local optimizer. This interface proved extremely important, since it provides flexibility, effectiveness and robustness to the local search task that is in turn employed by the global procedures. We demonstrate the use of the parallel code to a molecular conformation problem.

    Program summary
    Title of program: PANMIN
    Catalogue identifier: ADSU
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADSU
    Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland
    Computer for which the program is designed and others on which it has been tested: PANMIN is designed for UNIX machines. The parallel code runs on either shared memory architectures or on a distributed system. The code has been tested on a SUN Microsystems ENTERPRISE 450 with four CPUs, and on a 48-node cluster under Linux, with both the GNU g77 and the Portland group compilers. The parallel implementation is based on MPI and has been tested with LAM MPI and MPICH
    Installation: University of Ioannina, Greece
    Programming language used: Fortran-77
    Memory required to execute with typical data: Approximately O(n^2) words, where n is the number of variables
    No. of bits in a word: 64
    No. of processors used: 1 or many
    Has the code been vectorised or parallelized?: Parallelized using MPI
    No. of bytes in distributed program, including test data, etc.: 147163
    No. of lines in distributed program, including the test data, etc.: 14366
    Distribution format: gzipped tar file
    Nature of physical problem: A multitude of problems in science and engineering are often reduced to minimizing a function of many variables. There are instances that a local optimum does not correspond to the desired physical solution and hence the search for a better solution is required. Local optimization techniques can be
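
    The general idea of a stochastic global procedure that repeatedly calls a local optimizer can be sketched as a plain multistart search. This is not PANMIN's algorithm or its Merlin/MCL interface: the test function, the fixed-step gradient descent used as the local search, and all parameter values are assumptions chosen only to keep the example small.

```python
import random


def f(x):
    """Hypothetical 1-D objective with two local minima
    (near x = -1.30 and x = 1.13); the left one is the global minimum."""
    return x**4 - 3 * x**2 + x


def df(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1


def local_search(x, step=0.01, iters=500):
    """Stand-in local optimizer: fixed-step gradient descent."""
    for _ in range(iters):
        x -= step * df(x)
    return x


def multistart(n_starts=20, lo=-2.0, hi=2.0, seed=0):
    """Run the local search from random starting points and keep the best
    local minimum found."""
    rng = random.Random(seed)
    best_x = None
    for _ in range(n_starts):
        x = local_search(rng.uniform(lo, hi))
        if best_x is None or f(x) < f(best_x):
            best_x = x
    return best_x
```

    A single local search started in the right-hand basin stops at the poorer minimum near x = 1.13, which is the situation the abstract describes. Because the starting points are independent, the local searches could also be distributed across processors, for example one start per MPI rank.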

  8. Parallel Mechanisms of Sentence Processing: Assigning Roles to Constituents of Sentences.

    ERIC Educational Resources Information Center

    McClelland, James L.; Kawamoto, Alan H.

    This paper describes and illustrates a simulation model for the processing of grammatical elements in a sentence, focusing on one aspect of sentence comprehension: the assignment of the constituent elements of a sentence to the correct thematic case roles. The model addresses questions about sentence processing from a perspective very different…

  9. MPP parallel forth

    NASA Technical Reports Server (NTRS)

    Dorband, John E.

    1987-01-01

    Massively Parallel Processor (MPP) Parallel FORTH is a derivative of FORTH-83 and Unified Software Systems' Uni-FORTH. The extension of FORTH into the realm of parallel processing on the MPP is described. With few exceptions, Parallel FORTH was made to follow the description of Uni-FORTH as closely as possible. Likewise, the parallel FORTH extensions were designed as philosophically similar to serial FORTH as possible. The MPP hardware characteristics, as viewed by the FORTH programmer, are discussed. Then a description is presented of how parallel FORTH is implemented on the MPP.

  10. Massively Parallel Geostatistical Inversion of Coupled Processes in Heterogeneous Porous Media

    NASA Astrophysics Data System (ADS)

    Ngo, A.; Schwede, R. L.; Li, W.; Bastian, P.; Ippisch, O.; Cirpka, O. A.

    2012-04-01

    another level of parallelization has been added.

  11. Parallel memory processing by the CA1 region of the dorsal hippocampus and the basolateral amygdala.

    PubMed

    Cammarota, Martín; Bevilaqua, Lia R; Rossato, Janine I; Lima, Ramón H; Medina, Jorge H; Izquierdo, Iván

    2008-07-29

    There is abundant literature on the role of the basolateral amygdala (BLA) and the CA1 region of the hippocampus in memory formation of inhibitory avoidance (IA) and other behaviorally arousing tasks. Here, we investigate molecular correlates of IA consolidation in the two structures and their relation to NMDA receptors (NMDArs) and beta-adrenergic receptors (beta-ADrs). The separate posttraining administration of antagonists of NMDAr and beta-ADr to BLA and CA1 is amnesic. IA training is followed by an increase of the phosphorylation of calcium and calmodulin-dependent protein kinase II (CaMKII) and ERK2 in CA1 but only an increase of the phosphorylation of ERK2 in BLA. The changes are blocked by NMDAr antagonists but not beta-ADr antagonists in CA1, and they are blocked by beta-ADr but not NMDAr antagonists in BLA. In addition, the changes are accompanied by increased phosphorylation of tyrosine hydroxylase in BLA but not in CA1, suggesting that beta-AD modulation results from local catecholamine synthesis in the former but not in the latter structure. NMDAr blockers in CA1 do not alter the learning-induced neurochemical changes in BLA, and beta-ADr blockade in BLA does not hinder those in CA1. When put together with other data from the literature, the present findings suggest that CA1 and BLA play a role in consolidation, but they operate to an extent in parallel, suggesting that each is probably involved with different aspects of the task studied.

  12. Selecting a Control Strategy for Plug and Process Loads

    SciTech Connect

    Lobato, C.; Sheppy, M.; Brackney, L.; Pless, S.; Torcellini, P.

    2012-09-01

    Plug and Process Loads (PPLs) are building loads that are not related to general lighting, heating, ventilation, cooling, and water heating, and typically do not provide comfort to the building occupants. PPLs in commercial buildings account for almost 5% of U.S. primary energy consumption. On an individual building level, they account for approximately 25% of the total electrical load in a minimally code-compliant commercial building, and can exceed 50% in an ultra-high efficiency building such as the National Renewable Energy Laboratory's (NREL) Research Support Facility (RSF) (Lobato et al. 2010). Minimizing these loads is a primary challenge in the design and operation of an energy-efficient building. A complex array of technologies that measure and manage PPLs has emerged in the marketplace. Some fall short of manufacturer performance claims, however. NREL has been actively engaged in developing an evaluation and selection process for PPLs control, and is using this process to evaluate a range of technologies for active PPLs management that will cap RSF plug loads. Using a control strategy to match plug load use to users' required job functions is a huge untapped potential for energy savings.

  13. The evolution of concepts of vestibular peripheral information processing: toward the dynamic, adaptive, parallel processing macular model.

    PubMed

    Ross, Muriel D

    2003-09-01

    In a letter to Robert Hooke, written on 5 February, 1675, Isaac Newton wrote "If I have seen further than certain other men it is by standing upon the shoulders of giants." In his context, Newton was referring to the work of Galileo and Kepler, who preceded him. However, every field has its own giants, those men and women who went before us and, often with few tools at their disposal, uncovered the facts that enabled later researchers to advance knowledge in a particular area. This review traces the history of the evolution of views from early giants in the field of vestibular research to modern concepts of vestibular organ organization and function. Emphasis will be placed on the mammalian maculae as peripheral processors of linear accelerations acting on the head. This review shows that early, correct findings were sometimes unfortunately disregarded, impeding later investigations into the structure and function of the vestibular organs. The central themes are that the macular organs are highly complex, dynamic, adaptive, distributed parallel processors of information, and that historical references can help us to understand our own place in advancing knowledge about their complicated structure and functions.

  14. The evolution of concepts of vestibular peripheral information processing: toward the dynamic, adaptive, parallel processing macular model

    NASA Technical Reports Server (NTRS)

    Ross, Muriel D.

    2003-01-01

    In a letter to Robert Hooke, written on 5 February, 1675, Isaac Newton wrote "If I have seen further than certain other men it is by standing upon the shoulders of giants." In his context, Newton was referring to the work of Galileo and Kepler, who preceded him. However, every field has its own giants, those men and women who went before us and, often with few tools at their disposal, uncovered the facts that enabled later researchers to advance knowledge in a particular area. This review traces the history of the evolution of views from early giants in the field of vestibular research to modern concepts of vestibular organ organization and function. Emphasis will be placed on the mammalian maculae as peripheral processors of linear accelerations acting on the head. This review shows that early, correct findings were sometimes unfortunately disregarded, impeding later investigations into the structure and function of the vestibular organs. The central themes are that the macular organs are highly complex, dynamic, adaptive, distributed parallel processors of information, and that historical references can help us to understand our own place in advancing knowledge about their complicated structure and functions.

  15. The evolution of concepts of vestibular peripheral information processing: toward the dynamic, adaptive, parallel processing macular model.

    PubMed

    Ross, Muriel D

    2003-09-01

    In a letter to Robert Hooke, written on 5 February, 1675, Isaac Newton wrote "If I have seen further than certain other men it is by standing upon the shoulders of giants." In his context, Newton was referring to the work of Galileo and Kepler, who preceded him. However, every field has its own giants, those men and women who went before us and, often with few tools at their disposal, uncovered the facts that enabled later researchers to advance knowledge in a particular area. This review traces the history of the evolution of views from early giants in the field of vestibular research to modern concepts of vestibular organ organization and function. Emphasis will be placed on the mammalian maculae as peripheral processors of linear accelerations acting on the head. This review shows that early, correct findings were sometimes unfortunately disregarded, impeding later investigations into the structure and function of the vestibular organs. The central themes are that the macular organs are highly complex, dynamic, adaptive, distributed parallel processors of information, and that historical references can help us to understand our own place in advancing knowledge about their complicated structure and functions. PMID:14575392

  16. Design of a high-speed digital processing element for parallel simulation

    NASA Technical Reports Server (NTRS)

    Milner, E. J.; Cwynar, D. S.

    1983-01-01

    A prototype of a custom designed computer to be used as a processing element in a multiprocessor based jet engine simulator is described. The purpose of the custom design was to give the computer the speed and versatility required to simulate a jet engine in real time. Real time simulations are needed for closed loop testing of digital electronic engine controls. The prototype computer has a microcycle time of 133 nanoseconds. This speed was achieved by: prefetching the next instruction while the current one is executing, transporting data using high speed data busses, and using state of the art components such as a very large scale integration (VLSI) multiplier. Included are discussions of processing element requirements, design philosophy, the architecture of the custom designed processing element, the comprehensive instruction set, the diagnostic support software, and the development status of the custom design.

  17. Do Sixth-Grade Writers Need Process Strategies?

    ERIC Educational Resources Information Center

    Torrance, Mark; Fidalgo, Raquel; Robledo, Patricia

    2015-01-01

    Background: Strategy-focused writing instruction trains students both to set explicit product goals and to adopt specific procedural strategies, particularly for planning text. A number of studies have demonstrated that strategy-focused writing instruction is effective in developing writing performance. Aim: This study aimed to determine whether…

  18. Parallel image compression

    NASA Technical Reports Server (NTRS)

    Reif, John H.

    1987-01-01

    A parallel compression algorithm for the 16,384 processor MPP machine was developed. The serial version of the algorithm can be viewed as a combination of on-line dynamic lossless text compression techniques (which employ simple learning strategies) and vector quantization. These concepts are described. How these concepts are combined to form a new strategy for performing dynamic on-line lossy compression is discussed. Finally, the implementation of this algorithm in a massively parallel fashion on the MPP is discussed.
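
    Vector quantization, one of the two ingredients named above, can be sketched in a few lines. This shows only static vector quantization with a fixed codebook; the dynamic, learning-based codebook updates and the MPP implementation are not described in the abstract and are not attempted here. The codebook and block values used in the usage note are hypothetical.

```python
def nearest_codeword(block, codebook):
    """Return the index of the codeword closest to `block`
    (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(block, codebook[i]))


def encode(blocks, codebook):
    """Replace each pixel block by the index of its nearest codeword."""
    return [nearest_codeword(block, codebook) for block in blocks]


def decode(indices, codebook):
    """Reconstruct an approximation of the blocks from codeword indices."""
    return [codebook[i] for i in indices]
```

    With a codebook of three flat 4-pixel blocks, (0, 0, 0, 0), (128, 128, 128, 128) and (255, 255, 255, 255), the blocks (10, 5, 0, 3), (250, 240, 255, 251) and (120, 130, 125, 131) encode to the indices 0, 2 and 1. Decoding returns the codewords rather than the original pixels, which is why the scheme is lossy. Each block is encoded independently of the others, so the encoding step is naturally data-parallel.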

  19. Parallel Courses: Comparison (and Convergence) of Adolescent Motivational Processes in Informal and Formal Science Education Settings.

    ERIC Educational Resources Information Center

    Pyle, Eric J.

    The purpose of the study reported in this paper was to describe adolescent motivational processes. Early adolescents (N=137), their accompanying adults, and venue staff members were interviewed and observed in specific informal science education venues. A typological analysis of the data revealed: (1) activities by early adolescents take place…

  20. Differential parallel processing of olfactory information in the honeybee, Apis mellifera L.

    PubMed

    Müller, D; Abel, R; Brandt, R; Zöckler, M; Menzel, R

    2002-06-01

    Two distinct neuronal pathways connect the first olfactory neuropil, the antennal lobe, with higher integration areas, such as the mushroom bodies, via antennal lobe projection neurons. Intracellular recordings were used to address the question whether neuroanatomical features affect odor-coding properties. We found that neurons in the median antennocerebral tract code odors by latency differences or specific inhibitory phases in combination with excitatory phases, have a more specific activity profile for different odors and convey the information with a delay. The neurons of the lateral antennocerebral tract code odors by spike rate differences, have a broader activity profile for different odors, and convey the information quickly. Thus, rather preliminary information about the olfactory stimulus first reaches the mushroom bodies and the lateral horn via neurons of the lateral antennocerebral tract and subsequently odor information becomes more specified by activities of neurons of the median antennocerebral tract. We conclude that this neuroanatomical feature is not related to the distinction between different odors, but rather reflects a dual coding of the same odor stimuli by two different neuronal strategies focusing on different properties of the same stimulus. PMID:12073081

  1. A parallel offline CFD and closed-form approximation strategy for computationally efficient analysis of complex fluid flows

    NASA Astrophysics Data System (ADS)

    Allphin, Devin

    Computational fluid dynamics (CFD) solution approximations for complex fluid flow problems have become a common and powerful engineering analysis technique. These tools, though qualitatively useful, remain limited in practice by their underlying inverse relationship between simulation accuracy and overall computational expense. While a great volume of research has focused on remedying these issues inherent to CFD, one traditionally overlooked area of resource reduction for engineering analysis concerns the basic definition and determination of functional relationships for the studied fluid flow variables. This artificial relationship-building technique, called meta-modeling or surrogate/offline approximation, uses design of experiments (DOE) theory to efficiently approximate non-physical coupling between the variables of interest in a fluid flow analysis problem. By mathematically approximating these variables, DOE methods can effectively reduce the required quantity of CFD simulations, freeing computational resources for other analytical focuses. An idealized interpretation of a fluid flow problem can also be employed to create suitably accurate approximations of fluid flow variables for the purposes of engineering analysis. When used in parallel with a meta-modeling approximation, a closed-form approximation can provide useful feedback concerning proper construction, suitability, or even necessity of an offline approximation tool. It also provides a short-circuit pathway for further reducing the overall computational demands of a fluid flow analysis, again freeing resources for otherwise unsuitable resource expenditures. To validate these inferences, a design optimization problem was presented requiring the inexpensive estimation of aerodynamic forces applied to a valve operating on a simulated piston-cylinder heat engine. The determination of these forces was to be found using parallel surrogate and exact approximation methods, thus evidencing the comparative

  2. Multi-target parallel processing approach for gene-to-structure determination of the influenza polymerase PB2 subunit.

    PubMed

    Armour, Brianna L; Barnes, Steve R; Moen, Spencer O; Smith, Eric; Raymond, Amy C; Fairman, James W; Stewart, Lance J; Staker, Bart L; Begley, Darren W; Edwards, Thomas E; Lorimer, Donald D

    2013-01-01

    Pandemic outbreaks of highly virulent influenza strains can cause widespread morbidity and mortality in human populations worldwide. In the United States alone, an average of 41,400 deaths and 1.86 million hospitalizations are caused by influenza virus infection each year (1). Point mutations in the polymerase basic protein 2 subunit (PB2) have been linked to the adaptation of the viral infection in humans (2). Findings from such studies have revealed the biological significance of PB2 as a virulence factor, thus highlighting its potential as an antiviral drug target. The structural genomics program put forth by the National Institute of Allergy and Infectious Disease (NIAID) provides funding to Emerald Bio and three other Pacific Northwest institutions that together make up the Seattle Structural Genomics Center for Infectious Disease (SSGCID). The SSGCID is dedicated to providing the scientific community with three-dimensional protein structures of NIAID category A-C pathogens. Making such structural information available to the scientific community serves to accelerate structure-based drug design. Structure-based drug design plays an important role in drug development. Pursuing multiple targets in parallel greatly increases the chance of success for new lead discovery by targeting a pathway or an entire protein family. Emerald Bio has developed a high-throughput, multi-target parallel processing pipeline (MTPP) for gene-to-structure determination to support the consortium. Here we describe the protocols used to determine the structure of the PB2 subunit from four different influenza A strains. PMID:23851357

  3. Multi-target Parallel Processing Approach for Gene-to-structure Determination of the Influenza Polymerase PB2 Subunit

    PubMed Central

    Moen, Spencer O.; Smith, Eric; Raymond, Amy C.; Fairman, James W.; Stewart, Lance J.; Staker, Bart L.; Begley, Darren W.; Edwards, Thomas E.; Lorimer, Donald D.

    2013-01-01

    Pandemic outbreaks of highly virulent influenza strains can cause widespread morbidity and mortality in human populations worldwide. In the United States alone, an average of 41,400 deaths and 1.86 million hospitalizations are caused by influenza virus infection each year (1). Point mutations in the polymerase basic protein 2 subunit (PB2) have been linked to the adaptation of the viral infection in humans (2). Findings from such studies have revealed the biological significance of PB2 as a virulence factor, thus highlighting its potential as an antiviral drug target. The structural genomics program put forth by the National Institute of Allergy and Infectious Disease (NIAID) provides funding to Emerald Bio and three other Pacific Northwest institutions that together make up the Seattle Structural Genomics Center for Infectious Disease (SSGCID). The SSGCID is dedicated to providing the scientific community with three-dimensional protein structures of NIAID category A-C pathogens. Making such structural information available to the scientific community serves to accelerate structure-based drug design. Structure-based drug design plays an important role in drug development. Pursuing multiple targets in parallel greatly increases the chance of success for new lead discovery by targeting a pathway or an entire protein family. Emerald Bio has developed a high-throughput, multi-target parallel processing pipeline (MTPP) for gene-to-structure determination to support the consortium. Here we describe the protocols used to determine the structure of the PB2 subunit from four different influenza A strains. PMID:23851357

  4. Embedded parallel processing based ground control systems for small satellite telemetry

    NASA Technical Reports Server (NTRS)

    Forman, Michael L.; Hazra, Tushar K.; Troendly, Gregory M.; Nickum, William G.

    1994-01-01

    The use of networked terminals which utilize embedded processing techniques results in totally integrated, flexible, high speed, reliable, and scalable systems suitable for telemetry and data processing applications such as mission operations centers (MOC). Synergies of these terminals, coupled with the capability of each terminal to receive incoming data, allow the viewing of any defined display by any terminal from the start of data acquisition. There is no single point of failure (other than with network input) such as exists with configurations where all input data goes through a single front end processor and then to a serial string of workstations. Missions dedicated to NASA's ozone measurements program utilize the methodologies which are discussed, and result in a multimission configuration of low cost, scalable hardware and software which can be run by one flight operations team with low risk.

  5. Parallel effects of processing fluency and positive affect on familiarity-based recognition decisions for faces

    PubMed Central

    Duke, Devin; Fiacconi, Chris M.; Köhler, Stefan

    2014-01-01

    According to attribution models of familiarity assessment, people can use a heuristic in recognition-memory decisions, in which they attribute the subjective ease of processing of a memory probe to a prior encounter with the stimulus in question. Research in social cognition suggests that experienced positive affect may be the proximal cue that signals fluency in various experimental contexts. In the present study, we compared the effects of positive affect and fluency on recognition-memory judgments for faces with neutral emotional expression. We predicted that if positive affect is indeed the critical cue that signals processing fluency at retrieval, then its manipulation should produce effects that closely mirror those produced by manipulations of processing fluency. In two experiments, we employed a masked-priming procedure in combination with a Remember-Know (RK) paradigm that aimed to separate familiarity- from recollection-based memory decisions. In addition, participants performed a prime-discrimination task that allowed us to take inter-individual differences in prime awareness into account. We found highly similar effects of our priming manipulations of processing fluency and of positive affect. In both cases, the critical effect was specific to familiarity-based recognition responses. Moreover, in both experiments it was reflected in a shift toward a more liberal response bias, rather than in changed discrimination. Finally, in both experiments, the effect was found to be related to prime awareness; it was present only in participants who reported a lack of such awareness on the prime-discrimination task. These findings add to a growing body of evidence that points not only to a role of fluency, but also of positive affect in familiarity assessment. As such they are consistent with the idea that fluency itself may be hedonically marked. PMID:24795678

  6. Optical diagnostics of a single evaporating droplet using fast parallel computing on graphics processing units

    NASA Astrophysics Data System (ADS)

    Jakubczyk, D.; Migacz, S.; Derkachov, G.; Woźniak, M.; Archer, J.; Kolwas, K.

    2016-09-01

    We report on the first application of graphics processing unit (GPU) accelerated computing to improve the performance of numerical methods used for the optical characterization of evaporating microdroplets. Single microdroplets of various liquids with different volatility and molecular weight (glycerine, glycols, water, etc.), as well as mixtures of liquids and diverse suspensions, evaporate inside the electrodynamic trap under the chosen temperature and composition of atmosphere. The series of scattering patterns recorded from the evaporating microdroplets are processed by fitting complete Mie theory predictions with a gradientless lookup table method. We showed that computations on GPUs can be effectively applied to inverse scattering problems. In particular, our technique accelerated Mie scattering calculations more than 800 times compared with a single-core processor in a Matlab environment and almost 100 times compared with the corresponding code in C. Additionally, we overcame problems of the time-consuming data post-processing when some of the parameters (particularly the refractive index) of an investigated liquid are uncertain. Our program allows us to track the parameters characterizing the evaporating droplet nearly simultaneously with the progress of evaporation.
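    A gradientless lookup-table fit of the kind this abstract mentions can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: `model_pattern` is a hypothetical stand-in for a full Mie calculation, and the function names, angle range, and radius grid are invented for the example. Every candidate pattern is scored independently, which is why this step tends to map well onto GPUs.

```python
import numpy as np

def model_pattern(radius, angles):
    # Hypothetical stand-in for a Mie calculation (NOT real Mie theory):
    # a smooth, radius-dependent fringe pattern used only for illustration.
    return 1.0 + np.cos(radius * angles)

def build_lookup_table(radii, angles):
    # Precompute one pattern per candidate radius (one row per candidate).
    return np.stack([model_pattern(r, angles) for r in radii])

def fit_by_lookup(measured, table, radii):
    # Gradientless fit: score every precomputed pattern against the
    # measurement and keep the one with the smallest squared error.
    errors = np.sum((table - measured) ** 2, axis=1)
    best = int(np.argmin(errors))
    return radii[best], errors[best]

angles = np.linspace(0.5, 1.5, 200)        # assumed scattering angles (rad)
radii = np.linspace(5.0, 15.0, 1001)       # assumed candidate radii
table = build_lookup_table(radii, angles)

measured = model_pattern(9.37, angles)     # synthetic "measurement"
best_radius, best_error = fit_by_lookup(measured, table, radii)
```

    Because no derivatives are needed, the fit cannot stall in a local minimum of the error surface, but its resolution is limited by the spacing of the precomputed grid.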

  7. Parallel computers

    SciTech Connect

    Treleaven, P.

    1989-01-01

    This book presents an introduction to object-oriented, functional, and logic parallel computing on which the fifth generation of computer systems will be based. Coverage includes concepts for parallel computing languages, a parallel object-oriented system (DOOM) and its language (POOL), an object-oriented multilevel VLSI simulator using POOL, and implementation of lazy functional languages on parallel architectures.

  8. On the flow processes in sharply inclined and stalled airfoils in parallel movement and rotation

    NASA Technical Reports Server (NTRS)

    Kohler, M.

    1984-01-01

    The purpose of this study is to obtain a deeper insight into the complicated flow processes on airfoils in the region of the buoyancy maxima. To this end calculated and experimental investigations are carried out on a straight stationary, a twisted stationary and a straight rotating rectangular wing. According to the available results the method gives results which can be applied sufficiently for flow applied firmly on all sides for all rotation values. The reliability of the method may be questioned for a flow undergoing transition from the attached to the separated state or for totally separated flow and higher rotation values.

  9. A real time, FEM based optimal control algorithm and its implementation using parallel processing hardware (transistors) in a microprocessor environment

    NASA Technical Reports Server (NTRS)

    Patten, William Neff

    1989-01-01

    There is an evident need to discover a means of establishing reliable, implementable controls for systems that are plagued by nonlinear and/or uncertain model dynamics. The development of a generic controller design tool for tough-to-control systems is reported. The method utilizes a moving grid, time finite element based solution of the necessary conditions that describe an optimal controller for a system. The technique produces a discrete feedback controller. Real time laboratory experiments are now being conducted to demonstrate the viability of the method. The algorithm that results is being implemented in a microprocessor environment. Critical computational tasks are accomplished using a low cost, on-board multiprocessor (INMOS T800 Transputers) and parallel processing. Progress to date validates the methodology presented. Applications of the technique to the control of highly flexible robotic appendages are suggested.

  10. Using the extended parallel process model to prevent noise-induced hearing loss among coal miners in Appalachia

    SciTech Connect

    Murray-Johnson, L.; Witte, K.; Patel, D.; Orrego, V.; Zuckerman, C.; Maxfield, A.M.; Thimons, E.D.

    2004-12-15

    Occupational noise-induced hearing loss is the second most self-reported occupational illness or injury in the United States. Among coal miners, more than 90% of the population reports a hearing deficit by age 55. In this formative evaluation, focus groups were conducted with coal miners in Appalachia to ascertain whether miners perceive hearing loss as a major health risk and if so, what would motivate the consistent wearing of hearing protection devices (HPDs). The theoretical framework of the Extended Parallel Process Model was used to identify the miners' knowledge, attitudes, beliefs, and current behaviors regarding hearing protection. Focus group participants had strong perceived severity and varying levels of perceived susceptibility to hearing loss. Various barriers significantly reduced the self-efficacy and the response efficacy of using hearing protection.

  11. How Are Child Restricted and Repetitive Behaviors Associated with Caregiver Stress Over Time? A Parallel Process Multilevel Growth Model.

    PubMed

    Harrop, Clare; McBee, Matthew; Boyd, Brian A

    2016-05-01

    The impact of raising a child with autism spectrum disorder (ASD) is frequently accompanied by elevated caregiver stress. Examining the variables that predict these elevated rates will help us understand how caregiver stress is impacted by and impacts child behaviors. This study explored how restricted and repetitive behaviors (RRBs) contributed concurrently and longitudinally to caregiver stress in a large sample of preschoolers with ASD using parallel process multilevel growth models. Results indicated that initial rates of and change in RRBs predicted fluctuations in caregiver stress over time. When caregivers reported increased child RRBs, this was mirrored by increases in caregiver stress. Our data support the importance of targeted treatments for RRBs as change in this domain may lead to improvements in caregiver wellbeing.

  12. General neural computer architecture and its ANN-based task assignment method for parallel-distributed processing

    NASA Astrophysics Data System (ADS)

    Chao, Hu; Ray, Sylvian R.; Zheng, Nanning

    1995-06-01

    A new DSP-based neural simulating computer architecture and its ANN-based assignment method for parallel distributed processing are proposed. The hardware of the proposed neural simulating computer can be reconfigured in terms of a variety of research interests and requirements of pattern recognition. The software programming environment utilizes an intelligent compiler to perform static task assignment in both the single-task multiprocessor and multitask processor cases. An improved Hopfield neural network which can converge to the global optimal solution is employed by the compiler to map different tasks or neurons to their corresponding real processors. An approach of introducing a hidden layer to increase the computation ability of the neural simulating computer is also developed. Finally, a proof is given which shows that the use of the improved Hopfield algorithm and the modification to the network structure do not change the intrinsic properties of the original network.

  13. Using the extended parallel process model to prevent noise-induced hearing loss among coal miners in Appalachia.

    PubMed

    Murray-Johnson, Lisa; Witte, Kim; Patel, Dhaval; Orrego, Victoria; Zuckerman, Cynthia; Maxfield, Andrew M; Thimons, Edward D

    2004-12-01

    Occupational noise-induced hearing loss is the second most self-reported occupational illness or injury in the United States. Among coal miners, more than 90% of the population reports a hearing deficit by age 55. In this formative evaluation, focus groups were conducted with coal miners in Appalachia to ascertain whether miners perceive hearing loss as a major health risk and if so, what would motivate the consistent wearing of hearing protection devices (HPDs). The theoretical framework of the Extended Parallel Process Model was used to identify the miners' knowledge, attitudes, beliefs, and current behaviors regarding hearing protection. Focus group participants had strong perceived severity and varying levels of perceived susceptibility to hearing loss. Various barriers significantly reduced the self-efficacy and the response efficacy of using hearing protection. PMID:15539545

  14. Operation and performance of a longitudinal damping system using parallel digital signal processing

    SciTech Connect

    Fox, J.D.; Hindi, H.; Linscott, I.

    1994-06-01

    A programmable longitudinal feedback system based on four AT&T 1610 digital signal processors has been developed as a component of the PEP-II R&D program. This Longitudinal Quick Prototype is a proof of concept for the PEP-II system and implements full speed bunch-by-bunch signal processing for storage rings with bunch spacings of 4 ns. The design implements, via software, a general purpose feedback controller which allows the system to be operated at several accelerator facilities. The system configuration used for tests at the LBL Advanced Light Source is described. Open and closed loop results showing the detection and calculation of feedback signals from bunch motion are presented, and the system is shown to damp coupled-bunch instabilities in the ALS. Use of the system for accelerator diagnostics is illustrated via measurement of injection transients and analysis of open loop bunch motion.

  15. ISP: an optimal out-of-core image-set processing streaming architecture for parallel heterogeneous systems.

    PubMed

    Ha, Linh Khanh; Krüger, Jens; Dihl Comba, João Luiz; Silva, Cláudio T; Joshi, Sarang

    2012-06-01

    Image population analysis is the class of statistical methods that plays a central role in understanding the development, evolution, and disease of a population. However, these techniques often require excessive computational power and memory that are compounded with a large number of volumetric inputs. Restricted access to supercomputing power limits its influence in general research and practical applications. In this paper we introduce ISP, an Image-Set Processing streaming framework that harnesses the processing power of commodity heterogeneous CPU/GPU systems and attempts to solve this computational problem. In ISP, we introduce specially designed streaming algorithms and data structures that provide an optimal solution for out-of-core multiimage processing problems both in terms of memory usage and computational efficiency. ISP makes use of the asynchronous execution mechanism supported by parallel heterogeneous systems to efficiently hide the inherent latency of the processing pipeline of out-of-core approaches. Consequently, with computationally intensive problems, the ISP out-of-core solution can achieve the same performance as the in-core solution. We demonstrate the efficiency of the ISP framework on synthetic and real datasets. PMID:22291156
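    The idea of hiding load latency behind computation, which this abstract attributes to asynchronous execution, can be illustrated with a generic prefetching loop. The sketch below is not the ISP framework or its algorithms; `stream_chunks`, `load_chunk`, and the buffer depth are hypothetical names and values chosen for the example, and a background thread stands in for asynchronous I/O or device transfers.

```python
import queue
import threading

def stream_chunks(load_chunk, n_chunks, depth=2):
    """Yield chunks in order while a background thread loads the next ones."""
    buffer = queue.Queue(maxsize=depth)   # bounded: caps memory in flight
    sentinel = object()

    def producer():
        try:
            for index in range(n_chunks):
                buffer.put(load_chunk(index))   # blocks when buffer is full
        finally:
            buffer.put(sentinel)                # always signal end of stream

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is sentinel:
            return
        yield item

def load_chunk(index):
    # Hypothetical loader: pretends to read four values from slow storage.
    return list(range(index * 4, index * 4 + 4))

total = 0
for chunk in stream_chunks(load_chunk, n_chunks=5):
    total += sum(chunk)   # processing overlaps with loading of later chunks
```

    The bounded queue is the design choice that matters here: only `depth` chunks are resident at once, so the working set stays small no matter how large the full dataset is.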

  16. ISP: an optimal out-of-core image-set processing streaming architecture for parallel heterogeneous systems.

    PubMed

    Ha, Linh Khanh; Krüger, Jens; Dihl Comba, João Luiz; Silva, Cláudio T; Joshi, Sarang

    2012-06-01

    Image population analysis is the class of statistical methods that plays a central role in understanding the development, evolution, and disease of a population. However, these techniques often require excessive computational power and memory that are compounded with a large number of volumetric inputs. Restricted access to supercomputing power limits its influence in general research and practical applications. In this paper we introduce ISP, an Image-Set Processing streaming framework that harnesses the processing power of commodity heterogeneous CPU/GPU systems and attempts to solve this computational problem. In ISP, we introduce specially designed streaming algorithms and data structures that provide an optimal solution for out-of-core multiimage processing problems both in terms of memory usage and computational efficiency. ISP makes use of the asynchronous execution mechanism supported by parallel heterogeneous systems to efficiently hide the inherent latency of the processing pipeline of out-of-core approaches. Consequently, with computationally intensive problems, the ISP out-of-core solution can achieve the same performance as the in-core solution. We demonstrate the efficiency of the ISP framework on synthetic and real datasets.

  17. Repeated Parallel Evolution of Parental Care Strategies within Xenotilapia, a Genus of Cichlid Fishes from Lake Tanganyika

    PubMed Central

    Kidd, Michael R.; Duftner, Nina; Koblmüller, Stephan; Sturmbauer, Christian; Hofmann, Hans A.

    2012-01-01

    The factors promoting the evolution of parental care strategies have been extensively studied in experiment and theory. However, most attempts to examine parental care in an evolutionary context have evaluated broad taxonomic categories. The explosive and recent diversifications of East African cichlid fishes offer exceptional opportunities to study the evolution of various life history traits based on species-level phylogenies. The Xenotilapia lineage within the endemic Lake Tanganyika cichlid tribe Ectodini comprises species that display either biparental or maternal only brood care and hence offers a unique opportunity to study the evolution of distinct parental care strategies in a phylogenetic framework. In order to reconstruct the evolutionary relationships among 16 species of this lineage we scored 2,478 Amplified Fragment Length Polymorphisms (AFLPs) across the genome. We find that the Ectodini genus Enantiopus is embedded within the genus Xenotilapia and that during 2.5 to 3 million years of evolution within the Xenotilapia clade there have been 3–5 transitions from maternal only to biparental care. While most previous models suggest that uniparental care (maternal or paternal) arose from biparental care, we conclude from our species-level analysis that the evolution of parental care strategies is not only remarkably fast, but much more labile than previously expected. PMID:22347454

  18. Characterization and monitoring of subsurface processes using parallel computing and electrical resistivity imaging

    SciTech Connect

    Johnson, Timothy C.; Truex, Michael J.; Wellman, Dawn M.; Marble, Justin

    2011-12-01

    This newsletter discusses recent advancements in subsurface resistivity characterization and monitoring capabilities. The BC Cribs field desiccation treatability test resistivity monitoring data are used as an example to demonstrate near-real time 3D subsurface imaging capabilities. Electrical resistivity tomography (ERT) is a method of imaging the electrical resistivity distribution of the subsurface. An ERT data collection system consists of an array of electrodes, deployed on the ground surface or within boreholes, that are connected to a control unit which can access each electrode independently (Figure 1). A single measurement is collected by injecting current across a pair of current injection electrodes (source and sink), and measuring the resulting potential generated across a pair of potential measurement electrodes (positive and negative). An ERT data set is generated by collecting many such measurements using strategically selected current and potential electrode pairs. This data set is then processed using an inversion algorithm, which reconstructs an estimate (or image) of the electrical conductivity (i.e. the inverse of resistivity) distribution that gave rise to the measured data.
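    A single four-electrode measurement of the kind described above can be worked through for the simplest textbook case. The sketch below assumes point electrodes on the surface of a homogeneous half-space, which the newsletter does not state; the function names and numerical values are hypothetical. Under that assumption the measured potential difference is V = rho * I / K, with geometric factor K = 2*pi / (1/AM - 1/BM - 1/AN + 1/BN), where A and B are the current electrodes and M and N are the potential electrodes.

```python
import math

def geometric_factor(a, b, m, n):
    """Geometric factor K (metres) for collinear surface electrodes.

    a, b: positions of the current source and sink.
    m, n: positions of the positive and negative potential electrodes.
    """
    am, bm = abs(m - a), abs(m - b)
    an, bn = abs(n - a), abs(n - b)
    return 2.0 * math.pi / (1.0 / am - 1.0 / bm - 1.0 / an + 1.0 / bn)

def forward_voltage(resistivity, current, a, b, m, n):
    # Potential difference (V) over a homogeneous half-space (assumed model).
    return resistivity * current / geometric_factor(a, b, m, n)

def apparent_resistivity(voltage, current, a, b, m, n):
    # Resistivity (ohm-m) of the half-space that would explain the reading.
    return geometric_factor(a, b, m, n) * voltage / current

# Wenner layout A-M-N-B with equal spacing (illustrative values only).
spacing = 2.0
a, m, n, b = 0.0, spacing, 2 * spacing, 3 * spacing
k = geometric_factor(a, b, m, n)                 # equals 2*pi*spacing here
v = forward_voltage(100.0, 0.5, a, b, m, n)      # 100 ohm-m ground, 0.5 A
rho_a = apparent_resistivity(v, 0.5, a, b, m, n)
conductivity = 1.0 / rho_a                       # quantity the inversion images
```

    A real inversion replaces the single homogeneous value with a spatially varying model and adjusts it until the predicted voltages match the whole data set.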

  19. Fabrication process and electrical characterization of direct current parallel micro-discharges in helium

    NASA Astrophysics Data System (ADS)

    Mandra, M.; Dussart, R.; Lee, J.-B.; Goeckner, M.; Dufour, T.; Lefaucheux, P.; Ranson, P.; Overzet, L.

    2007-10-01

    Micro Hollow Cathode Discharges (MHCD) have been fabricated. They are round holes through 250 μm or 500 μm thick Nickel-Alumina-Nickel surfaces. The base surfaces are constructed from 7.5 × 7.5 cm alumina wafers, which are vacuum baked, then coated with chromium and copper seed layers, and finally patterned. Nickel film, 5-6 μm thick, is then deposited on either side of the alumina wafer by electroplating. Single and multi cavity micro discharges are then laser drilled with diameters ranging from 130 μm to 300 μm and spacing between the cavities ranging from 245 μm to 315 μm. Breakdown vs. pressure measurements show that smaller diameter cavities (130 μm) have higher breakdown voltages than cavities with larger diameter (300 μm). In addition, the difference between the breakdown voltage and the operating voltage is substantially larger. Current-voltage measurements for single hole MHCD devices indicate that they operate in the normal glow regime with decreasing discharge voltage as discharge current is increased.

  20. Supernova Emulators: Connecting Massively Parallel SN Ia Radiative Transfer Simulations to Data with Gaussian Processes

    NASA Astrophysics Data System (ADS)

    Goldstein, Daniel; Thomas, Rollin; Kasen, Daniel

    2015-01-01

    Collaboration between the type Ia supernova (SN Ia) modeling and observation communities hinges on our ability to directly connect simulations to data. Here we introduce supernova emulation, a method for facilitating such a connection. Emulation allows us to instantaneously predict the observables (light curves, spectra, spectral time series) generated by arbitrary SN Ia radiative transfer simulations, with estimates of prediction error. Emulators learn the mapping between physically meaningful simulation inputs and the resulting synthetic observables from a training set of simulation input-output pairs. In our emulation framework, we model PCA-decomposed representations of simulated observables as an ensemble of Gaussian Processes. As a proof of concept, we train a bolometric light curve (BLC) emulator on a grid of 400 simulation inputs and BLCs synthesized with the publicly available, gray, time-dependent Monte Carlo expanding atmospheres code, SMOKE. We emulate SMOKE simulations evaluated at a set of 100 out-of-sample input parameters, and achieve excellent agreement between the emulator predictions and the simulated BLCs. In addition to predicting simulation outputs, emulators allow us to infer the regions of simulation input parameter space that correspond to observed SN Ia light curves and spectra. We present a Bayesian framework for solving this inverse problem using Markov Chain Monte Carlo sampling. We fit published bolometric light curves with our emulator and obtain reconstructed masses (nickel mass, total ejecta mass) in agreement with reconstructions from semi-analytic models. We discuss applications of emulation to supernova cosmology and physics, including how emulators can be used to identify and quantify astrophysical sources of systematic error affecting SNe Ia as distance indicators for cosmology.
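    The emulation recipe in this abstract (PCA-decompose the simulated observables, then model the PCA coefficients with Gaussian Processes) can be sketched in a few lines. Everything specific below is an assumption for illustration: `toy_simulator` is a hypothetical stand-in and not SMOKE, the kernel, length scale, and number of components are arbitrary choices, and only the predictive mean is computed, so the prediction-error estimates mentioned above are omitted.

```python
import numpy as np

def toy_simulator(params, times):
    # Hypothetical stand-in for a radiative transfer code (NOT SMOKE):
    # a Gaussian-shaped "light curve" set by a peak height and a width.
    peak, width = params
    return peak * np.exp(-0.5 * ((times - 20.0) / width) ** 2)

def rbf_kernel(xa, xb, length_scale):
    # Squared-exponential covariance between two sets of input points.
    diff = xa[:, None, :] - xb[None, :, :]
    return np.exp(-0.5 * np.sum(diff ** 2, axis=2) / length_scale ** 2)

class PcaGpEmulator:
    def __init__(self, n_components=6, length_scale=0.3, jitter=1e-6):
        self.n_components = n_components
        self.length_scale = length_scale
        self.jitter = jitter

    def fit(self, inputs, outputs):
        self.inputs = inputs
        self.mean = outputs.mean(axis=0)
        centred = outputs - self.mean
        # PCA via SVD: rows of vt are the principal directions.
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        self.basis = vt[: self.n_components]
        coeffs = centred @ self.basis.T
        # One GP per PCA coefficient, all sharing the same kernel matrix.
        cov = rbf_kernel(inputs, inputs, self.length_scale)
        cov += self.jitter * np.eye(len(inputs))
        self.weights = np.linalg.solve(cov, coeffs)
        return self

    def predict(self, new_inputs):
        # GP predictive mean for each coefficient, mapped back through PCA.
        cross = rbf_kernel(new_inputs, self.inputs, self.length_scale)
        return self.mean + (cross @ self.weights) @ self.basis

times = np.linspace(0.0, 60.0, 61)
lower, upper = np.array([0.5, 5.0]), np.array([1.5, 15.0])
unit = np.linspace(0.0, 1.0, 20)
grid = np.array([[u, v] for u in unit for v in unit])   # 400 scaled inputs
curves = np.array([toy_simulator(lower + g * (upper - lower), times)
                   for g in grid])
emulator = PcaGpEmulator().fit(grid, curves)

test_unit = np.array([[0.53, 0.47]])                    # out-of-sample point
predicted = emulator.predict(test_unit)[0]
expected = toy_simulator(lower + test_unit[0] * (upper - lower), times)
max_abs_error = float(np.max(np.abs(predicted - expected)))
```

    Inputs are rescaled to the unit square before fitting so that a single length scale is meaningful for both parameters; a fuller treatment would fit the kernel hyperparameters and report the predictive variance as well.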

  1. Parallel Information Processing (Image Transmission Via Fiber Bundle and Multimode Fiber)

    NASA Technical Reports Server (NTRS)

    Kukhtarev, Nicholai

    2003-01-01

    Growing demand for visual, user-friendly representation of information inspires the search for new methods of image transmission. Currently used in-series (sequential) methods of information processing are inherently slow and are designed mainly for transmission of one- or two-dimensional arrays of data. Conventional transmission of data by fibers requires many fibers with an array of laser diodes and photodetectors. In practice, fiber bundles are also used for transmission of images. The image is formed on the entrance surface of the fiber-optic bundle, and each fiber transmits the incident image to the exit surface. Since the fibers do not preserve phase, only a 2D intensity distribution can be transmitted in this way. Each single-mode fiber transmits only one pixel of an image. Multimode fibers may also be used, so that each mode represents a different pixel element. Direct transmission of an image through a multimode fiber is hindered by mode scrambling and phase randomization. To overcome these obstacles, wavelength- and time-division multiplexing have been used, with each pixel transmitted on a separate wavelength or time interval. Phase-conjugate techniques have also been tested, but only in an impractical scheme in which the reconstructed image returns to the fiber input end. Another method of three-dimensional imaging over single-mode fibers was demonstrated using laser light of reduced spatial coherence. Coherence encoding, needed for transmission of images by this method, was realized with a grating interferometer or with the help of an acousto-optic deflector. We suggest a simple, practical holographic method of image transmission over a single multimode fiber or over a fiber bundle with coherent light, using filtering by holographic optical elements. Originally this method was successfully tested for a single multimode fiber. In this research we have modified the holographic method for transmission of laser-illuminated images over a commercially available fiber bundle (fiber endoscope, or

  2. Bilingual Strategies from the Perspective of a Processing Model

    ERIC Educational Resources Information Center

    Hartsuiker, Robert J.

    2013-01-01

    Muysken argues for four general "strategies" that characterize language contact phenomena across several levels of description. These strategies are (A) maximize structural coherence of the first language (L1); (B) maximize structural coherence of the second language (L2); (C) match between L1 and L2 patterns where possible; and (D) use…

  3. Explicitly Teaching Struggling Writers: Strategies for Mastering the Writing Process

    ERIC Educational Resources Information Center

    Graham, Steve; Harris, Karen R.; MacArthur, Charles

    2006-01-01

    Students are often asked to write reports for science, history, and other content-area classes. Struggling writers and many of their classmates are unsure about how to plan and write reports. This article presents a strategy for planning and writing reports and describes how a general and special education teacher team-taught this strategy to a…

  4. Non-CAR resists and advanced materials for Massively Parallel E-Beam Direct Write process integration

    NASA Astrophysics Data System (ADS)

    Pourteau, Marie-Line; Servin, Isabelle; Lepinay, Kévin; Essomba, Cyrille; Dal'Zotto, Bernard; Pradelles, Jonathan; Lattard, Ludovic; Brandt, Pieter; Wieland, Marco

    2016-03-01

    The emerging Massively Parallel-Electron Beam Direct Write (MP-EBDW) is an attractive high resolution high throughput lithography technology. As previously shown, Chemically Amplified Resists (CARs) meet process/integration specifications in terms of dose-to-size, resolution, contrast, and energy latitude. However, they are still limited by their line width roughness. To overcome this issue, we tested an alternative advanced non-CAR and showed it brings a substantial gain in sensitivity compared to CAR. We also implemented and assessed in-line post-lithographic treatments for roughness mitigation. For outgassing-reduction purpose, a top-coat layer is added to the total process stack. A new generation top-coat was tested and showed improved printing performances compared to the previous product, especially avoiding dark erosion: SEM cross-section showed a straight pattern profile. A spin-coatable charge dissipation layer based on conductive polyaniline has also been tested for conductivity and lithographic performances, and compatibility experiments revealed that the underlying resist type has to be carefully chosen when using this product. Finally, the Process Of Reference (POR) trilayer stack defined for 5 kV multi-e-beam lithography was successfully etched with well opened and straight patterns, and no lithography-etch bias.

  5. Fear-potentiated startle processing in humans: Parallel fMRI and orbicularis EMG assessment during cue conditioning and extinction.

    PubMed

    Lindner, Katja; Neubert, Jörg; Pfannmöller, Jörg; Lotze, Martin; Hamm, Alfons O; Wendt, Julia

    2015-12-01

    Studying neural networks and behavioral indices such as potentiated startle responses during fear conditioning has a long tradition in both animal and human research. However, most of the studies in humans do not link startle potentiation and neural activity during fear acquisition and extinction. Therefore, we examined startle blink responses measured with electromyography (EMG) and brain activity measured with functional MRI simultaneously during differential conditioning. Furthermore, we combined these behavioral fear indices with brain network activity by analyzing the brain activity evoked by the startle probe stimulus presented during conditioned visual threat and safety cues as well as in the absence of visual stimulation. In line with previous research, we found a fear-induced potentiation of the startle blink responses when elicited during a conditioned threat stimulus and a rapid decline of amygdala activity after an initial differentiation of threat and safety cues in early acquisition trials. Increased activation during processing of threat cues was also found in the anterior insula, the anterior cingulate cortex (ACC), and the periaqueductal gray (PAG). More importantly, our results depict an increase of brain activity to probes presented during threatening in comparison to safety cues indicating an involvement of the anterior insula, the ACC, the thalamus, and the PAG in fear-potentiated startle processing during early extinction trials. Our study underlines that parallel assessment of fear-potentiated startle in fMRI paradigms can provide a helpful method to investigate common and distinct processing pathways in humans and animals and, thus, contributes to translational research.

  6. Improvement of Moist and Radiative Processes in Highly Parallel Atmospheric General Circulation Models: Validation and Development

    SciTech Connect

    Frank, William M.; Hack, James J.; Kiehl, Jeffrey T.

    1997-02-24

    Research on designing an integrated moist process parameterization package was carried out. This work began with a study that coupled an ensemble of cloud models to a boundary layer model to examine the feasibility of such a methodology for linking boundary layer and cumulus parameterization schemes. The approach proved feasible, prompting research to design and evaluate a coupled parameterization package for GCMs. This research contributed to the development of an Integrated Cumulus Ensemble-Turbulence (ICET) parameterization package. This package incorporates a higher-order turbulence boundary layer that feeds information concerning updraft properties and the variances of temperature and water vapor to the cloud parameterizations. The cumulus ensemble model has been developed, and initial sensitivity tests have been performed in the single column model (SCM) version of CCM2. It is currently being coupled to a convective wake/gust front model. The major function of the convective wake/gust front model is to simulate the partitioning of the boundary layer into disturbed and undisturbed regions. A second function of this model is to predict the nonlinear enhancement of surface-to-air sensible heat and moisture fluxes that occurs in convective regimes due to correlations between winds and anomalously cold, dry air from downdrafts in the gust front region. The third function of the convective wake/gust front model is to predict the amount of undisturbed boundary layer air lifted by the leading edge of the wake and the height to which this air is lifted. The development of the wake/gust front model has been completed, and it has done well in initial testing as a stand-alone component. The current task, to be completed by the end of the funding period, is to tie the wake model to a cumulus ensemble model and to install both components into the single column model version of CCM3 for evaluation.
Another area of parametrization research has been focused on the representation

  7. A New Application of Parallel Synthesis Strategy for Discovery of Amide-Linked Small Molecules as Potent Chondroprotective Agents in TNF-α-Stimulated Chondrocytes

    PubMed Central

    Lee, Chia-Chung; Lo, Yang; Ho, Ling-Jun; Lai, Jenn-Haung; Lien, Shiu-Bii; Lin, Leou-Chyr; Chen, Chun-Liang; Chen, Tsung-Chih; Liu, Feng-Cheng; Huang, Hsu-Shan

    2016-01-01

    As part of an effort to profile potential therapeutics for the treatment of inflammation-related diseases, a diverse set of amide-linked small molecules was synthesized using a parallel synthesis strategy. These new compounds were also evaluated for their inhibitory effects on nitric oxide (NO) by using tumor necrosis factor alpha (TNF-α)-induced inflammatory responses in chondrocytes. Among the tested compounds, N-(3-chloro-4-fluorophenyl)-2-hydroxybenzamide (HS-Ck) was the most potent inhibitor of NO production and inducible nitric oxide synthase (iNOS) expression in TNF-α-stimulated chondrocytes. In addition, our biological results indicated that HS-Ck might suppress the expression levels of iNOS and matrix metalloproteinase-13 (MMP-13) activities by downregulating the activation of the nuclear factor kappa B (NF-κB) and signal transducer and activator of transcription 3 (STAT-3) transcription factors. Therefore, parallel synthesis was successfully used to develop a new class of potential anti-inflammatory agents as chondroprotective candidates for the treatment of osteoarthritis. PMID:26963090

  8. A New Application of Parallel Synthesis Strategy for Discovery of Amide-Linked Small Molecules as Potent Chondroprotective Agents in TNF-α-Stimulated Chondrocytes.

    PubMed

    Lee, Chia-Chung; Lo, Yang; Ho, Ling-Jun; Lai, Jenn-Haung; Lien, Shiu-Bii; Lin, Leou-Chyr; Chen, Chun-Liang; Chen, Tsung-Chih; Liu, Feng-Cheng; Huang, Hsu-Shan

    2016-01-01

    As part of an effort to profile potential therapeutics for the treatment of inflammation-related diseases, a diverse set of amide-linked small molecules was synthesized using a parallel synthesis strategy. These new compounds were also evaluated for their inhibitory effects on nitric oxide (NO) by using tumor necrosis factor alpha (TNF-α)-induced inflammatory responses in chondrocytes. Among the tested compounds, N-(3-chloro-4-fluorophenyl)-2-hydroxybenzamide (HS-Ck) was the most potent inhibitor of NO production and inducible nitric oxide synthase (iNOS) expression in TNF-α-stimulated chondrocytes. In addition, our biological results indicated that HS-Ck might suppress the expression levels of iNOS and matrix metalloproteinase-13 (MMP-13) activities by downregulating the activation of the nuclear factor kappa B (NF-κB) and signal transducer and activator of transcription 3 (STAT-3) transcription factors. Therefore, parallel synthesis was successfully used to develop a new class of potential anti-inflammatory agents as chondroprotective candidates for the treatment of osteoarthritis. PMID:26963090

  9. An extension of the extended parallel process model (EPPM) in television health news: the influence of health consciousness on individual message processing and acceptance.

    PubMed

    Hong, Hyehyun

    2011-06-01

    The purpose of this study is to examine the role of health consciousness in processing TV news that contains potential health threats and preventive recommendations. Based on the extended parallel process model (Witte, 1992), relationships among health consciousness, perceived severity, perceived susceptibility, perceived response efficacy, perceived self-efficacy, and message acceptance/rejection were hypothesized. Responses collected from 175 participants after viewing four TV health news stories were analyzed using bootstrapping analysis (Preacher & Hayes, 2008). Results confirmed three mediators (i.e., perceived severity, response efficacy, self-efficacy) in the influence of health consciousness on message acceptance. A negative association found between health consciousness and perceived susceptibility is discussed in relation to characteristics of health-conscious individuals and optimistic bias about health risks. PMID:21416420

  10. Natural language processing in an intelligent writing strategy tutoring system.

    PubMed

    McNamara, Danielle S; Crossley, Scott A; Roscoe, Rod

    2013-06-01

    The Writing Pal is an intelligent tutoring system that provides writing strategy training. A large part of its artificial intelligence resides in the natural language processing algorithms to assess essay quality and guide feedback to students. Because writing is often highly nuanced and subjective, the development of these algorithms must consider a broad array of linguistic, rhetorical, and contextual features. This study assesses the potential for computational indices to predict human ratings of essay quality. Past studies have demonstrated that linguistic indices related to lexical diversity, word frequency, and syntactic complexity are significant predictors of human judgments of essay quality but that indices of cohesion are not. The present study extends prior work by including a larger data sample and an expanded set of indices to assess new lexical, syntactic, cohesion, rhetorical, and reading ease indices. Three models were assessed. The model reported by McNamara, Crossley, and McCarthy (Written Communication 27:57-86, 2010) including three indices of lexical diversity, word frequency, and syntactic complexity accounted for only 6% of the variance in the larger data set. A regression model including the full set of indices examined in prior studies of writing predicted 38% of the variance in human scores of essay quality with 91% adjacent accuracy (i.e., within 1 point). A regression model that also included new indices related to rhetoric and cohesion predicted 44% of the variance with 94% adjacent accuracy. The new indices increased accuracy but, more importantly, afford the means to provide more meaningful feedback in the context of a writing tutoring system.
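
    The "adjacent accuracy" figures above count a prediction as correct when it falls within 1 point of the human rating. A minimal sketch of that metric follows; the function name and the sample scores are hypothetical and are not taken from the study.

```python
def exact_and_adjacent_accuracy(predicted, human, tolerance=1):
    """Return (exact, adjacent) agreement rates between two score lists.

    'Adjacent' means the predicted score is within `tolerance` points
    of the human score (1 point, as in the abstract's definition).
    """
    if len(predicted) != len(human) or not predicted:
        raise ValueError("score lists must be non-empty and of equal length")
    pairs = list(zip(predicted, human))
    exact = sum(p == h for p, h in pairs)
    adjacent = sum(abs(p - h) <= tolerance for p, h in pairs)
    return exact / len(pairs), adjacent / len(pairs)


# Hypothetical essay scores, invented purely for illustration.
model_scores = [3, 4, 2, 5, 4, 1]
human_scores = [3, 5, 4, 5, 3, 1]
exact, adjacent = exact_and_adjacent_accuracy(model_scores, human_scores)
print(f"exact: {exact:.2f}, adjacent: {adjacent:.2f}")
```

    Adjacent accuracy is always at least as high as exact accuracy, which is why it is usually reported alongside the variance explained rather than instead of it.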

  11. Parallel training and testing methods for complex image processing algorithms on distributed, heterogeneous, unreliable, and non-dedicated resources

    NASA Astrophysics Data System (ADS)

    Usamentiaga, Rubén; García, Daniel F.; Molleda, Julio; Sainz, Ignacio; Bulnes, Francisco G.

    2011-01-01

    Advances in the image processing field have brought new methods which are able to perform complex tasks robustly. However, in order to meet constraints on functionality and reliability, imaging application developers often design complex algorithms with many parameters which must be finely tuned for each particular environment. The best approach for tuning these algorithms is to use an automatic training method, but the computational cost of this kind of training method is prohibitive, making it unfeasible even on powerful machines. The same problem arises when designing testing procedures. This work presents methods to train and test complex image processing algorithms in parallel execution environments. The approach proposed in this work is to use existing resources in offices or laboratories, rather than expensive clusters. These resources are typically non-dedicated, heterogeneous and unreliable. The proposed methods have been designed to deal with all these issues. Two methods are proposed: intelligent training based on genetic algorithms and PVM, and a full factorial design based on grid computing which can be used for training or testing. These methods are capable of harnessing the available computational resources, giving more work to more powerful machines, while taking their unreliable nature into account. Both methods have been tested using real applications.
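
    One generic way to give more work to faster machines while tolerating failures is pull-based scheduling: each idle worker takes the next pending task, and a task lost to a failed worker goes back on the queue. The simulation below is only a sketch of that general idea, not the PVM or grid implementation described in the paper; the worker names, task durations, and failure set are hypothetical.

```python
import heapq
from collections import deque


def run_pull_scheduler(tasks, time_per_task, failures=frozenset()):
    """Simulate idle workers pulling tasks from a shared queue.

    time_per_task: {worker: integer time units needed per task}.
    failures: set of (worker, attempt_number) pairs whose task is lost.
    Simplification: a lost task is requeued as soon as it is assigned,
    rather than at the moment the failure would be detected.
    """
    pending = deque(tasks)
    done = {worker: [] for worker in time_per_task}
    attempts = {worker: 0 for worker in time_per_task}
    # Heap of (time at which the worker becomes idle, worker name).
    idle = [(0, worker) for worker in time_per_task]
    heapq.heapify(idle)
    while pending:
        now, worker = heapq.heappop(idle)
        task = pending.popleft()
        attempts[worker] += 1
        if (worker, attempts[worker]) in failures:
            pending.append(task)  # requeue work lost to the failure
        else:
            done[worker].append(task)
        heapq.heappush(idle, (now + time_per_task[worker], worker))
    return done


# Hypothetical pool: "fast" needs 1 time unit per task, "slow" needs 3,
# and the first task handed to "slow" is lost.
done = run_pull_scheduler(range(12), {"fast": 1, "slow": 3},
                          failures={("slow", 1)})
for worker, finished in done.items():
    print(worker, len(finished), finished)
```

    Because workers ask for tasks only when idle, the split follows machine speed automatically, with no need to benchmark the machines in advance.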

  12. Serial and parallel processing in reading: investigating the effects of parafoveal orthographic information on nonisolated word recognition.

    PubMed

    Dare, Natasha; Shillcock, Richard

    2013-01-01

    We present a novel lexical decision task and three boundary paradigm eye-tracking experiments that clarify the picture of parallel processing in word recognition in context. First, we show that lexical decision is facilitated by associated letter information to the left and right of the word, with no apparent hemispheric specificity. Second, we show that parafoveal preview of a repeat of word n at word n + 1 facilitates reading of word n relative to a control condition with an unrelated word at word n + 1. Third, using a version of the boundary paradigm that allowed for a regressive eye movement, we show no parafoveal "postview" effect on reading word n of repeating word n at word n - 1. Fourth, we repeat the second experiment but compare the effects of parafoveal previews consisting of a repeated word n with a transposed central bigram (e.g., caot for coat) and a substituted central bigram (e.g., ceit for coat), showing the latter to have a deleterious effect on processing word n, thereby demonstrating that the parafoveal preview effect is at least orthographic and not purely visual. PMID:22950804

  13. Developing Local Lifelong Guidance Strategies.

    ERIC Educational Resources Information Center

    Watts, A. G.; Hawthorn, Ruth; Hoffbrand, Jill; Jackson, Heather; Spurling, Andrea

    1997-01-01

    Outlines the background, rationale, methodology, and outcomes of developing local lifelong guidance strategies in four geographic areas. Analyzes the main components of the strategies developed and addresses a number of issues relating to the process of strategy development. Explores implications for parallel work in other localities. (RJM)

  14. Parallel processing neural networks

    SciTech Connect

    Zargham, M.

    1988-09-01

    A model for neural networks based on a particular kind of Petri Net has been introduced. The model has been implemented in C and runs on the Sequent Balance 8000 multiprocessor; however, it can be directly ported to different multiprocessor environments. The potential advantages of using Petri Nets include: (1) the overall system is often easier to understand due to the graphical and precise nature of the representation scheme, and (2) the behavior of the system can be analyzed using Petri Net theory. Though the Petri Net is an obvious choice as a basis for the model, the basic Petri Net definition is not adequate to represent the neuronal system. To eliminate certain inadequacies, more information has been added to the Petri Net model. In the model, a token represents either a processor or a post-synaptic potential. Progress through a particular Neural Network is thus graphically depicted in the movement of the processor tokens through the Petri Net.
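
    For readers unfamiliar with Petri Nets, the sketch below shows only the basic firing rule: a transition is enabled when each input place holds enough tokens, and firing moves tokens from input places to output places. It does not reproduce the extended model from the report (which was written in C); the place names and arc weights are hypothetical.

```python
def enabled(marking, transition):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(place, 0) >= weight
               for place, weight in transition["inputs"].items())


def fire(marking, transition):
    """Return the new marking after firing; the input marking is not modified."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, weight in transition["inputs"].items():
        new_marking[place] = new_marking.get(place, 0) - weight
    for place, weight in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking


# Hypothetical neuron: it fires once two post-synaptic potential tokens
# have accumulated and a processor token is available; the processor
# token is returned so it can move on through the net.
neuron_fires = {
    "inputs": {"psp": 2, "processor": 1},
    "outputs": {"axon": 1, "processor": 1},
}
marking = {"psp": 2, "processor": 1}
after = fire(marking, neuron_fires)
print(after)
```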

  15. Transuranic Waste Processing Center (TWPC) Legacy Tank RH-TRU Sludge Processing and Compliance Strategy - 13255

    SciTech Connect

    Rogers, Ben C.; Heacker, Fred K.; Shannon, Christopher; and others

    2013-07-01

    the necessary integrated systems to process the accumulated MVST Facilities SL inventory at the TWPC thus enabling safe and effective disposal of the waste. This BCP does not include work to support current MVST Facility Surveillance and Maintenance programs or the ORNL Building 3019 U-233 Disposition project, since they are not currently part of the TWPC prime contract. The purpose of the environmental compliance strategy is to identify the environmental permits and other required regulatory documents necessary for the construction and operation of the SL-PFB at the TWPC, Oak Ridge, TN. The permits and other regulatory documents identified are necessary to comply with the environmental laws and regulations of DOE Orders, and other requirements documented in the SL-PFB, Safety Design Strategy (SDS), SL-A-AD-002, R0 draft, and the Systems, Function and Requirements Document (SFRD), SL-X-AD-002, R1 draft. This compliance strategy is considered a 'living strategy' and it is anticipated that it will be revised as design progresses and more detail is known. The design basis on which this environmental permitting and compliance strategy is based is the Wastren Advantage, Inc., (WAI), TWPC, SL-PFB (WAI-BL-B.01.06) baseline. (authors)

  16. GEOEYE-1 Satellite Stereo-Pair DEM Extraction Using Scale-Invariant Feature Transform on a Parallel Processing Platform

    NASA Astrophysics Data System (ADS)

    Daliakopoulos, Ioannis; Tsanis, Ioannis

    2013-04-01

    A module for Digital Elevation Model (DEM) extraction from Very High Resolution (VHR) satellite stereo-pair imagery was developed. A procedure for parallel processing of cascading image tiles is used for handling the large dataset requirements of VHR satellite imagery. The Scale-Invariant Feature Transform (SIFT) algorithm is used to detect potentially homogeneous features in the members of the stereo-pair. The resulting feature pairs are filtered using the RANdom SAmple Consensus (RANSAC) algorithm with a variable distance threshold. Finally, homogeneous pairs are converted to point cloud ground coordinates for DEM generation. The module is tested with a 0.5 m × 0.5 m GeoEye-1 stereo-pair acquired over an area of 25 km² on the island of Crete, Greece. A sensitivity analysis is performed to determine the optimum module parameterization. The criterion of average point spacing irregularity is introduced to evaluate the quality and assess the effective resolution of the produced DEMs. The resulting 1.5 m × 1.5 m DEM has superior detail over the 2 m and 5 m DEMs used as reference and yields a Root Mean Square Error (RMSE) of about 1 m compared to ground truth measurements.

  17. Photon counting imaging with an electron-bombarded CCD: towards a parallel-processing photoelectronic time-to-amplitude converter.

    PubMed

    Hirvonen, Liisa M; Jiggins, Stephen; Sergent, Nicolas; Zanda, Gianmarco; Suhling, Klaus

    2014-12-01

    We have used an electron-bombarded CCD for optical photon counting imaging. The photon event pulse height distribution was found to be linearly dependent on the gain voltage. We propose on this basis that a gain voltage sweep during exposure in an electron-bombarded sensor would allow photon arrival time determination with sub-frame exposure time resolution. This effectively uses an electron-bombarded sensor as a parallel-processing photoelectronic time-to-amplitude converter, or a two-dimensional photon counting streak camera. Several applications that require timing of photon arrival, including Fluorescence Lifetime Imaging Microscopy, may benefit from such an approach. A simulation of a voltage sweep performed with experimental data collected with different acceleration voltages validates the principle of this approach. Moreover, photon event centroiding was performed and a hybrid 50% Gaussian/Centre of Gravity + 50% Hyperbolic cosine centroiding algorithm was found to yield the lowest fixed pattern noise. Finally, the camera was mounted on a fluorescence microscope to image F-actin filaments stained with the fluorescent dye Alexa 488 in fixed cells.
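
    Photon event centroiding estimates the sub-pixel position of each event from the pixel values around its peak. The sketch below shows only the plain centre-of-gravity estimate, not the hybrid Gaussian/centre-of-gravity plus hyperbolic cosine algorithm reported above; the 3 x 3 window values are hypothetical.

```python
def centre_of_gravity(window):
    """Return the intensity-weighted (x, y) centroid of a pixel window.

    `window` is a list of rows; x is the column index and y the row
    index, both measured from the top-left pixel of the window.
    """
    total = sum(sum(row) for row in window)
    if total <= 0:
        raise ValueError("window must contain positive signal")
    x = sum(col * value for row in window
            for col, value in enumerate(row)) / total
    y = sum(r * value for r, row in enumerate(window)
            for value in row) / total
    return x, y


# Hypothetical photon event: peak in the centre pixel, with more
# charge spilling to the right than to the left.
event = [
    [0, 1, 0],
    [1, 4, 3],
    [0, 1, 0],
]
x, y = centre_of_gravity(event)
print(f"centroid at x={x:.2f}, y={y:.2f}")
```

    The centroid lands to the right of the central pixel (x > 1) because of the asymmetric charge spread, which is the sub-pixel information that centroiding recovers.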

  19. GPUDePiCt: A Parallel Implementation of a Clustering Algorithm for Computing Degenerate Primers on Graphics Processing Units.

    PubMed

    Cickovski, Trevor; Flor, Tiffany; Irving-Sachs, Galen; Novikov, Philip; Parda, James; Narasimhan, Giri

    2015-01-01

    In order to make multiple copies of a target sequence in the laboratory, the technique of Polymerase Chain Reaction (PCR) requires the design of "primers", which are short fragments of nucleotides complementary to the flanking regions of the target sequence. If the same primer is to amplify multiple closely related target sequences, then it is necessary to make the primer "degenerate", which allows it to hybridize to target sequences with a limited amount of variability that may have been caused by mutations. However, the PCR technique can only allow a limited amount of degeneracy, and therefore the design of degenerate primers requires the identification of reasonably well-conserved regions in the input sequences. We take an existing algorithm for designing degenerate primers that is based on clustering and parallelize it in a web-accessible software package GPUDePiCt, using a shared memory model and the computing power of Graphics Processing Units (GPUs). We test our implementation on large sets of aligned sequences from the human genome and show a multi-fold speedup for clustering using our hybrid GPU/CPU implementation over a pure CPU approach for these sequences, which consist of more than 7,500 nucleotides. We also demonstrate that this speedup is consistent over larger numbers and longer lengths of aligned sequences. PMID:26357230
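
    The "degeneracy" of a primer is the number of distinct concrete sequences it stands for, which is why only well-conserved regions are usable. The sketch below illustrates the concept with the standard IUPAC nucleotide codes; it is not the clustering algorithm that GPUDePiCt parallelizes, and the aligned sequences are hypothetical. Alignment gaps are not handled.

```python
# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
BASES_TO_CODE = {frozenset(bases): code for code, bases in IUPAC.items()}


def degenerate_consensus(aligned):
    """Build one degenerate primer covering every aligned sequence."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    # Each alignment column becomes the IUPAC code for the set of
    # bases observed in that column.
    return "".join(BASES_TO_CODE[frozenset(column)]
                   for column in zip(*aligned))


def degeneracy(primer):
    """Number of concrete sequences the degenerate primer represents."""
    result = 1
    for code in primer:
        result *= len(IUPAC[code])
    return result


# Hypothetical aligned target regions, invented for illustration.
targets = ["ACGTAC", "ACATAC", "ATGTAC"]
primer = degenerate_consensus(targets)
print(primer, degeneracy(primer))
```

    Each variable column multiplies the degeneracy, so a few poorly conserved positions quickly push a primer past what PCR tolerates.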

  20. Influence of data volume and EPC on process window in massively parallel e-beam direct write

    NASA Astrophysics Data System (ADS)

    Lin, Shy-Jay; Liu, Pei-Yi; Chen, Cheng-Hung; Wang, Wen-Chuan; Shin, Jaw-Jung; Lin, Burn Jeng; McCord, Mark A.; Shriyan, Sameet K.

    2013-03-01

    Multiple e-beam direct write lithography (MEBDW), using >10,000 e-beams writing in parallel, proposed by MAPPER, KLA-Tencor, and IMS, is a potential solution for 20-nm half-pitch and beyond. The raster scan in MEBDW makes bitmap its data format. Data handling becomes indispensable since bitmap needs a huge data volume due to the fine pixel size needed to keep the CD accuracy after e-beam proximity correction (EPC). In fact, in 10,000-beam MEBDW, for a 10 WPH tool of 1-nm pixel size and 1-bit gray level, the aggregated data transmission rate would be up to 196.3 terabits per second (Tbps), requiring 19,630 fibers transmitting 10 Gbps in each fiber. The data rate per beam would be <20 Gbps. Hence data reduction using bigger pixel size, fewer gray levels to achieve sub-nm EPC accuracy, and data truncation have been extensively studied. In this paper, process window assessment through Exposure-Defocus (E-D) Forest to quantitatively characterize the data truncation before and after EPC is reported. REBL electron optics, electron scattering in resist, and resist acid diffusion are considered, to construct the E-D Forest and to analyze the imaging performance of the most representative layers and patterns, such as critical line/space and hole layers with minimum pitch, cutting layers, and implant layers, for the 10-nm and 7-nm nodes.
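
    The fiber count and per-beam rate quoted above can be checked with a few lines of arithmetic. The sketch below assumes a 300 mm wafer whose full circular area is written; the abstract does not state the wafer size, so that value is an assumption. Under it, the aggregate rate comes to roughly 196 Tbps, consistent with about 19,630 fibers at 10 Gbps and a per-beam rate just under 20 Gbps.

```python
import math

# Values taken from the abstract.
PIXEL_NM = 1            # pixel size in nanometres
BITS_PER_PIXEL = 1      # 1-bit gray level
WAFERS_PER_HOUR = 10    # tool throughput (WPH)
BEAMS = 10_000          # number of parallel e-beams
FIBER_GBPS = 10         # capacity of each optical fiber

# Assumption (not stated in the abstract): 300 mm wafer, fully exposed.
WAFER_DIAMETER_MM = 300
NM_PER_MM = 1_000_000

radius_nm = WAFER_DIAMETER_MM / 2 * NM_PER_MM
pixels_per_wafer = math.pi * radius_nm ** 2 / PIXEL_NM ** 2
bits_per_second = pixels_per_wafer * BITS_PER_PIXEL * WAFERS_PER_HOUR / 3600

aggregate_tbps = bits_per_second / 1e12
fibers_needed = bits_per_second / (FIBER_GBPS * 1e9)
per_beam_gbps = bits_per_second / BEAMS / 1e9

print(f"aggregate rate: {aggregate_tbps:.1f} Tbps")
print(f"fibers at {FIBER_GBPS} Gbps: {fibers_needed:,.0f}")
print(f"rate per beam: {per_beam_gbps:.2f} Gbps")
```

    Doubling the pixel size cuts the pixel count, and hence the data rate, by a factor of four, which is the motivation for the data-reduction studies mentioned in the abstract.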