A Parallel Particle Swarm Optimization Algorithm Accelerated by Asynchronous Evaluations
NASA Technical Reports Server (NTRS)
Venter, Gerhard; Sobieszczanski-Sobieski, Jaroslaw
2005-01-01
A parallel Particle Swarm Optimization (PSO) algorithm is presented. Particle swarm optimization is a fairly recent addition to the family of non-gradient based, probabilistic search algorithms that is based on a simplified social model and is closely tied to swarming theory. Although PSO algorithms present several attractive properties to the designer, they are plagued by high computational cost as measured by elapsed time. One approach to reduce the elapsed time is to make use of coarse-grained parallelization to evaluate the design points. Previous parallel PSO algorithms were mostly implemented in a synchronous manner, where all design points within a design iteration are evaluated before the next iteration is started. This approach leads to poor parallel speedup in cases where a heterogeneous parallel environment is used and/or where the analysis time depends on the design point being analyzed. This paper introduces an asynchronous parallel PSO algorithm that greatly improves the parallel efficiency. The asynchronous algorithm is benchmarked on a cluster assembled of Apple Macintosh G5 desktop computers, using the multi-disciplinary optimization of a typical transport aircraft wing as an example.
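The difference between synchronous and asynchronous evaluation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sphere test objective, the thread pool, and the coefficients `w`, `c1`, `c2` are all assumptions chosen for illustration. The point it shows is that a particle is updated and resubmitted as soon as its own evaluation returns, without waiting for the rest of the swarm.

```python
import random
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def sphere(x):
    """Hypothetical cheap objective standing in for an expensive analysis."""
    return sum(xi * xi for xi in x)


def async_pso(f, dim=2, n_particles=8, max_evals=200, workers=4, seed=0):
    """Asynchronous PSO sketch; assumes max_evals >= n_particles."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration coefficients
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest_x = [p[:] for p in pos]
    pbest_f = [float("inf")] * n_particles
    gbest_x, gbest_f = pos[0][:], float("inf")
    submitted = 0

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {}  # future -> particle index
        for i in range(n_particles):
            pending[pool.submit(f, pos[i][:])] = i
            submitted += 1

        while pending:
            # Return as soon as ANY evaluation finishes (no iteration barrier).
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                i = pending.pop(fut)
                val = fut.result()
                if val < pbest_f[i]:
                    pbest_f[i], pbest_x[i] = val, pos[i][:]
                if val < gbest_f:
                    gbest_f, gbest_x = val, pos[i][:]
                if submitted < max_evals:
                    # Update only this particle, using the best values known now.
                    for d in range(dim):
                        r1, r2 = rng.random(), rng.random()
                        vel[i][d] = (w * vel[i][d]
                                     + c1 * r1 * (pbest_x[i][d] - pos[i][d])
                                     + c2 * r2 * (gbest_x[d] - pos[i][d]))
                        pos[i][d] += vel[i][d]
                    pending[pool.submit(f, pos[i][:])] = i
                    submitted += 1
    return gbest_x, gbest_f


if __name__ == "__main__":
    print(async_pso(sphere))
```

A synchronous variant would instead wait for all `n_particles` futures before updating any particle, so one slow evaluation would idle every other worker.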
Parallel, Asynchronous Executive (PAX): System concepts, facilities, and architecture
NASA Technical Reports Server (NTRS)
Jones, W. H.
1983-01-01
The Parallel, Asynchronous Executive (PAX) is a software operating system simulation that allows many computers to work on a single problem at the same time. PAX is currently implemented on a UNIVAC 1100/42 computer system. Independent UNIVAC runstreams are used to simulate independent computers. Data are shared among independent UNIVAC runstreams through shared mass-storage files. PAX has achieved the following: (1) applied several computing processes simultaneously to a single, logically unified problem; (2) resolved most parallel processor conflicts by careful work assignment; (3) resolved by means of worker requests to PAX all conflicts not resolved by work assignment; (4) provided fault isolation and recovery mechanisms to meet the problems of an actual parallel, asynchronous processing machine. Additionally, one real-life problem has been constructed for the PAX environment. This is CASPER, a collection of aerodynamic and structural dynamic problem simulation routines. CASPER is not discussed in this report except to provide examples of parallel-processing techniques.
3D Hybrid Simulations of Interactions of High-Velocity Plasmoids with Obstacles
NASA Astrophysics Data System (ADS)
Omelchenko, Y. A.; Weber, T. E.; Smith, R. J.
2015-11-01
Interactions of fast plasma streams and objects with magnetic obstacles (dipoles, mirrors, etc) lie at the core of many space and laboratory plasma phenomena ranging from magnetoshells and solar wind interactions with planetary magnetospheres to compact fusion plasmas (spheromaks and FRCs) to astrophysics-in-lab experiments. Properly modeling ion kinetic, finite-Larmor radius and Hall effects is essential for describing large-scale plasma dynamics, turbulence and heating in complex magnetic field geometries. Using an asynchronous parallel hybrid code, HYPERS, we conduct 3D hybrid (particle-in-cell ion, fluid electron) simulations of such interactions under realistic conditions that include magnetic flux coils, ion-ion collisions and the Chodura resistivity. HYPERS does not step simulation variables synchronously in time but instead performs time integration by executing asynchronous discrete events: updates of particles and fields carried out as frequently as dictated by local physical time scales. Simulations are compared with data from the MSX experiment which studies the physics of magnetized collisionless shocks through the acceleration and subsequent stagnation of FRC plasmoids against a strong magnetic mirror and flux-conserving boundary.
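The idea of replacing a global time step with asynchronous discrete events can be sketched with a toy problem. This is not the HYPERS algorithm: the decay equation, the `cfl` factor, and all names are assumptions made for illustration. Each cell integrates dy/dt = -k*y with its own step dt = cfl/k, and a priority queue always processes whichever cell is due next, so fast cells are updated more often than slow ones.

```python
import heapq


def run_events(rates, t_end=1.0, cfl=0.1):
    """Event-driven forward-Euler integration with per-cell time steps."""
    y = [1.0] * len(rates)
    updates = [0] * len(rates)
    # Each event is (time the cell will reach after its update, cell index).
    queue = [(cfl / k, i) for i, k in enumerate(rates)]
    heapq.heapify(queue)
    while queue:
        t, i = heapq.heappop(queue)       # earliest pending event
        dt = cfl / rates[i]               # local step set by the local rate
        y[i] += dt * (-rates[i] * y[i])   # forward-Euler update of one cell
        updates[i] += 1
        if t < t_end - 1e-12:             # reschedule until t_end is reached
            heapq.heappush(queue, (t + dt, i))
    return y, updates


if __name__ == "__main__":
    values, counts = run_events([1.0, 10.0])
    print(values, counts)
```

With rates 1 and 10, the fast cell receives ten times as many updates as the slow one, whereas a synchronous scheme would force both to use the smaller step.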
NASA Astrophysics Data System (ADS)
Shoemaker, C. A.; Pang, M.; Akhtar, T.; Bindel, D.
2016-12-01
New parallel surrogate global optimization algorithms are developed and applied to objective functions that are expensive simulations (possibly with multiple local minima). The algorithms can be applied to most geophysical simulations, including those with nonlinear partial differential equations. The optimization does not require simulations be parallelized. Asynchronous (and synchronous) parallel execution is available in the optimization toolbox "pySOT". The parallel algorithms are modified from serial to eliminate fine grained parallelism. The optimization is computed with open source software pySOT, a Surrogate Global Optimization Toolbox that allows the user to pick the type of surrogate (or ensembles), the search procedure on the surrogate, and the type of parallelism (synchronous or asynchronous). pySOT also allows the user to develop new algorithms by modifying parts of the code. In the applications here, the objective function takes up to 30 minutes for one simulation, and serial optimization can take over 200 hours. Results from Yellowstone (NSF) and NCSS (Singapore) supercomputers are given for groundwater contaminant hydrology simulations with applications to model parameter estimation and decontamination management. All results are compared with alternatives. The first results are for optimization of pumping at many wells to reduce cost for decontamination of groundwater at a superfund site. The optimization runs with up to 128 processors. Superlinear speed up is obtained for up to 16 processors, and efficiency with 64 processors is over 80%. Each evaluation of the objective function requires the solution of nonlinear partial differential equations to describe the impact of spatially distributed pumping and model parameters on model predictions for the spatial and temporal distribution of groundwater contaminants. The second application uses an asynchronous parallel global optimization for groundwater quality model calibration.
The time for a single objective function evaluation varies unpredictably, so efficiency is improved with asynchronous parallel calculations to improve load balancing. The third application (done at NCSS) incorporates new global surrogate multi-objective parallel search algorithms into pySOT and applies it to a large watershed calibration problem.
A massively asynchronous, parallel brain.
Zeki, Semir
2015-05-19
Whether the visual brain uses a parallel or a serial, hierarchical, strategy to process visual signals, the end result appears to be that different attributes of the visual scene are perceived asynchronously--with colour leading form (orientation) by 40 ms and direction of motion by about 80 ms. Whatever the neural root of this asynchrony, it creates a problem that has not been properly addressed, namely how visual attributes that are perceived asynchronously over brief time windows after stimulus onset are bound together in the longer term to give us a unified experience of the visual world, in which all attributes are apparently seen in perfect registration. In this review, I suggest that there is no central neural clock in the (visual) brain that synchronizes the activity of different processing systems. More likely, activity in each of the parallel processing-perceptual systems of the visual brain is reset independently, making of the brain a massively asynchronous organ, just like the new generation of more efficient computers promise to be. Given the asynchronous operations of the brain, it is likely that the results of activities in the different processing-perceptual systems are not bound by physiological interactions between cells in the specialized visual areas, but post-perceptually, outside the visual brain.
A Binary Array Asynchronous Sorting Algorithm with Using Petri Nets
NASA Astrophysics Data System (ADS)
Voevoda, A. A.; Romannikov, D. O.
2017-01-01
Speeding up and optimizing computations remains a pressing task. This paper considers one approach to it: applying parallelization and asynchronous execution to a sorting algorithm. Sorting methods are among the most elementary algorithms and are used in a huge number of applications. We propose an array sorting method based on dividing the array into a set of independent adjacent pairs of numbers and comparing them in parallel and asynchronously. This distinguishes the proposed method from traditional sorting algorithms (such as quicksort, merge sort, insertion sort and others). The algorithm is implemented using Petri nets, as the most suitable tool for describing asynchronous systems.
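One well-known scheme built from independent adjacent-pair comparisons is odd-even transposition sort, sketched below. Whether it matches the Petri-net algorithm of the paper is an assumption; the abstract does not give enough detail to say. The sketch runs the phases sequentially, but within each phase the pairs share no elements, so their compare-and-swap operations could run in parallel in any order.

```python
def odd_even_transposition_sort(values):
    """Sort by repeatedly comparing disjoint adjacent pairs.

    Phases alternate between pairs starting at even and at odd indices.
    For n elements, n phases are sufficient.
    """
    a = list(values)
    n = len(a)
    for phase in range(n):
        start = phase % 2
        # Pairs (start, start+1), (start+2, start+3), ... are independent.
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


if __name__ == "__main__":
    print(odd_even_transposition_sort([1, 0, 1, 1, 0, 0]))
```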
Zeki, Semir
2016-10-01
Results from a variety of sources, some many years old, lead ineluctably to a re-appraisal of the twin strategies of hierarchical and parallel processing used by the brain to construct an image of the visual world. Contrary to common supposition, there are at least three 'feed-forward' anatomical hierarchies that reach the primary visual cortex (V1) and the specialized visual areas outside it, in parallel. These anatomical hierarchies do not conform to the temporal order with which visual signals reach the specialized visual areas through V1. Furthermore, neither the anatomical hierarchies nor the temporal order of activation through V1 predict the perceptual hierarchies. The latter shows that we see (and become aware of) different visual attributes at different times, with colour leading form (orientation) and directional visual motion, even though signals from fast-moving, high-contrast stimuli are among the earliest to reach the visual cortex (of area V5). Parallel processing, on the other hand, is much more ubiquitous than commonly supposed but is subject to a barely noticed but fundamental aspect of brain operations, namely that different parallel systems operate asynchronously with respect to each other and reach perceptual endpoints at different times. This re-assessment leads to the conclusion that the visual brain is constituted of multiple, parallel and asynchronously operating task- and stimulus-dependent hierarchies (STDH); which of these parallel anatomical hierarchies have temporal and perceptual precedence at any given moment is stimulus and task related, and dependent on the visual brain's ability to undertake multiple operations asynchronously. © 2016 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Global interrupt and barrier networks
Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E; Heidelberger, Philip; Kopcsay, Gerard V.; Steinmacher-Burow, Burkhard D.; Takken, Todd E.
2008-10-28
A system and method for generating global asynchronous signals in a computing structure. Particularly, a global interrupt and barrier network is implemented that implements logic for generating global interrupt and barrier signals for controlling global asynchronous operations performed by processing elements at selected processing nodes of a computing structure in accordance with a processing algorithm; and includes the physical interconnecting of the processing nodes for communicating the global interrupt and barrier signals to the elements via low-latency paths. The global asynchronous signals respectively initiate interrupt and barrier operations at the processing nodes at times selected for optimizing performance of the processing algorithms. In one embodiment, the global interrupt and barrier network is implemented in a scalable, massively parallel supercomputing device structure comprising a plurality of processing nodes interconnected by multiple independent networks, with each node including one or more processing elements for performing computation or communication activity as required when performing parallel algorithm operations. One multiple independent network includes a global tree network for enabling high-speed global tree communications among global tree network nodes or sub-trees thereof. The global interrupt and barrier network may operate in parallel with the global tree network for providing global asynchronous sideband signals.
A massively asynchronous, parallel brain
Zeki, Semir
2015-01-01
Whether the visual brain uses a parallel or a serial, hierarchical, strategy to process visual signals, the end result appears to be that different attributes of the visual scene are perceived asynchronously—with colour leading form (orientation) by 40 ms and direction of motion by about 80 ms. Whatever the neural root of this asynchrony, it creates a problem that has not been properly addressed, namely how visual attributes that are perceived asynchronously over brief time windows after stimulus onset are bound together in the longer term to give us a unified experience of the visual world, in which all attributes are apparently seen in perfect registration. In this review, I suggest that there is no central neural clock in the (visual) brain that synchronizes the activity of different processing systems. More likely, activity in each of the parallel processing-perceptual systems of the visual brain is reset independently, making of the brain a massively asynchronous organ, just like the new generation of more efficient computers promise to be. Given the asynchronous operations of the brain, it is likely that the results of activities in the different processing-perceptual systems are not bound by physiological interactions between cells in the specialized visual areas, but post-perceptually, outside the visual brain. PMID:25823871
On the utility of threads for data parallel programming
NASA Technical Reports Server (NTRS)
Fahringer, Thomas; Haines, Matthew; Mehrotra, Piyush
1995-01-01
Threads provide a useful programming model for asynchronous behavior because of their ability to encapsulate units of work that can then be scheduled for execution at runtime, based on the dynamic state of a system. Recently, the threaded model has been applied to the domain of data parallel scientific codes, and initial reports indicate that the threaded model can produce performance gains over non-threaded approaches, primarily through the use of overlapping useful computation with communication latency. However, overlapping computation with communication is possible without the benefit of threads if the communication system supports asynchronous primitives, and this comparison has not been made in previous papers. This paper provides a critical look at the utility of lightweight threads as applied to data parallel scientific programming.
NASA Astrophysics Data System (ADS)
Tolson, B.; Matott, L. S.; Gaffoor, T. A.; Asadzadeh, M.; Shafii, M.; Pomorski, P.; Xu, X.; Jahanpour, M.; Razavi, S.; Haghnegahdar, A.; Craig, J. R.
2015-12-01
We introduce asynchronous parallel implementations of the Dynamically Dimensioned Search (DDS) family of algorithms including DDS, discrete DDS, PA-DDS and DDS-AU. These parallel algorithms are unique from most existing parallel optimization algorithms in the water resources field in that parallel DDS is asynchronous and does not require an entire population (set of candidate solutions) to be evaluated before generating and then sending a new candidate solution for evaluation. One key advance in this study is developing the first parallel PA-DDS multi-objective optimization algorithm. The other key advance is enhancing the computational efficiency of solving optimization problems (such as model calibration) by combining a parallel optimization algorithm with the deterministic model pre-emption concept. These two efficiency techniques can only be combined because of the asynchronous nature of parallel DDS. Model pre-emption functions to terminate simulation model runs early, prior to completely simulating the model calibration period for example, when intermediate results indicate the candidate solution is so poor that it will definitely have no influence on the generation of further candidate solutions. The computational savings of deterministic model preemption available in serial implementations of population-based algorithms (e.g., PSO) disappear in synchronous parallel implementations as these algorithms. In addition to the key advances above, we implement the algorithms across a range of computation platforms (Windows and Unix-based operating systems from multi-core desktops to a supercomputer system) and package these for future modellers within a model-independent calibration software package called Ostrich as well as MATLAB versions. 
Results across multiple platforms and multiple case studies (from 4 to 64 processors) demonstrate the vast improvement over serial DDS-based algorithms and highlight the important role model pre-emption plays in the performance of parallel, pre-emptable DDS algorithms. Case studies include single- and multiple-objective optimization problems in water resources model calibration and in many cases linear or near linear speedups are observed.
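The deterministic model pre-emption concept in this abstract can be sketched in serial form. The code is an illustration under stated assumptions, not the Ostrich implementation: the linear "model", the function names, and the clamping at the bounds (used here for brevity in place of a boundary-reflection rule) are all hypothetical. The objective is a sum of squared errors that can only grow as the simulation proceeds, so a run can be stopped as soon as the running sum exceeds the best value found so far. Because DDS keeps only the current best solution, a pre-empted candidate could never have been accepted.

```python
import math
import random


def sse_with_preemption(params, observations, threshold):
    """Accumulate squared errors; stop once the sum exceeds threshold.

    Returns (sum so far, number of time steps actually simulated).
    """
    a, b = params
    total = 0.0
    for t, obs in enumerate(observations):
        sim = a * t + b  # hypothetical stand-in for one simulation time step
        total += (sim - obs) ** 2
        if total > threshold:
            return total, t + 1  # pre-empted: candidate cannot beat the best
    return total, len(observations)


def dds_with_preemption(observations, lo, hi, max_iter=200, r=0.2, seed=1):
    """Greedy DDS-style search; assumes max_iter > 1."""
    rng = random.Random(seed)
    n = len(lo)
    best = [rng.uniform(lo[d], hi[d]) for d in range(n)]
    best_f, steps_used = sse_with_preemption(best, observations, math.inf)
    for i in range(1, max_iter + 1):
        # Probability of perturbing each dimension shrinks as the search ages.
        p = 1.0 - math.log(i) / math.log(max_iter)
        dims = [d for d in range(n) if rng.random() < p]
        if not dims:
            dims = [rng.randrange(n)]
        cand = best[:]
        for d in dims:
            step = rng.gauss(0.0, 1.0) * r * (hi[d] - lo[d])
            cand[d] = min(hi[d], max(lo[d], cand[d] + step))  # clamp (assumed)
        f, steps = sse_with_preemption(cand, observations, best_f)
        steps_used += steps
        if f <= best_f:  # only fully simulated candidates can pass this test
            best, best_f = cand, f
    return best, best_f, steps_used


if __name__ == "__main__":
    obs = [2.0 * t + 1.0 for t in range(50)]
    print(dds_with_preemption(obs, [0.0, 0.0], [5.0, 5.0]))
```

Comparing `steps_used` with `(max_iter + 1) * len(observations)` gives a rough measure of the simulation time saved by pre-emption in this toy setting.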
Reverse engineering a gene network using an asynchronous parallel evolution strategy
2010-01-01
Background: The use of reverse engineering methods to infer gene regulatory networks by fitting mathematical models to gene expression data is becoming increasingly popular and successful. However, increasing model complexity means that more powerful global optimisation techniques are required for model fitting. The parallel Lam Simulated Annealing (pLSA) algorithm has been used in such approaches, but recent research has shown that island Evolutionary Strategies can produce faster, more reliable results. However, no parallel island Evolutionary Strategy (piES) has yet been demonstrated to be effective for this task. Results: Here, we present synchronous and asynchronous versions of the piES algorithm, and apply them to a real reverse engineering problem: inferring parameters in the gap gene network. We find that the asynchronous piES exhibits very little communication overhead, and shows significant speed-up for up to 50 nodes: the piES running on 50 nodes is nearly 10 times faster than the best serial algorithm. We compare the asynchronous piES to pLSA on the same test problem, measuring the time required to reach particular levels of residual error, and show that it shows much faster convergence than pLSA across all optimisation conditions tested. Conclusions: Our results demonstrate that the piES is consistently faster and more reliable than the pLSA algorithm on this problem, and scales better with increasing numbers of nodes. In addition, the piES is especially well suited to further improvements and adaptations: Firstly, the algorithm's fast initial descent speed and high reliability make it a good candidate for being used as part of a global/local search hybrid algorithm. Secondly, it has the potential to be used as part of a hierarchical evolutionary algorithm, which takes advantage of modern multi-core computing architectures. PMID:20196855
NASA Astrophysics Data System (ADS)
Fang, Ye; Feng, Sheng; Tam, Ka-Ming; Yun, Zhifeng; Moreno, Juana; Ramanujam, J.; Jarrell, Mark
2014-10-01
Monte Carlo simulations of the Ising model play an important role in the field of computational statistical physics, and they have revealed many properties of the model over the past few decades. However, the effect of frustration due to random disorder, in particular the possible spin glass phase, remains a crucial but poorly understood problem. One of the obstacles in the Monte Carlo simulation of random frustrated systems is their long relaxation time making an efficient parallel implementation on state-of-the-art computation platforms highly desirable. The Graphics Processing Unit (GPU) is such a platform that provides an opportunity to significantly enhance the computational performance and thus gain new insight into this problem. In this paper, we present optimization and tuning approaches for the CUDA implementation of the spin glass simulation on GPUs. We discuss the integration of various design alternatives, such as GPU kernel construction with minimal communication, memory tiling, and look-up tables. We present a binary data format, Compact Asynchronous Multispin Coding (CAMSC), which provides an additional 28.4% speedup compared with the traditionally used Asynchronous Multispin Coding (AMSC). Our overall design sustains a performance of 33.5 ps per spin flip attempt for simulating the three-dimensional Edwards-Anderson model with parallel tempering, which significantly improves the performance over existing GPU implementations.
Frog: Asynchronous Graph Processing on GPU with Hybrid Coloring Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Xuanhua; Luo, Xuan; Liang, Junling
GPUs have been increasingly used to accelerate graph processing for complicated computational problems regarding graph theory. Many parallel graph algorithms adopt the asynchronous computing model to accelerate the iterative convergence. Unfortunately, consistent asynchronous computing requires locking or atomic operations, leading to significant penalties/overheads when implemented on GPUs. As such, a coloring algorithm is adopted to separate the vertices with potential updating conflicts, guaranteeing the consistency/correctness of the parallel processing. Common coloring algorithms, however, may suffer from low parallelism because of the large number of colors generally required for processing a large-scale graph with billions of vertices. We propose a light-weight asynchronous processing framework called Frog with a preprocessing/hybrid coloring model. The fundamental idea is based on the Pareto principle (or 80-20 rule) about coloring algorithms as we observed through masses of real-world graph coloring cases. We find that a majority of vertices (about 80%) are colored with only a few colors, such that they can be read and updated in a very high degree of parallelism without violating the sequential consistency. Accordingly, our solution separates the processing of the vertices based on the distribution of colors. In this work, we mainly answer three questions: (1) how to partition the vertices in a sparse graph with maximized parallelism, (2) how to process large-scale graphs that cannot fit into GPU memory, and (3) how to reduce the overhead of data transfers on PCIe while processing each partition. We conduct experiments on real-world data (Amazon, DBLP, YouTube, RoadNet-CA, WikiTalk and Twitter) to evaluate our approach and make comparisons with well-known non-preprocessed (such as Totem, Medusa, MapGraph and Gunrock) and preprocessed (Cusha) approaches, by testing four classical algorithms (BFS, PageRank, SSSP and CC).
On all the tested applications and datasets, Frog is able to significantly outperform existing GPU-based graph processing systems except Gunrock and MapGraph. MapGraph gets better performance than Frog when running BFS on RoadNet-CA. The comparison between Gunrock and Frog is inconclusive. Frog can outperform Gunrock by more than 1.04X when running PageRank and SSSP, while the advantage of Frog is not obvious when running BFS and CC on some datasets, especially for RoadNet-CA.
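The partitioning idea in this abstract (group vertices by color, process the heavily populated color classes in parallel, and treat the long tail separately) can be sketched on the CPU as follows. This is an illustrative sketch, not Frog's implementation: the greedy coloring, the `max_parallel_colors` cutoff, and the decision to lump the remaining vertices into one "leftover" set are assumptions. The property it relies on is that vertices of the same color share no edge, so they can be updated concurrently without locks or atomics.

```python
from collections import defaultdict


def greedy_coloring(adj):
    """Give each vertex the smallest color not used by a colored neighbor."""
    color = {}
    for v in sorted(adj):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def partition_by_color(adj, max_parallel_colors=2):
    """Split vertices into conflict-free groups plus a leftover set.

    The largest color classes become groups that are safe to process in
    parallel; vertices of all remaining colors are returned together and
    would need locking, atomics, or serial processing.
    """
    color = greedy_coloring(adj)
    classes = defaultdict(list)
    for v, c in color.items():
        classes[c].append(v)
    ordered = sorted(classes.values(), key=len, reverse=True)
    parallel = ordered[:max_parallel_colors]
    leftover = [v for group in ordered[max_parallel_colors:] for v in group]
    return parallel, leftover


if __name__ == "__main__":
    # Hypothetical graph: a triangle (0, 1, 2) with a pendant vertex 3.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    print(partition_by_color(graph))
```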
Dynamic grid refinement for partial differential equations on parallel computers
NASA Technical Reports Server (NTRS)
Mccormick, S.; Quinlan, D.
1989-01-01
The fast adaptive composite grid method (FAC) is an algorithm that uses various levels of uniform grids to provide adaptive resolution and fast solution of PDEs. An asynchronous version of FAC, called AFAC, that completely eliminates the bottleneck to parallelism is presented. This paper describes the advantage that this algorithm has in adaptive refinement for moving singularities on multiprocessor computers. This work is applicable to the parallel solution of two- and three-dimensional shock tracking problems.
An Asynchronous Many-Task Implementation of In-Situ Statistical Analysis using Legion.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pebay, Philippe Pierre; Bennett, Janine Camille
2015-11-01
In this report, we propose a framework for the design and implementation of in-situ analyses using an asynchronous many-task (AMT) model, using the Legion programming model together with the MiniAero mini-application as a surrogate for full-scale parallel scientific computing applications. The bulk of this work consists of converting the Learn/Derive/Assess model which we had initially developed for parallel statistical analysis using MPI [PTBM11], from a SPMD to an AMT model. In this goal, we propose an original use of the concept of Legion logical regions as a replacement for the parallel communication schemes used for the only operation of the statistics engines that require explicit communication. We then evaluate this proposed scheme in a shared memory environment, using the Legion port of MiniAero as a proxy for a full-scale scientific application, as a means to provide input data sets of variable size for the in-situ statistical analyses in an AMT context. We demonstrate in particular that the approach has merit, and warrants further investigation, in collaboration with ongoing efforts to improve the overall parallel performance of the Legion system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
D'Azevedo, Eduardo; Abbott, Stephen; Koskela, Tuomas
The XGC fusion gyrokinetic code combines state-of-the-art, portable computational and algorithmic technologies to enable complicated multiscale simulations of turbulence and transport dynamics in ITER edge plasma on the largest US open-science computer, the CRAY XK7 Titan, at its maximal heterogeneous capability, which have not been possible before due to a factor of over 10 shortage in the time-to-solution for less than 5 days of wall-clock time for one physics case. Frontier techniques such as nested OpenMP parallelism, adaptive parallel I/O, staging I/O and data reduction using dynamic and asynchronous applications interactions, dynamic repartitioning for balancing computational work in pushing particles and in grid related work, scalable and accurate discretization algorithms for non-linear Coulomb collisions, and communication-avoiding subcycling technology for pushing particles on both CPUs and GPUs are also utilized to dramatically improve the scalability and time-to-solution, hence enabling the difficult kinetic ITER edge simulation on a present-day leadership class computer.
AP-IO: asynchronous pipeline I/O for hiding periodic output cost in CFD simulation.
Xiaoguang, Ren; Xinhai, Xu
2014-01-01
Computational fluid dynamics (CFD) simulation often needs to periodically output intermediate results to files in the form of snapshots for visualization or restart, which seriously impacts the performance. In this paper, we present an asynchronous pipeline I/O (AP-IO) optimization scheme for periodic snapshot output on the basis of asynchronous I/O and CFD application characteristics. In AP-IO, dedicated background I/O processes or threads are in charge of handling the file write in pipeline mode, so the write overhead can be hidden behind more calculation than with classic asynchronous I/O. We design the framework of AP-IO and implement it in OpenFOAM, providing CFD users with a user-friendly interface. Experimental results on the Tianhe-2 supercomputer demonstrate that AP-IO can achieve a good optimization effect for periodic snapshot output in CFD applications, and the effect is especially pronounced for massively parallel CFD simulations, where it can reduce the total execution time by up to about 40%.
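The general pattern of hiding snapshot writes behind computation with a dedicated background I/O thread can be sketched as below. This is not the AP-IO framework or any OpenFOAM interface: the smoothing "solver", the file naming, the queue depth, and the function names are assumptions made for illustration. The solver loop hands a copy of the field to a bounded queue and continues computing, while the writer thread drains the queue in order.

```python
import os
import queue
import tempfile
import threading


def writer(q, out_dir):
    """Background I/O thread: write queued snapshots until a None sentinel."""
    while True:
        item = q.get()
        if item is None:
            break
        step, data = item
        path = os.path.join(out_dir, f"snap_{step:04d}.txt")
        with open(path, "w") as fh:
            fh.write(" ".join(f"{v:.6f}" for v in data))


def simulate(n_steps=20, every=5, out_dir=None):
    """Toy time loop that writes a snapshot every `every` steps."""
    out_dir = out_dir or tempfile.mkdtemp()
    q = queue.Queue(maxsize=4)  # bounded, so memory use stays limited
    t = threading.Thread(target=writer, args=(q, out_dir))
    t.start()
    field = [float(i) for i in range(8)]
    n = len(field)
    for step in range(1, n_steps + 1):
        # Hypothetical solver step: periodic neighbor averaging.
        field = [0.5 * (field[i - 1] + field[(i + 1) % n]) for i in range(n)]
        if step % every == 0:
            # Enqueue a copy and keep computing; the write happens elsewhere.
            q.put((step, list(field)))
    q.put(None)  # tell the writer no more snapshots are coming
    t.join()
    return out_dir


if __name__ == "__main__":
    print(sorted(os.listdir(simulate())))
```

The solver blocks only if the queue is full, that is, when the writer has fallen several snapshots behind.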
AP-IO: Asynchronous Pipeline I/O for Hiding Periodic Output Cost in CFD Simulation
Xiaoguang, Ren; Xinhai, Xu
2014-01-01
Computational fluid dynamics (CFD) simulation often needs to periodically output intermediate results to files in the form of snapshots for visualization or restart, which seriously impacts the performance. In this paper, we present an asynchronous pipeline I/O (AP-IO) optimization scheme for periodic snapshot output on the basis of asynchronous I/O and CFD application characteristics. In AP-IO, dedicated background I/O processes or threads are in charge of handling the file write in pipeline mode, so the write overhead can be hidden behind more calculation than with classic asynchronous I/O. We design the framework of AP-IO and implement it in OpenFOAM, providing CFD users with a user-friendly interface. Experimental results on the Tianhe-2 supercomputer demonstrate that AP-IO can achieve a good optimization effect for periodic snapshot output in CFD applications, and the effect is especially pronounced for massively parallel CFD simulations, where it can reduce the total execution time by up to about 40%. PMID:24955390
Reliability models for dataflow computer systems
NASA Technical Reports Server (NTRS)
Kavi, K. M.; Buckles, B. P.
1985-01-01
The demands for concurrent operation within a computer system and the representation of parallelism in programming languages have yielded a new form of program representation known as data flow (DENN 74, DENN 75, TREL 82a). A new model based on data flow principles for parallel computations and parallel computer systems is presented. Necessary conditions for liveness and deadlock freeness in data flow graphs are derived. The data flow graph is used as a model to represent asynchronous concurrent computer architectures including data flow computers.
Adaptive parallel logic networks
NASA Technical Reports Server (NTRS)
Martinez, Tony R.; Vidal, Jacques J.
1988-01-01
Adaptive, self-organizing concurrent systems (ASOCS) that combine self-organization with massive parallelism for such applications as adaptive logic devices, robotics, process control, and system malfunction management, are presently discussed. In ASOCS, an adaptive network composed of many simple computing elements operating in combinational and asynchronous fashion is used and problems are specified by presenting if-then rules to the system in the form of Boolean conjunctions. During data processing, which is a different operational phase from adaptation, the network acts as a parallel hardware circuit.
Design and Testing of a Fast, 50 kV Solid-State Kicker Pulser
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cook, E G; Hickman, B C; Lee, B S
2002-06-24
The ability to extract particle beam bunches from a ring accelerator in arbitrary order can greatly extend an accelerator's capabilities and applications. A prototype solid-state kicker pulser capable of generating asynchronous bursts of 50 kV pulses has been designed and tested into a 50 Ω load. The pulser features fast rise and fall times and is capable of generating an arbitrary pattern of pulses with a maximum burst frequency exceeding 5 MHz. If required, the pulse-width of each pulse in the burst is independently adjustable. This kicker modulator uses multiple solid-state modules stacked in an inductive-adder configuration where the energy is switched into each section of the adder by a parallel array of MOSFETs. Test data, capabilities, and limitations of the prototype pulser are described.
UPC++ Programmer’s Guide (v1.0 2017.9)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bachan, J.; Baden, S.; Bonachea, D.
UPC++ is a C++11 library that provides Asynchronous Partitioned Global Address Space (APGAS) programming. It is designed for writing parallel programs that run efficiently and scale well on distributed-memory parallel computers. The APGAS model is single program, multiple-data (SPMD), with each separate thread of execution (referred to as a rank, a term borrowed from MPI) having access to local memory as it would in C++. However, APGAS also provides access to a global address space, which is allocated in shared segments that are distributed over the ranks. UPC++ provides numerous methods for accessing and using global memory. In UPC++, all operations that access remote memory are explicit, which encourages programmers to be aware of the cost of communication and data movement. Moreover, all remote-memory access operations are by default asynchronous, to enable programmers to write code that scales well even on hundreds of thousands of cores.
NASA Technical Reports Server (NTRS)
Mccormick, S.; Quinlan, D.
1989-01-01
The fast adaptive composite grid method (FAC) is an algorithm that uses various levels of uniform grids (global and local) to provide adaptive resolution and fast solution of PDEs. Like all such methods, it offers parallelism by using possibly many disconnected patches per level, but is hindered by the need to handle these levels sequentially. The finest levels must therefore wait for processing to be essentially completed on all the coarser ones. A recently developed asynchronous version of FAC, called AFAC, completely eliminates this bottleneck to parallelism. This paper describes timing results for AFAC, coupled with a simple load balancing scheme, applied to the solution of elliptic PDEs on an Intel iPSC hypercube. These tests include performance of certain processes necessary in adaptive methods, including moving grids and changing refinement. A companion paper reports on numerical and analytical results for estimating convergence factors of AFAC applied to very large scale examples.
Overcoming Challenges in Kinetic Modeling of Magnetized Plasmas and Vacuum Electronic Devices
NASA Astrophysics Data System (ADS)
Omelchenko, Yuri; Na, Dong-Yeop; Teixeira, Fernando
2017-10-01
We transform the state of the art of plasma modeling by taking advantage of novel computational techniques for fast and robust integration of multiscale hybrid (full particle ions, fluid electrons, no displacement current) and full-PIC models. These models are implemented in 3D HYPERS and axisymmetric full-PIC CONPIC codes. HYPERS is a massively parallel, asynchronous code. The HYPERS solver does not step fields and particles synchronously in time but instead executes local variable updates (events) at their self-adaptive rates while preserving fundamental conservation laws. The charge-conserving CONPIC code has a matrix-free explicit finite-element (FE) solver based on a sparse-approximate inverse (SPAI) algorithm. This explicit solver approximates the inverse FE system matrix ("mass" matrix) using successive sparsity pattern orders of the original matrix. It does not reduce the set of Maxwell's equations to a vector-wave (curl-curl) equation of second order but instead utilizes the standard coupled first-order Maxwell's system. We discuss the ability of our codes to accurately and efficiently account for multiscale physical phenomena in 3D magnetized space and laboratory plasmas and axisymmetric vacuum electronic devices.
2007-09-17
been proposed; these include a combination of variable fidelity models, parallelisation strategies and hybridisation techniques (Coello, Veldhuizen et...Coello et al (Coello, Veldhuizen et al. 2002). 4.4.2 HIERARCHICAL POPULATION TOPOLOGY A hierarchical population topology, when integrated into...to hybrid parallel Multi-Objective Evolutionary Algorithms (pMOEA) (Cantu-Paz 2000; Veldhuizen, Zydallis et al. 2003); it uses a master slave
RAMP: A fault tolerant distributed microcomputer structure for aircraft navigation and control
NASA Technical Reports Server (NTRS)
Dunn, W. R.
1980-01-01
RAMP consists of distributed sets of parallel computers partitioned on the basis of software and packaging constraints. To minimize hardware and software complexity, the processors operate asynchronously. It was shown that through the design of asymptotically stable control laws, data errors due to the asynchronism were minimized. It was further shown that by designing control laws with this property and making minor hardware modifications to the RAMP modules, the system became inherently tolerant to intermittent faults. A laboratory version of RAMP was constructed and is described in the paper along with the experimental results.
An Environment for Incremental Development of Distributed Extensible Asynchronous Real-time Systems
NASA Technical Reports Server (NTRS)
Ames, Charles K.; Burleigh, Scott; Briggs, Hugh C.; Auernheimer, Brent
1996-01-01
Incremental parallel development of distributed real-time systems is difficult. Architectural techniques and software tools developed at the Jet Propulsion Laboratory's (JPL's) Flight System Testbed make feasible the integration of complex systems in various stages of development.
Williams, Lance R
2016-01-01
Object-oriented combinator chemistry (OOCC) is an artificial chemistry with composition devices borrowed from object-oriented and functional programming languages. Actors in OOCC are embedded in space and subject to diffusion; since they are neither created nor destroyed, their mass is conserved. Actors use programs constructed from combinators to asynchronously update their own states and the states of other actors in their neighborhoods. The fact that programs and combinators are themselves reified as actors makes it possible to build programs that build programs from combinators of a few primitive types using asynchronous spatial processes that resemble chemistry as much as computation. To demonstrate this, OOCC is used to define a parallel, asynchronous, spatially distributed self-replicating system modeled in part on the living cell. Since interactions among its parts result in the construction of more of these same parts, the system is strongly constructive. The system's high normalized complexity is contrasted with that of a simple composome.
Design, development and use of the finite element machine
NASA Technical Reports Server (NTRS)
Adams, L. M.; Voigt, R. C.
1983-01-01
Some of the considerations that went into the design of the Finite Element Machine, a research asynchronous parallel computer, are described. The present status of the system is also discussed, along with some indication of the type of results that were obtained.
Three-dimensional particle tracking velocimetry using dynamic vision sensors
NASA Astrophysics Data System (ADS)
Borer, D.; Delbruck, T.; Rösgen, T.
2017-12-01
A fast-flow visualization method is presented based on tracking neutrally buoyant soap bubbles with a set of neuromorphic cameras. The "dynamic vision sensors" register only the changes in brightness with very low latency, capturing fast processes at a low data rate. The data consist of a stream of asynchronous events, each encoding the corresponding pixel position, the time instant of the event and the sign of the change in logarithmic intensity. The work uses three such synchronized cameras to perform 3D particle tracking in a medium sized wind tunnel. The data analysis relies on Kalman filters to associate the asynchronous events with individual tracers and to reconstruct the three-dimensional path and velocity based on calibrated sensor information.
A Verification System for Distributed Objects with Asynchronous Method Calls
NASA Astrophysics Data System (ADS)
Ahrendt, Wolfgang; Dylla, Maximilian
We present a verification system for Creol, an object-oriented modeling language for concurrent distributed applications. The system is an instance of KeY, a framework for object-oriented software verification, which has so far been applied foremost to sequential Java. Building on KeY characteristic concepts, like dynamic logic, sequent calculus, explicit substitutions, and the taclet rule language, the system presented in this paper addresses functional correctness of Creol models featuring local cooperative thread parallelism and global communication via asynchronous method calls. The calculus heavily operates on communication histories which describe the interfaces of Creol units. Two example scenarios demonstrate the usage of the system.
A wavelet approach to binary blackholes with asynchronous multitasking
NASA Astrophysics Data System (ADS)
Lim, Hyun; Hirschmann, Eric; Neilsen, David; Anderson, Matthew; Debuhr, Jackson; Zhang, Bo
2016-03-01
Highly accurate simulations of binary black holes and neutron stars are needed to address a variety of interesting problems in relativistic astrophysics. We present a new method for solving the Einstein equations (BSSN formulation) using iterated interpolating wavelets. Wavelet coefficients provide a direct measure of the local approximation error for the solution and place collocation points that naturally adapt to features of the solution. Further, they exhibit exponential convergence on unevenly spaced collocation points. The parallel implementation of the wavelet simulation framework presented here deviates from conventional practice in combining multi-threading with a form of message-driven computation sometimes referred to as asynchronous multitasking.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumar, Sameer
Disclosed is a mechanism on receiving processors in a parallel computing system for providing order to data packets received from a broadcast call and to distinguish data packets received at nodes from several incoming asynchronous broadcast messages where header space is limited. In the present invention, processors at lower leafs of a tree do not need to obtain a broadcast message by directly accessing the data in a root processor's buffer. Instead, each subsequent intermediate node's rank id information is squeezed into the software header of packet headers. In turn, the entire broadcast message is not transferred from the root processor to each processor in a communicator but instead is replicated on several intermediate nodes, which then replicate the message to nodes in lower leafs. Hence, the intermediate compute nodes become "virtual root compute nodes" for the purpose of replicating the broadcast message to lower levels of a tree.
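The replication pattern described above, in which intermediate nodes forward the message instead of every node reading the root's buffer, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the patented mechanism: the function name `tree_broadcast`, the `fanout` parameter, and the heap-style rank numbering (children of rank p are p*fanout+1 ... p*fanout+fanout) are all hypothetical choices made for illustration, and packet headers are not modeled.

```python
def tree_broadcast(num_ranks, message, fanout=2):
    """Simulate a broadcast over a k-ary tree rooted at rank 0.

    Assumed (hypothetical) layout: children of rank p are
    p*fanout + 1 ... p*fanout + fanout, as in an array-backed heap.
    Returns what each rank received and how many copies each rank sent.
    """
    received = {0: message}                      # the root already holds the message
    sends_by_rank = {r: 0 for r in range(num_ranks)}
    frontier = [0]                               # ranks that forward in this round
    while frontier:
        next_frontier = []
        for parent in frontier:
            for k in range(1, fanout + 1):
                child = parent * fanout + k
                if child < num_ranks:
                    # The parent acts as a "virtual root" for its subtree.
                    received[child] = received[parent]
                    sends_by_rank[parent] += 1
                    next_frontier.append(child)
        frontier = next_frontier
    return received, sends_by_rank


if __name__ == "__main__":
    got, sends = tree_broadcast(7, "payload")
    print(sorted(got), sends)
```

In this sketch no rank, including the root, sends more than `fanout` copies, which is the load-spreading effect the abstract attributes to the intermediate nodes.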
Exploring Asynchronous Many-Task Runtime Systems toward Extreme Scales
DOE Office of Scientific and Technical Information (OSTI.GOV)
Knight, Samuel; Baker, Gavin Matthew; Gamell, Marc
2015-10-01
Major exascale computing reports indicate a number of software challenges to meet the dramatic change of system architectures in the near future. While a several-orders-of-magnitude increase in parallelism is the most commonly cited of those, hurdles also include performance heterogeneity of compute nodes across the system, increased imbalance between computational capacity and I/O capabilities, frequent system interrupts, and complex hardware architectures. Asynchronous task-parallel programming models show great promise in addressing these issues, but are not yet fully understood nor developed sufficiently for computational science and engineering application codes. We address these knowledge gaps through quantitative and qualitative exploration of leading candidate solutions in the context of engineering applications at Sandia. In this poster, we evaluate MiniAero code ported to three leading candidate programming models (Charm++, Legion and UINTAH) to examine the feasibility of these models that permit insertion of new programming model elements into an existing code base.
Optimization by nonhierarchical asynchronous decomposition
NASA Technical Reports Server (NTRS)
Shankar, Jayashree; Ribbens, Calvin J.; Haftka, Raphael T.; Watson, Layne T.
1992-01-01
Large scale optimization problems are tractable only if they are somehow decomposed. Hierarchical decompositions are inappropriate for some types of problems and do not parallelize well. Sobieszczanski-Sobieski has proposed a nonhierarchical decomposition strategy for nonlinear constrained optimization that is naturally parallel. Despite some successes on engineering problems, the algorithm as originally proposed fails on simple two dimensional quadratic programs. The algorithm is carefully analyzed for quadratic programs, and a number of modifications are suggested to improve its robustness.
NASA Technical Reports Server (NTRS)
Dongarra, Jack (Editor); Messina, Paul (Editor); Sorensen, Danny C. (Editor); Voigt, Robert G. (Editor)
1990-01-01
Attention is given to such topics as an evaluation of block algorithm variants in LAPACK, a large-grain parallel sparse system solver, a multiprocessor method for the solution of the generalized eigenvalue problem on an interval, and a parallel QR algorithm for iterative subspace methods on the CM2. A discussion of numerical methods includes the topics of asynchronous numerical solutions of PDEs on parallel computers, parallel homotopy curve tracking on a hypercube, and solving Navier-Stokes equations on the Cedar Multi-Cluster system. A section on differential equations includes a discussion of a six-color procedure for the parallel solution of elliptic systems using the finite quadtree structure, data parallel algorithms for the finite element method, and domain decomposition methods in aerodynamics. Topics dealing with massively parallel computing include hypercube vs. 2-dimensional meshes and massively parallel computation of conservation laws. Performance and tools are also discussed.
Portable parallel portfolio optimization in the Aurora Financial Management System
NASA Astrophysics Data System (ADS)
Laure, Erwin; Moritsch, Hans
2001-07-01
Financial planning problems are formulated as large scale, stochastic, multiperiod, tree structured optimization problems. An efficient technique for solving this kind of problem is the nested Benders decomposition method. In this paper we present a parallel, portable, asynchronous implementation of this technique. To achieve our portability goals we chose the programming language Java for our implementation and used a high level Java based framework, called OpusJava, for expressing the parallelism potential as well as synchronization constraints. Our implementation is embedded within a modular decision support tool for portfolio and asset liability management, the Aurora Financial Management System.
NASA Astrophysics Data System (ADS)
Souza, D. M.; Costa, I. A.; Nóbrega, R. A.
2017-10-01
This document presents a detailed study of the performance of a set of digital filters whose implementations are based on the best linear unbiased estimator theory interpreted as a constrained optimization problem that could be relaxed depending on the input signal characteristics. This approach has been employed by a number of recent particle physics experiments for measuring the energy of particle events interacting with their detectors. The considered filters have been designed to measure the peak amplitude of signals produced by their detectors based on the digitized version of such signals. This study provides a clear understanding of the characteristics of those filters in the context of particle physics and, additionally, it proposes a phase related constraint based on the second derivative of the Taylor expansion in order to make the estimator less sensitive to phase variation (phase between the analog signal shaping and its sampled version), which is stronger in asynchronous experiments. The asynchronous detector developed by the ν-Angra Collaboration is used as the basis for this work. Nevertheless, the proposed analysis goes beyond, considering a wide range of conditions related to signal parameters such as pedestal, phase, sampling rate, amplitude resolution, noise and pile-up; therefore crossing the bounds of the ν-Angra Experiment to make it interesting and useful for different asynchronous and even synchronous experiments.
Distributed Computing for Signal Processing: Modeling of Asynchronous Parallel Computation.
1986-03-01
the proposed approaches 16, 16, 40 . 451. The conclusion most often reached is that the best scheme to use in a particular design depends highly upon...76. 40 . Siegel, H. J., McMillen, R. J., and Mueller, P. T., Jr. A survey of interconnection methods for reconfigurable parallel processing systems...addressing meehaanm distributed in the network area rimonication% tit reach gigabit./second speeds je g.. PoCoS83 .’ i.V--i the lirO! lk i nitronment is
A Parallel Saturation Algorithm on Shared Memory Architectures
NASA Technical Reports Server (NTRS)
Ezekiel, Jonathan; Siminiceanu
2007-01-01
Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.
Simplified Parallel Domain Traversal
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erickson III, David J
2011-01-01
Many data-intensive scientific analysis techniques require global domain traversal, which over the years has been a bottleneck for efficient parallelization across distributed-memory architectures. Inspired by MapReduce and other simplified parallel programming approaches, we have designed DStep, a flexible system that greatly simplifies efficient parallelization of domain traversal techniques at scale. In order to deliver both simplicity to users as well as scalability on HPC platforms, we introduce a novel two-tiered communication architecture for managing and exploiting asynchronous communication loads. We also integrate our design with advanced parallel I/O techniques that operate directly on native simulation output. We demonstrate DStep by performing teleconnection analysis across ensemble runs of terascale atmospheric CO2 and climate data, and we show scalability results on up to 65,536 IBM BlueGene/P cores.
Ultrawideband asynchronous tracking system and method
NASA Technical Reports Server (NTRS)
Arndt, G. Dickey (Inventor); Ngo, Phong H. (Inventor); Phan, Chau T. (Inventor); Gross, Julia A. (Inventor); Ni, Jianjun (Inventor); Dusl, John (Inventor)
2012-01-01
A passive tracking system is provided with a plurality of ultrawideband (UWB) receivers that is asynchronous with respect to a UWB transmitter. A geometry of the tracking system may utilize a plurality of clusters with each cluster comprising a plurality of antennas. Time Difference of Arrival (TDOA) may be determined for the antennas in each cluster and utilized to determine Angle of Arrival (AOA) based on a far field assumption regarding the geometry. Parallel software communication sockets may be established with each of the plurality of UWB receivers. Transfer of waveform data may be processed by alternately receiving packets of waveform data from each UWB receiver. Cross Correlation Peak Detection (CCPD) is utilized to estimate TDOA information to reduce errors in a noisy, multipath environment.
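The step from TDOA to AOA under the far-field assumption can be written as a short worked relation. This is a sketch in assumed notation, none of which appears in the source: two antennas of one cluster separated by a baseline $d$, propagation speed $c$, measured time difference $\Delta t$, and angle of arrival $\theta$ measured from the array broadside.

```latex
% Far-field (plane-wave) assumption: the wavefront reaches the second
% antenna after travelling an extra path of length d*sin(theta).
\[
  c\,\Delta t = d \sin\theta
  \quad\Longrightarrow\quad
  \theta = \arcsin\!\left(\frac{c\,\Delta t}{d}\right),
  \qquad \left| c\,\Delta t \right| \le d .
\]
```

Under these assumptions, an error in the estimated $\Delta t$ maps directly into an error in $\theta$, which is consistent with the abstract's emphasis on cross-correlation peak detection to stabilize the TDOA estimate in multipath conditions.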
Multi-petascale highly efficient parallel supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provides global barrier and notification functions. The node design includes a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and a multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.
Data Elevator: Efficient Asynchronous Data Movement in Hierarchical Storage Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
BYNA, SUNRENDRA; DONG, BIN; WU, KESHENG
Multi-layer storage subsystems, including SSD-based burst buffers and disk-based parallel file systems (PFS), are becoming part of HPC systems. However, software for this storage hierarchy is still in its infancy. Applications may have to explicitly move data among the storage layers. We propose Data Elevator for transparently and efficiently moving data between a burst buffer and a PFS. Users specify the final destination for their data, typically on PFS; Data Elevator intercepts the I/O calls, stages data on the burst buffer, and then asynchronously transfers the data to their final destination in the background. This system allows extensive optimizations, such as overlapping read and write operations, choosing I/O modes, and aligning buffer boundaries. In tests with large-scale scientific applications, Data Elevator is as much as 4.2X faster than Cray DataWarp, the state-of-the-art software for burst buffers, and 4X faster than directly writing to PFS. The Data Elevator library uses HDF5's Virtual Object Layer (VOL) for intercepting parallel I/O calls that write data to PFS. The intercepted calls are redirected to the Data Elevator, which provides a handle to write the file in a faster, intermediate burst buffer system. Once the application finishes writing the data to the burst buffer, the Data Elevator job uses HDF5 to move the data to the final destination in an asynchronous manner. Hence, the Data Elevator library is currently useful for applications that call HDF5 for writing data files. Also, the Data Elevator depends on the HDF5 VOL functionality.
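The stage-then-drain pattern described in this abstract can be illustrated in a few lines. This is only a sketch of the general idea, assuming ordinary directories stand in for the burst buffer and the parallel file system; it does not use HDF5 or its Virtual Object Layer, and the names `staged_write` and `demo` are hypothetical.

```python
import shutil
import tempfile
import threading
from pathlib import Path


def staged_write(data: bytes, fast_dir: Path, final_path: Path) -> threading.Thread:
    """Write to a fast staging tier, then drain to the final path in the background.

    fast_dir plays the role of the burst buffer and final_path the role of
    the slower destination; both are plain directories in this sketch.
    """
    staged = fast_dir / final_path.name
    staged.write_bytes(data)               # synchronous write to the fast tier

    def drain() -> None:
        shutil.copyfile(staged, final_path)  # slow transfer, off the caller's path
        staged.unlink()                      # free the staging space

    worker = threading.Thread(target=drain)
    worker.start()                          # the caller can keep computing meanwhile
    return worker


def demo() -> bytes:
    with tempfile.TemporaryDirectory() as fast, tempfile.TemporaryDirectory() as slow:
        final_path = Path(slow) / "out.dat"
        worker = staged_write(b"simulation output", Path(fast), final_path)
        worker.join()                       # wait only when the data must be durable
        return final_path.read_bytes()


if __name__ == "__main__":
    print(demo())
```

The caller returns as soon as the fast write completes and joins the worker only when the final copy is actually needed, which is the overlap the abstract credits for the reported speedups.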
Parallel asynchronous systems and image processing algorithms
NASA Technical Reports Server (NTRS)
Coon, D. D.; Perera, A. G. U.
1989-01-01
A new hardware approach to implementation of image processing algorithms is described. The approach is based on silicon devices which would permit an independent analog processing channel to be dedicated to every pixel. A laminar architecture consisting of a stack of planar arrays of the device would form a two-dimensional array processor with a 2-D array of inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in neuronlike asynchronous pulse coded form through the laminar processor. Such systems would integrate image acquisition and image processing. Acquisition and processing would be performed concurrently as in natural vision systems. The research is aimed at implementation of algorithms, such as the intensity dependent summation algorithm and pyramid processing structures, which are motivated by the operation of natural vision systems. Implementation of natural vision algorithms would benefit from the use of neuronlike information coding and the laminar, 2-D parallel, vision system type architecture. Besides providing a neural network framework for implementation of natural vision algorithms, a 2-D parallel approach could eliminate the serial bottleneck of conventional processing systems. Conversion to serial format would occur only after raw intensity data has been substantially processed. An interesting challenge arises from the fact that the mathematical formulation of natural vision algorithms does not specify the means of implementation, so that hardware implementation poses intriguing questions involving vision science.
Superconducting magnetic energy storage for asynchronous electrical systems
Boenig, Heinrich J.
1986-01-01
A superconducting magnetic energy storage coil connected in parallel between converters of two or more ac power systems provides load leveling and stability improvement to any or all of the ac systems. Control is provided to direct the charging and independently the discharging of the superconducting coil to at least a selected one of the ac power systems.
A software architecture for multidisciplinary applications: Integrating task and data parallelism
NASA Technical Reports Server (NTRS)
Chapman, Barbara; Mehrotra, Piyush; Vanrosendale, John; Zima, Hans
1994-01-01
Data parallel languages such as Vienna Fortran and HPF can be successfully applied to a wide range of numerical applications. However, many advanced scientific and engineering applications are of a multidisciplinary and heterogeneous nature and thus do not fit well into the data parallel paradigm. In this paper we present new Fortran 90 language extensions to fill this gap. Tasks can be spawned as asynchronous activities in a homogeneous or heterogeneous computing environment; they interact by sharing access to Shared Data Abstractions (SDA's). SDA's are an extension of Fortran 90 modules, representing a pool of common data, together with a set of Methods for controlled access to these data and a mechanism for providing persistent storage. Our language supports the integration of data and task parallelism as well as nested task parallelism and thus can be used to express multidisciplinary applications in a natural and efficient way.
DOE Office of Scientific and Technical Information (OSTI.GOV)
D'Azevedo, Eduardo; Abbott, Stephen; Koskela, Tuomas
The XGC fusion gyrokinetic code combines state-of-the-art, portable computational and algorithmic technologies to enable complicated multiscale simulations of turbulence and transport dynamics in ITER edge plasma on the largest US open-science computer, the CRAY XK7 Titan, at its maximal heterogeneous capability, which have not been possible before due to a factor of over 10 shortage in the time-to-solution for less than 5 days of wall-clock time for one physics case. Frontier techniques such as nested OpenMP parallelism, adaptive parallel I/O, staging I/O and data reduction using dynamic and asynchronous applications interactions, dynamic repartitioning.
1985-05-01
unit in the data base, with knowing one generic assembly language. °-’--a 139 The 5-tuple describing single operation execution time of the operations...TSi-- generate , random eventi ( ,.0-15 tieit tmls - ((floa egus ()16 274 r Ispt imet imel I at :EVE’JS- II ktime=0.0; /0 present time 0/ rrs ptime=0.0...computing machinery capable of performing these tasks within a given time constraint. Because the majority of the available computing machinery is general
Proxy-equation paradigm: A strategy for massively parallel asynchronous computations
NASA Astrophysics Data System (ADS)
Mittal, Ankita; Girimaji, Sharath
2017-09-01
Massively parallel simulations of transport equation systems call for a paradigm change in algorithm development to achieve efficient scalability. Traditional approaches require time synchronization of processing elements (PEs), which severely restricts scalability. Relaxing the synchronization requirement introduces error and slows down convergence. In this paper, we propose and develop a novel "proxy equation" concept for a general transport equation that (i) tolerates asynchrony with minimal added error, (ii) preserves convergence order, and thus (iii) is expected to scale efficiently on massively parallel machines. The central idea is to modify a priori the transport equation at the PE boundaries to offset asynchrony errors. Proof-of-concept computations are performed using a one-dimensional advection (convection) diffusion equation. The results demonstrate the promise and advantages of the present strategy.
Ropes: Support for collective operations among distributed threads
NASA Technical Reports Server (NTRS)
Haines, Matthew; Mehrotra, Piyush; Cronk, David
1995-01-01
Lightweight threads are becoming increasingly useful in supporting parallelism and asynchronous control structures in applications and language implementations. Recently, systems have been designed and implemented to support interprocessor communication between lightweight threads so that threads can be exploited in a distributed memory system. Their use, in this setting, has been largely restricted to supporting latency hiding techniques and functional parallelism within a single application. However, to execute data parallel codes independent of other threads in the system, collective operations and relative indexing among threads are required. This paper describes the design of ropes: a scoping mechanism for collective operations and relative indexing among threads. We present the design of ropes in the context of the Chant system, and provide performance results evaluating our initial design decisions.
Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda A [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN
2012-01-10
Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda E [Cambridge, MA; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN
2012-04-17
Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
Opus: A Coordination Language for Multidisciplinary Applications
NASA Technical Reports Server (NTRS)
Chapman, Barbara; Haines, Matthew; Mehrotra, Piyush; Zima, Hans; vanRosendale, John
1997-01-01
Data parallel languages, such as High Performance Fortran, can be successfully applied to a wide range of numerical applications. However, many advanced scientific and engineering applications are multidisciplinary and heterogeneous in nature, and thus do not fit well into the data parallel paradigm. In this paper we present Opus, a language designed to fill this gap. The central concept of Opus is a mechanism called ShareD Abstractions (SDA). An SDA can be used as a computation server, i.e., a locus of computational activity, or as a data repository for sharing data between asynchronous tasks. SDAs can be internally data parallel, providing support for the integration of data and task parallelism as well as nested task parallelism. They can thus be used to express multidisciplinary applications in a natural and efficient way. In this paper we describe the features of the language through a series of examples and give an overview of the runtime support required to implement these concepts in parallel and distributed environments.
NASA Astrophysics Data System (ADS)
Konduri, Aditya
Many natural and engineering systems are governed by nonlinear partial differential equations (PDEs) which result in multiscale phenomena, e.g. turbulent flows. Numerical simulations of these problems are computationally very expensive and demand extreme levels of parallelism. At realistic conditions, simulations are being carried out on massively parallel computers with hundreds of thousands of processing elements (PEs). It has been observed that communication between PEs as well as their synchronization at these extreme scales take up a significant portion of the total simulation time and result in poor scalability of codes. This issue is likely to pose a bottleneck in scalability of codes on future Exascale systems. In this work, we propose an asynchronous computing algorithm based on widely used finite difference methods to solve PDEs in which synchronization between PEs due to communication is relaxed at a mathematical level. We show that while stability is conserved when schemes are used asynchronously, accuracy is greatly degraded. Since message arrivals at PEs are random processes, so is the behavior of the error. We propose a new statistical framework in which we show that average errors always drop to first order regardless of the original scheme. We propose new asynchrony-tolerant schemes that maintain accuracy when synchronization is relaxed. The quality of the solution is shown to depend not only on the physical phenomena and numerical schemes, but also on the characteristics of the computing machine. A novel algorithm using remote memory access communications has been developed to demonstrate excellent scalability of the method for large-scale computing. Finally, we present a path to extend this method in solving complex multi-scale problems on Exascale machines.
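The idea of relaxing synchronization at PE boundaries can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the author's algorithm or the asynchrony-tolerant schemes: it uses an explicit central-difference update for the 1-D heat equation, splits the grid between two simulated PEs, and lets each PE read its neighbor's boundary value from a randomly stale time level. The function name `diffuse` and the parameters `r` (diffusion number), `max_delay`, and `seed` are hypothetical.

```python
import math
import random


def diffuse(n_points=32, n_steps=200, r=0.25, max_delay=0, seed=0):
    """Explicit 1-D diffusion on two simulated PEs with stale halo values.

    r is the diffusion number alpha*dt/dx**2 (kept <= 0.5 here).
    max_delay=0 reproduces the ordinary synchronous scheme.
    """
    rng = random.Random(seed)
    split = n_points // 2                       # PE 0 owns [0, split), PE 1 the rest
    u0 = [math.sin(math.pi * i / (n_points - 1)) for i in range(n_points)]
    history = [u0]                              # recent time levels, newest last
    for _ in range(n_steps):
        cur = history[-1]
        new = cur[:]                            # end points are held fixed
        for i in range(1, n_points - 1):
            left, right = cur[i - 1], cur[i + 1]
            if i == split - 1:                  # right neighbor lives on the other PE
                d = rng.randint(0, min(max_delay, len(history) - 1))
                right = history[-1 - d][i + 1]  # possibly stale halo value
            elif i == split:                    # left neighbor lives on the other PE
                d = rng.randint(0, min(max_delay, len(history) - 1))
                left = history[-1 - d][i - 1]
            new[i] = cur[i] + r * (left - 2.0 * cur[i] + right)
        history.append(new)
        history = history[-(max_delay + 1):]    # keep only levels that can be read
    return history[-1]


if __name__ == "__main__":
    sync = diffuse(max_delay=0)
    lagged = diffuse(max_delay=3)
    print(max(abs(a - b) for a, b in zip(sync, lagged)))
```

With `r <= 0.5` every update is a weighted average of current and past values, so the lagged run stays bounded, while its result differs from the synchronous one. That matches, in this toy setting only, the abstract's statement that stability is retained but accuracy is affected.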
On a model of three-dimensional bursting and its parallel implementation
NASA Astrophysics Data System (ADS)
Tabik, S.; Romero, L. F.; Garzón, E. M.; Ramos, J. I.
2008-04-01
A mathematical model for the simulation of three-dimensional bursting phenomena and its parallel implementation are presented. The model consists of four nonlinearly coupled partial differential equations that include fast and slow variables, and exhibits bursting in the absence of diffusion. The differential equations have been discretized by means of a linearly-implicit finite difference method, second-order accurate in both space and time, on equally-spaced grids. The resulting system of linear algebraic equations at each time level has been solved by means of the Preconditioned Conjugate Gradient (PCG) method. Three different parallel implementations of the proposed mathematical model have been developed; two of these implementations, i.e., the MPI and the PETSc codes, are based on a message passing paradigm, while the third one, i.e., the OpenMP code, is based on a shared address space paradigm. These three implementations are evaluated on two current high performance parallel architectures, i.e., a dual-processor cluster and a Shared Distributed Memory (SDM) system. A novel representation of the results that emphasizes the most relevant factors that affect the performance of the parallel implementations is proposed. The comparative analysis of the computational results shows that the MPI and the OpenMP implementations are about twice as efficient as the PETSc code on the SDM system. It is also shown that, for the conditions reported here, the nonlinear dynamics of the three-dimensional bursting phenomena exhibits three stages characterized by asynchronous, synchronous and then asynchronous oscillations, before a quiescent state is reached. It is also shown that the fast system reaches steady state in much less time than the slow variables.
Highly Asynchronous VisitOr Queue Graph Toolkit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pearce, R.
2012-10-01
HAVOQGT is a C++ framework that can be used to create highly parallel graph traversal algorithms. The framework stores the graph and algorithmic data structures on external memory that is typically mapped to high performance locally attached NAND FLASH arrays. The framework supports a vertex-centered visitor programming model. The framework has been used to implement breadth first search, connected components, and single source shortest path.
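As a rough illustration of what a vertex-centered visitor model looks like for breadth first search, the following is a minimal single-process sketch in Python. It is not the HAVOQGT C++ API; the function name `bfs_visitor_queue` and the adjacency-dictionary input are assumptions made for the example.

```python
import heapq

def bfs_visitor_queue(adj, source):
    """Breadth first search driven by a queue of (level, vertex) visitors.

    adj maps each vertex to a list of neighbours (hypothetical input format).
    """
    level = {v: float("inf") for v in adj}
    queue = [(0, source)]            # visitors ordered by the level they propose
    while queue:
        lvl, v = heapq.heappop(queue)
        if lvl >= level[v]:          # pre-visit check: drop visitors that do not improve v
            continue
        level[v] = lvl               # the visitor updates the vertex state
        for w in adj[v]:             # and spawns new visitors for the neighbours
            heapq.heappush(queue, (lvl + 1, w))
    return level
```

Because each visitor only touches the state of the vertex it arrives at, visitors for different vertices could in principle be processed by different workers without a global barrier, which is the property an asynchronous traversal relies on.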
Increasing processor utilization during parallel computation rundown
NASA Technical Reports Server (NTRS)
Jones, W. H.
1986-01-01
Some parallel processing environments provide for asynchronous execution and completion of general purpose parallel computations from a single computational phase. When all the computations from such a phase are complete, a new parallel computational phase is begun. Depending upon the granularity of the parallel computations to be performed, there may be a shortage of available work as a particular computational phase draws to a close (computational rundown). This can result in the waste of computing resources and the delay of the overall problem. In many practical instances, strict sequential ordering of phases of parallel computation is not totally required. In such cases, the beginning of one phase can be correctly computed before the end of a previous phase is completed. This allows additional work to be generated somewhat earlier to keep computing resources busy during each computational rundown. The conditions under which this can occur are identified and the frequency of occurrence of such overlapping in an actual parallel Navier-Stokes code is reported. A language construct is suggested and possible control strategies for the management of such computational phase overlapping are discussed.
Parallel adaptive wavelet collocation method for PDEs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nejadmalayeri, Alireza, E-mail: Alireza.Nejadmalayeri@gmail.com; Vezolainen, Alexei, E-mail: Alexei.Vezolainen@Colorado.edu; Brown-Dymkoski, Eric, E-mail: Eric.Browndymkoski@Colorado.edu
2015-10-01
A parallel adaptive wavelet collocation method for solving a large class of Partial Differential Equations is presented. The parallelization is achieved by developing an asynchronous parallel wavelet transform, which allows one to perform parallel wavelet transform and derivative calculations with only one data synchronization at the highest level of resolution. The data are stored using a tree-like structure with tree roots starting at an a priori defined level of resolution. Both static and dynamic domain partitioning approaches are developed. For the dynamic domain partitioning, trees are considered to be the minimum quanta of data to be migrated between the processes. This allows fully automated and efficient handling of non-simply connected partitioning of a computational domain. Dynamic load balancing is achieved via domain repartitioning during the grid adaptation step and reassigning trees to the appropriate processes to ensure approximately the same number of grid points on each process. The parallel efficiency of the approach is discussed based on parallel adaptive wavelet-based Coherent Vortex Simulations of homogeneous turbulence with linear forcing at effective non-adaptive resolutions up to 2048^3 using as many as 2048 CPU cores.
Ultrascalable petaflop parallel supercomputer
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY
2010-07-20
A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
Exploiting parallel computing with limited program changes using a network of microcomputers
NASA Technical Reports Server (NTRS)
Rogers, J. L., Jr.; Sobieszczanski-Sobieski, J.
1985-01-01
Network computing and multiprocessor computers are two discernible trends in parallel processing. The computational behavior of an iterative distributed process in which some subtasks are completed later than others because of an imbalance in computational requirements is of significant interest. The effects of asynchronous processing were studied. A small existing program was converted to perform finite element analysis by distributing substructure analysis over a network of four Apple IIe microcomputers connected to a shared disk, simulating a parallel computer. The substructure analysis uses an iterative, fully stressed, structural resizing procedure. A framework of beams divided into three substructures is used as the finite element model. The effects of asynchronous processing on the convergence of the design variables are determined by not resizing particular substructures on various iterations.
Communication library for run-time visualization of distributed, asynchronous data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rowlan, J.; Wightman, B.T.
1994-04-01
In this paper we present a method for collecting and visualizing data generated by a parallel computational simulation during run time. Data distributed across multiple processes is sent across parallel communication lines to a remote workstation, which sorts and queues the data for visualization. We have implemented our method in a set of tools called PORTAL (for Parallel aRchitecture data-TrAnsfer Library). The tools comprise generic routines for sending data from a parallel program (callable from either C or FORTRAN), a semi-parallel communication scheme currently built upon Unix Sockets, and a real-time connection to the scientific visualization program AVS. Our method is most valuable when used to examine large datasets that can be efficiently generated and do not need to be stored on disk. The PORTAL source libraries, detailed documentation, and a working example can be obtained by anonymous ftp from info.mcs.anl.gov from the file portal.tar.Z from the directory pub/portal.
Fiber optic cable-based high-resolution, long-distance VGA extenders
NASA Astrophysics Data System (ADS)
Rhee, Jin-Geun; Lee, Iksoo; Kim, Heejoon; Kim, Sungjoon; Koh, Yeon-Wan; Kim, Hoik; Lim, Jiseok; Kim, Chur; Kim, Jungwon
2013-02-01
Remote transfer of high-resolution video information is finding more uses in detached display applications for large facilities such as theaters, sports complexes, airports, and security facilities. Active optical cables (AOCs) provide a promising approach for extending both the transmittable resolution and distance beyond what standard copper-based cables can reach. In addition to the standard digital formats such as HDMI, the high-resolution, long-distance transfer of VGA format signals is important for applications where high-resolution analog video ports should also be supported, such as military/defense applications and high-resolution video camera links. In this presentation we present the development of compressionless, high-resolution (up to WUXGA, 1920x1200), long-distance (up to 2 km) VGA extenders based on a serialization technique. We employed asynchronous serial transmission and clock regeneration techniques, which enable lower-cost implementation of VGA extenders by removing the necessity for clock transmission and large memory at the receiver. Two 3.125-Gbps transceivers are used in parallel to meet the required maximum video data rate of 6.25 Gbps. As the data are transmitted asynchronously, a 24-bit pixel clock time stamp is employed to regenerate the video pixel clock accurately at the receiver side. In parallel to the video information, stereo audio and RS-232 control signals are transmitted as well.
A Primer for Telemetry Interfacing in Accordance with NASA Standards Using Low Cost FPGAs
NASA Astrophysics Data System (ADS)
McCoy, Jake; Schultz, Ted; Tutt, James; Rogers, Thomas; Miles, Drew; McEntaffer, Randall
2016-03-01
Photon counting detector systems on sounding rocket payloads often require interfacing asynchronous outputs with a synchronously clocked telemetry (TM) stream. Though this can be handled with an on-board computer, there are several low cost alternatives including custom hardware, microcontrollers and field-programmable gate arrays (FPGAs). This paper outlines how a TM interface (TMIF) for detectors on a sounding rocket with asynchronous parallel digital output can be implemented using low cost FPGAs and minimal custom hardware. Low power consumption and high speed FPGAs are available as commercial off-the-shelf (COTS) products and can be used to develop the main component of the TMIF. Then, only a small amount of additional hardware is required for signal buffering and level translating. This paper also discusses how this system can be tested with a simulated TM chain in a small laboratory setting using FPGAs and COTS specialized data acquisition products.
Architecture and method for a burst buffer using flash technology
Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing-bung
2016-03-15
A parallel supercomputing cluster includes compute nodes interconnected in a mesh of data links for executing an MPI job, and solid-state storage nodes each linked to a respective group of the compute nodes for receiving checkpoint data from the respective compute nodes, and magnetic disk storage linked to each of the solid-state storage nodes for asynchronous migration of the checkpoint data from the solid-state storage nodes to the magnetic disk storage. Each solid-state storage node presents a file system interface to the MPI job, and multiple MPI processes of the MPI job write the checkpoint data to a shared file in the solid-state storage in a strided fashion, and the solid-state storage node asynchronously migrates the checkpoint data from the shared file in the solid-state storage to the magnetic disk storage and writes the checkpoint data to the magnetic disk storage in a sequential fashion.
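The two access patterns named in this abstract, strided writes by many processes into one shared file followed by a sequential copy to disk, can be sketched as follows. This is an illustration only: the shared file is modelled as an in-memory `bytearray`, and the function names and the block-interleaved layout are assumptions, not details taken from the patent.

```python
def strided_offset(rank, block, n_ranks, block_size):
    """Byte offset of a rank's block when blocks of all ranks are interleaved."""
    return (block * n_ranks + rank) * block_size

def write_checkpoint(shared, rank, data, n_ranks, block_size):
    """Write one rank's checkpoint data into the shared buffer in a strided fashion."""
    for block, start in enumerate(range(0, len(data), block_size)):
        chunk = data[start:start + block_size]
        off = strided_offset(rank, block, n_ranks, block_size)
        shared[off:off + len(chunk)] = chunk

def migrate_sequential(shared, chunk_size):
    """Copy the shared buffer front to back, as a background migration step might."""
    disk = bytearray()
    for start in range(0, len(shared), chunk_size):
        disk += shared[start:start + chunk_size]
    return bytes(disk)
```

With three ranks and a block size of two bytes, rank 0 writes bytes 0-1 and 6-7, rank 1 writes 2-3 and 8-9, and so on, so no two ranks ever touch the same region; the later migration then reads the file once from start to end.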
Quinoa - Adaptive Computational Fluid Dynamics, 0.2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bakosi, Jozsef; Gonzalez, Francisco; Rogers, Brandon
Quinoa is a set of computational tools that enables research and numerical analysis in fluid dynamics. At this time it remains a test-bed to experiment with various algorithms using fully asynchronous runtime systems. Currently, Quinoa consists of the following tools: (1) Walker, a numerical integrator for systems of stochastic differential equations in time. It is a mathematical tool to analyze and design the behavior of stochastic differential equations. It allows the estimation of arbitrary coupled statistics and probability density functions and is currently used for the design of statistical moment approximations for multiple mixing materials in variable-density turbulence. (2) Inciter, an overdecomposition-aware finite element field solver for partial differential equations using 3D unstructured grids. Inciter is used to research asynchronous mesh-based algorithms and to experiment with coupling asynchronous to bulk-synchronous parallel code. Two planned new features of Inciter, compared to the previous release (LA-CC-16-015), to be implemented in 2017, are (a) a simple Navier-Stokes solver for ideal single-material compressible gases, and (b) solution-adaptive mesh refinement (AMR), which enables dynamically concentrating compute resources to regions with interesting physics. Using the NS-AMR problem we plan to explore how to scale such high-load-imbalance simulations, representative of large production multiphysics codes, to very large problems on very large computers using an asynchronous runtime system. (3) RNGTest, a test harness to subject random number generators to stringent statistical tests enabling quantitative ranking with respect to their quality and computational cost. (4) UnitTest, a unit test harness, running hundreds of tests per second, capable of testing serial, synchronous, and asynchronous functions. (5) MeshConv, a mesh file converter that can be used to convert 3D tetrahedron meshes from and to any of the following formats: Gmsh (http://www.geuz.org/gmsh), Netgen (http://sourceforge.net/apps/mediawiki/netgen-mesher), ExodusII (http://sourceforge.net/projects/exodusii), HyperMesh (http://www.altairhyperworks.com/product/HyperMesh).
Ibraheem; Hasan, Naimul; Hussein, Arkan Ahmed
2014-01-01
This paper presents the design of a decentralized automatic generation controller for an interconnected power system using PID, Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The designed controllers are tested on identical two-area interconnected power systems consisting of thermal power plants. The interconnections between the two areas are considered as (i) AC tie-line only and (ii) asynchronous tie-line. The dynamic response analysis is carried out for a 1% load perturbation. The performance of the intelligent controllers based on GA and PSO has been compared with the conventional PID controller. The investigations of the system dynamic responses reveal that the PSO controller gives a better dynamic response than the PID and GA controllers for both types of area interconnection.
COMP Superscalar, an interoperable programming framework
NASA Astrophysics Data System (ADS)
Badia, Rosa M.; Conejero, Javier; Diaz, Carlos; Ejarque, Jorge; Lezzi, Daniele; Lordan, Francesc; Ramon-Cortes, Cristian; Sirvent, Raul
2015-12-01
COMPSs is a programming framework that aims to facilitate the parallelization of existing applications written in Java, C/C++ and Python scripts. For that purpose, it offers a simple programming model based on sequential development in which the user is mainly responsible for (i) identifying the functions to be executed as asynchronous parallel tasks and (ii) annotating them with annotations or standard Python decorators. A runtime system is in charge of exploiting the inherent concurrency of the code, automatically detecting and enforcing the data dependencies between tasks and spawning these tasks to the available resources, which can be nodes in a cluster, clouds or grids. In cloud environments, COMPSs provides scalability and elasticity features allowing the dynamic provision of resources.
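The programming model described here, sequential-looking code in which decorated functions become asynchronous tasks and data dependencies are resolved by a runtime, can be mimicked in a few lines of standard-library Python. The sketch below is a stand-in built on `concurrent.futures`; the names `task` and `wait_on` are defined locally for illustration and are not the COMPSs API.

```python
from concurrent.futures import Future, ThreadPoolExecutor
import functools

_pool = ThreadPoolExecutor(max_workers=4)   # stand-in for the runtime's resources

def task(fn):
    """Turn a plain function into one that returns a Future immediately."""
    @functools.wraps(fn)
    def submit(*args):
        def run():
            # Resolve data dependencies: wait for any argument that is itself a task result.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return fn(*resolved)
        return _pool.submit(run)
    return submit

def wait_on(future):
    """Synchronise with the main program and fetch the final value."""
    return future.result()

@task
def square(x):
    return x * x

@task
def add(a, b):
    return a + b

# Reads like sequential code, but the two square() calls may run concurrently
# and add() starts only once both of its inputs are available.
total = add(square(3), square(4))
```

Calling `wait_on(total)` blocks until the dependent tasks have finished. A production runtime would additionally track dependencies through files and objects and place tasks on remote nodes, which this toy version does not attempt.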
GraphReduce: Processing Large-Scale Graphs on Accelerator-Based Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sengupta, Dipanjan; Song, Shuaiwen; Agarwal, Kapil
2015-11-15
Recent work on real-world graph analytics has sought to leverage the massive amount of parallelism offered by GPU devices, but challenges remain due to the inherent irregularity of graph algorithms and limitations in GPU-resident memory for storing large graphs. We present GraphReduce, a highly efficient and scalable GPU-based framework that operates on graphs that exceed the device’s internal memory capacity. GraphReduce adopts a combination of edge- and vertex-centric implementations of the Gather-Apply-Scatter programming model and operates on multiple asynchronous GPU streams to fully exploit the high degrees of parallelism in GPUs with efficient graph data movement between the host and device.
Collective network for computer structures
Blumrich, Matthias A; Coteus, Paul W; Chen, Dong; Gara, Alan; Giampapa, Mark E; Heidelberger, Philip; Hoenicke, Dirk; Takken, Todd E; Steinmacher-Burow, Burkhard D; Vranas, Pavlos M
2014-01-07
A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to the needs of a processing algorithm.
Collective network for computer structures
Blumrich, Matthias A [Ridgefield, CT; Coteus, Paul W [Yorktown Heights, NY; Chen, Dong [Croton On Hudson, NY; Gara, Alan [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Takken, Todd E [Brewster, NY; Steinmacher-Burow, Burkhard D [Wernau, DE; Vranas, Pavlos M [Bedford Hills, NY
2011-08-16
A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.
Asynchronous Replica Exchange Software for Grid and Heterogeneous Computing.
Gallicchio, Emilio; Xia, Junchao; Flynn, William F; Zhang, Baofeng; Samlalsingh, Sade; Mentes, Ahmet; Levy, Ronald M
2015-11-01
Parallel replica exchange sampling is an extended ensemble technique often used to accelerate the exploration of the conformational ensemble of atomistic molecular simulations of chemical systems. Inter-process communication and coordination requirements have historically discouraged the deployment of replica exchange on distributed and heterogeneous resources. Here we describe the architecture of a software package (named ASyncRE) for performing asynchronous replica exchange molecular simulations on volunteered computing grids and heterogeneous high performance clusters. The asynchronous replica exchange algorithm on which the software is based avoids centralized synchronization steps and the need for direct communication between remote processes. It allows molecular dynamics threads to progress at different rates and enables parameter exchanges among arbitrary sets of replicas independently from other replicas. ASyncRE is written in Python following a modular design conducive to extensions to various replica exchange schemes and molecular dynamics engines. Applications of the software for the modeling of association equilibria of supramolecular and macromolecular complexes on BOINC campus computational grids and on the CPU/MIC heterogeneous hardware of the XSEDE Stampede supercomputer are illustrated. They show the ability of ASyncRE to utilize large grids of desktop computers running the Windows, MacOS, and/or Linux operating systems as well as collections of high performance heterogeneous hardware devices.
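A minimal sketch of the idea of exchanging parameters only among replicas that happen to be idle is given below. It assumes the common temperature replica exchange case, where a swap between replicas i and j is accepted with probability min(1, exp[(beta_i - beta_j)(E_i - E_j)]). The function names and the dictionary layout are hypothetical and are not taken from ASyncRE.

```python
import math
import random

def try_exchange(beta_i, energy_i, beta_j, energy_j, rng):
    """Metropolis test for swapping inverse temperatures between two replicas."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return delta >= 0 or rng.random() < math.exp(delta)

def exchange_waiting(replicas, waiting, rng):
    """Attempt swaps among replicas that are currently waiting.

    replicas maps an id to {'beta': ..., 'energy': ...}; replicas that are still
    running are simply absent from `waiting` and are left untouched.
    """
    ids = list(waiting)
    rng.shuffle(ids)                      # pair up waiting replicas at random
    swaps = []
    for a, b in zip(ids[0::2], ids[1::2]):
        ra, rb = replicas[a], replicas[b]
        if try_exchange(ra["beta"], ra["energy"], rb["beta"], rb["energy"], rng):
            ra["beta"], rb["beta"] = rb["beta"], ra["beta"]
            swaps.append((a, b))
    return swaps
```

Since the exchange step never waits for running replicas, slow or temporarily unavailable machines do not hold up the rest of the ensemble, which is the property the abstract emphasizes for grids and heterogeneous clusters.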
Throughput analysis of the IEEE 802.4 token bus standard under heavy load
NASA Technical Reports Server (NTRS)
Pang, Joseph; Tobagi, Fouad
1987-01-01
It has become clear in the last few years that there is a trend towards integrated digital services. Parallel to the development of public Integrated Services Digital Network (ISDN) is service integration in the local area (e.g., a campus, a building, an aircraft). The types of services to be integrated depend very much on the specific local environment. However, applications tend to generate data traffic belonging to one of two classes. According to IEEE 802.4 terminology, the first major class of traffic is termed synchronous, such as packetized voice and data generated from other applications with real-time constraints, and the second class is called asynchronous which includes most computer data traffic such as file transfer or facsimile. The IEEE 802.4 token bus protocol which was designed to support both synchronous and asynchronous traffic is examined. The protocol is basically a timer-controlled token bus access scheme. By a suitable choice of the design parameters, it can be shown that access delay is bounded for synchronous traffic. As well, the bandwidth allocated to asynchronous traffic can be controlled. A throughput analysis of the protocol under heavy load with constant channel occupation of synchronous traffic and constant token-passing times is presented.
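The timer-controlled access idea can be sketched with a deliberately simplified model: on each token arrival a station may send synchronous traffic for up to a fixed hold time, and asynchronous traffic only for whatever remains of a target rotation time since the token was last seen. The function and parameter names below are assumptions for illustration, and the model omits many details of the actual standard.

```python
def service_token(now, last_arrival, hold_time, target_rotation,
                  sync_demand, async_demand):
    """Return (synchronous, asynchronous) transmission time granted on this token visit.

    Simplified model: times are in arbitrary units and demands are the amounts
    of transmission time each traffic class would like to use.
    """
    # Synchronous traffic is always served, but never beyond the hold time,
    # which is what bounds its access delay.
    sync_sent = min(sync_demand, hold_time)
    # Asynchronous traffic is served only if the token came back early enough.
    rotation = now - last_arrival
    budget = max(0.0, target_rotation - rotation)
    async_sent = min(async_demand, budget)
    return sync_sent, async_sent
```

Under heavy load the measured rotation time approaches the target, so the asynchronous budget shrinks toward zero while the synchronous allowance is unaffected; adjusting `target_rotation` is the knob that controls the bandwidth left to asynchronous traffic in this model.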
Fusion of Asynchronous, Parallel, Unreliable Data Streams
2010-09-01
channels that might be used. The two channels chosen for this study, galvanic skin response (GSR) and pulse rate, are convenient and reasonably well...vector as NA. The MDS software tool, PERMAP, uses this same abbreviation. The impact of the lack of information may vary depending on the situation...of how PERMAP (and MDS in general) functions when the input parameters are varied. That is outlined in this section; the impact of those choices is
Performance Evaluation in Network-Based Parallel Computing
NASA Technical Reports Server (NTRS)
Dezhgosha, Kamyar
1996-01-01
Network-based parallel computing is emerging as a cost-effective alternative for solving many problems which require use of supercomputers or massively parallel computers. The primary objective of this project has been to conduct experimental research on performance evaluation for clustered parallel computing. First, a testbed was established by augmenting our existing SUNSPARCs' network with PVM (Parallel Virtual Machine) which is a software system for linking clusters of machines. Second, a set of three basic applications was selected. The applications consist of a parallel search, a parallel sort, and a parallel matrix multiplication. These application programs were implemented in the C programming language under PVM. Third, we conducted performance evaluation under various configurations and problem sizes. Alternative parallel computing models and workload allocations for application programs were explored. The performance metric was limited to elapsed time or response time which in the context of parallel computing can be expressed in terms of speedup. The results reveal that the overhead of communication latency between processes in many cases is the restricting factor to performance. That is, coarse-grain parallelism which requires less frequent communication between processes will result in higher performance in network-based computing. Finally, we are in the final stages of installing an Asynchronous Transfer Mode (ATM) switch and four ATM interfaces (each 155 Mbps) which will allow us to extend our study to newer applications, performance metrics, and configurations.
Solving Partial Differential Equations in a data-driven multiprocessor environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gaudiot, J.L.; Lin, C.M.; Hosseiniyar, M.
1988-12-31
Partial differential equations can be found in a host of engineering and scientific problems. The emergence of new parallel architectures has spurred research in the definition of parallel PDE solvers. Concurrently, highly programmable systems such as data-flow architectures have been proposed for the exploitation of large scale parallelism. The implementation of some Partial Differential Equation solvers (such as the Jacobi method) on a tagged token data-flow graph is demonstrated here. Asynchronous methods (chaotic relaxation) are studied and new scheduling approaches (the Token No-Labeling scheme) are introduced in order to support the implementation of the asynchronous methods in a data-driven environment. New high-level data-flow language program constructs are introduced in order to handle chaotic operations. Finally, the performance of the program graphs is demonstrated by a deterministic simulation of a message passing data-flow multiprocessor. An analysis of the overhead in the data-flow graphs is undertaken to demonstrate the limits of parallel operations in dataflow PDE program graphs.
Toward Millions of File System IOPS on Low-Cost, Commodity Hardware
Zheng, Da; Burns, Randal; Szalay, Alexander S.
2013-01-01
We describe a storage system that removes I/O bottlenecks to achieve more than one million IOPS based on a user-space file abstraction for arrays of commodity SSDs. The file abstraction refactors I/O scheduling and placement for extreme parallelism and non-uniform memory and I/O. The system includes a set-associative, parallel page cache in the user space. We redesign page caching to eliminate CPU overhead and lock-contention in non-uniform memory architecture machines. We evaluate our design on a 32 core NUMA machine with four, eight-core processors. Experiments show that our design delivers 1.23 million 512-byte read IOPS. The page cache realizes the scalable IOPS of Linux asynchronous I/O (AIO) and increases user-perceived I/O performance linearly with cache hit rates. The parallel, set-associative cache matches the cache hit rates of the global Linux page cache under real workloads. PMID:24402052
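To make the notion of a set-associative page cache concrete, here is a small single-threaded sketch. It is not the authors' implementation: the class name, the LRU eviction policy within each set, and the `load` callback are assumptions chosen for the example.

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Page cache split into independent sets, each holding at most `ways` pages."""

    def __init__(self, n_sets, ways):
        self.sets = [OrderedDict() for _ in range(n_sets)]
        self.ways = ways
        self.hits = 0
        self.misses = 0

    def get(self, page_no, load):
        # A page can live in exactly one set, chosen by hashing its number.
        # In a parallel version each set could carry its own lock (assumption),
        # so threads working on different sets would not contend.
        s = self.sets[hash(page_no) % len(self.sets)]
        if page_no in s:
            s.move_to_end(page_no)        # mark as most recently used
            self.hits += 1
            return s[page_no]
        self.misses += 1
        if len(s) >= self.ways:
            s.popitem(last=False)         # evict the least recently used page of this set
        s[page_no] = load(page_no)
        return s[page_no]
```

Because lookup and eviction only ever involve one small set, the bookkeeping cost per access stays constant as the cache grows, at the price of a hit rate that may differ from a single global cache.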
Cascaded VLSI neural network architecture for on-line learning
NASA Technical Reports Server (NTRS)
Thakoor, Anilkumar P. (Inventor); Duong, Tuan A. (Inventor); Daud, Taher (Inventor)
1992-01-01
High-speed, analog, fully-parallel, and asynchronous building blocks are cascaded for larger sizes and enhanced resolution. A hardware compatible algorithm permits hardware-in-the-loop learning despite limited weight resolution. A computation intensive feature classification application was demonstrated with this flexible hardware and new algorithm at high speed. This result indicates that these building block chips can be embedded as an application specific coprocessor for solving real world problems at extremely high data rates.
Data General Corporation Advanced Operating System/Virtual Storage (AOS/ VS). Revision 7.60
1989-02-22
control list for each directory and data file. An access control list includes the users who can and cannot access files as well as the access...and any required data, it can operate asynchronously and in parallel...memory. The IOC can perform the data transfer without further intervention from the CPU. The I/O channels interface with the processor or system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chow, Edmond
Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solve these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.
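One way to read the bilinear-constraint formulation is the following sketch for an incomplete LU factorization with unit-diagonal L: for every position (i, j) in a chosen sparsity pattern, the constraint sum over k of l_ik * u_kj = a_ij is rearranged into an update for a single unknown. The code below sweeps over these updates using only values from the previous sweep; this synchronous sweep is a stand-in for the asynchronous method, and the function name and dense list-of-lists storage are assumptions.

```python
def fixed_point_ilu(a, pattern, sweeps):
    """Iterate on the constraints sum_k l[i][k]*u[k][j] = a[i][j] for (i, j) in pattern.

    a is a dense list of lists, pattern a set of (i, j) index pairs that includes
    the diagonal. Returns (low, up) with unit diagonal in low.
    """
    n = len(a)
    # Initial guess: strictly lower part of A for L, upper part of A for U.
    low = [[1.0 if i == j else (a[i][j] if i > j and (i, j) in pattern else 0.0)
            for j in range(n)] for i in range(n)]
    up = [[a[i][j] if i <= j and (i, j) in pattern else 0.0
           for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        new_low = [row[:] for row in low]
        new_up = [row[:] for row in up]
        for (i, j) in pattern:
            # Every update reads only old values, so all of them are independent.
            s = sum(low[i][k] * up[k][j] for k in range(min(i, j)))
            if i > j:
                new_low[i][j] = (a[i][j] - s) / up[j][j]
            else:
                new_up[i][j] = a[i][j] - s
        low, up = new_low, new_up
    return low, up
```

Each unknown is updated from its own equation without reference to any ordering, which is what would allow the updates to be spread over many cores and applied as soon as they are computed rather than in lock-step sweeps.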
Cascaded VLSI neural network architecture for on-line learning
NASA Technical Reports Server (NTRS)
Duong, Tuan A. (Inventor); Daud, Taher (Inventor); Thakoor, Anilkumar P. (Inventor)
1995-01-01
High-speed, analog, fully-parallel and asynchronous building blocks are cascaded for larger sizes and enhanced resolution. A hardware-compatible algorithm permits hardware-in-the-loop learning despite limited weight resolution. A comparison-intensive feature classification application has been demonstrated with this flexible hardware and new algorithm at high speed. This result indicates that these building block chips can be embedded as application-specific-coprocessors for solving real-world problems at extremely high data rates.
GraphReduce: Large-Scale Graph Analytics on Accelerator-Based HPC Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sengupta, Dipanjan; Agarwal, Kapil; Song, Shuaiwen
2015-09-30
Recent work on real-world graph analytics has sought to leverage the massive amount of parallelism offered by GPU devices, but challenges remain due to the inherent irregularity of graph algorithms and limitations in GPU-resident memory for storing large graphs. We present GraphReduce, a highly efficient and scalable GPU-based framework that operates on graphs that exceed the device’s internal memory capacity. GraphReduce adopts a combination of both edge- and vertex-centric implementations of the Gather-Apply-Scatter programming model and operates on multiple asynchronous GPU streams to fully exploit the high degrees of parallelism in GPUs with efficient graph data movement between the host and the device.
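For readers unfamiliar with the Gather-Apply-Scatter model named above, the following is a minimal CPU-only sketch, assuming a vertex-centric formulation of connected components by minimum-label propagation. It is not GraphReduce's API; the function name and data layout are hypothetical, and GPU streams and host-device data movement are not modeled.

```python
def gas_connected_components(num_vertices, edges):
    """Label each vertex with the smallest vertex id in its component."""
    labels = list(range(num_vertices))
    neighbors = [[] for _ in range(num_vertices)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    active = set(range(num_vertices))
    while active:
        # Gather: every active vertex reads the minimum label of its neighbours.
        gathered = {
            v: min((labels[n] for n in neighbors[v]), default=labels[v])
            for v in active
        }
        next_active = set()
        for v, smallest in gathered.items():
            # Apply: adopt the gathered label if it improves on the current one.
            if smallest < labels[v]:
                labels[v] = smallest
                # Scatter: wake the neighbours so they see the new label.
                next_active.update(neighbors[v])
        active = next_active
    return labels


if __name__ == "__main__":
    print(gas_connected_components(6, [(0, 1), (1, 2), (3, 4)]))
```

Only vertices whose neighbourhood changed stay active, so the amount of work shrinks as the labels settle.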
Simple and Flexible Self-Reproducing Structures in Asynchronous Cellular Automata and Their Dynamics
NASA Astrophysics Data System (ADS)
Huang, Xin; Lee, Jia; Yang, Rui-Long; Zhu, Qing-Sheng
2013-03-01
Self-reproduction on asynchronous cellular automata (ACAs) has attracted wide attention due to the evident artifacts induced by synchronous updating. Asynchronous updating, which allows cells to undergo transitions independently at random times, might be more compatible with the natural processes occurring at micro-scale, but the dark side of the coin is the increment in the complexity of an ACA in order to accomplish stable self-reproduction. This paper proposes a novel model of self-timed cellular automata (STCAs), a special type of ACAs, where unsheathed loops are able to duplicate themselves reliably in parallel. The removal of sheath can not only allow various loops with more flexible and compact structures to replicate themselves, but also reduce the number of cell states of the STCA as compared to the previous model adopting sheathed loops [Y. Takada, T. Isokawa, F. Peper and N. Matsui, Physica D227, 26 (2007)]. The lack of sheath, on the other hand, often tends to cause much more complicated interactions among loops, when all of them struggle independently to stretch out their constructing arms at the same time. In particular, such intense collisions may even cause the emergence of a mess of twisted constructing arms in the cellular space. By using a simple and natural method, our self-reproducing loops (SRLs) are able to retract their arms successively, thereby disentangling from the mess successfully.
High performance interconnection between high data rate networks
NASA Technical Reports Server (NTRS)
Foudriat, E. C.; Maly, K.; Overstreet, C. M.; Zhang, L.; Sun, W.
1992-01-01
The bridge/gateway system needed to interconnect a wide range of computer networks to support a wide range of user quality-of-service requirements is discussed. The bridge/gateway must handle a wide range of message types including synchronous and asynchronous traffic, large, bursty messages, short, self-contained messages, time critical messages, etc. It is shown that messages can be classified into three basic classes: synchronous messages, large asynchronous messages, and small asynchronous messages. The first two require call setup so that packet identification, buffer handling, etc. can be supported in the bridge/gateway. Identification enables resequences in packet size. The third class is for messages which do not require call setup. Resequencing hardware to handle two types of resequencing problems is presented. The first is for a virtual parallel circuit which can scramble channel bytes. The second system is effective in handling both synchronous and asynchronous traffic between networks with highly differing packet sizes and data rates. The two other major needs for the bridge/gateway are congestion and error control. A dynamic, lossless congestion control scheme which can easily support effective error correction is presented. Results indicate that the congestion control scheme provides close to optimal capacity under congested conditions. Under conditions where error may develop due to intervening networks which are not lossless, intermediate error recovery and correction takes 1/3 less time than equivalent end-to-end error correction under similar conditions.
Asynchronous parallel status comparator
Arnold, Jeffrey W.; Hart, Mark M.
1992-01-01
Apparatus for matching asynchronously received signals and determining whether two or more out of a total number of possible signals match. The apparatus comprises, in one embodiment, an array of sensors positioned in discrete locations and in communication with one or more processors. The processors will receive signals if the sensors detect a change in the variable sensed from a nominal to a special condition and will transmit location information in the form of a digital data set to two or more receivers. The receivers collect, read, latch and acknowledge the data sets and forward them to decoders that produce an output signal for each data set received. The receivers also periodically reset the system following each scan of the sensor array. A comparator then determines if any two or more, as specified by the user, of the output signals corresponds to the same location. A sufficient number of matches produces a system output signal that activates a system to restore the array to its nominal condition.
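The comparator's final decision step (do at least a user-specified number of decoded outputs point to the same location?) can be pictured in software as follows. This is only an analogy to the hardware described; the function name and the use of integers for locations are assumptions.

```python
from collections import Counter


def matching_locations(decoded_locations, required_matches=2):
    """Return the locations reported by at least `required_matches` receivers."""
    counts = Counter(decoded_locations)
    return sorted(loc for loc, n in counts.items() if n >= required_matches)


if __name__ == "__main__":
    # Two receivers report location 7, so it counts as a match.
    print(matching_locations([7, 3, 7, 9]))
```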
Asynchronous parallel status comparator
Arnold, J.W.; Hart, M.M.
1992-12-15
Disclosed is an apparatus for matching asynchronously received signals and determining whether two or more out of a total number of possible signals match. The apparatus comprises, in one embodiment, an array of sensors positioned in discrete locations and in communication with one or more processors. The processors will receive signals if the sensors detect a change in the variable sensed from a nominal to a special condition and will transmit location information in the form of a digital data set to two or more receivers. The receivers collect, read, latch and acknowledge the data sets and forward them to decoders that produce an output signal for each data set received. The receivers also periodically reset the system following each scan of the sensor array. A comparator then determines if any two or more, as specified by the user, of the output signals corresponds to the same location. A sufficient number of matches produces a system output signal that activates a system to restore the array to its nominal condition. 4 figs.
Dong, Suwei; Cahill, Katharine J; Kang, Moon-Il; Colburn, Nancy H; Henrich, Curtis J; Wilson, Jennifer A; Beutler, John A; Johnson, Richard P; Porco, John A
2011-11-04
We have accomplished a parallel screen of cycloaddition partners for o-quinols utilizing a plate-based microwave system. Microwave irradiation improves the efficiency of retro-Diels-Alder/Diels-Alder cascades of o-quinol dimers which generally proceed in a diastereoselective fashion. Computational studies indicate that asynchronous transition states are favored in Diels-Alder cycloadditions of o-quinols. Subsequent biological evaluation of a collection of cycloadducts has identified an inhibitor of activator protein-1 (AP-1), an oncogenic transcription factor.
Workshop on Solid State Switches for Pulsed Power, held January 12-14, 1983 at Tamarron, Colorado
1983-05-31
of its anticipated scalability. However, the projected performance of other types of discrete switches made their continued exploration and...linking of "asynchronous AC power grids. Some present installations and projected increases are shown in Table 2. A new commercial power application...Average Power 62.5 KW 160 KW Device RBDT (RSR) T60R SCR 2N3873 Arra , 6 Series 10 Parallel-20 Series Table 18. Applications of solid state pulse
Adaptive multi-resolution 3D Hartree-Fock-Bogoliubov solver for nuclear structure
NASA Astrophysics Data System (ADS)
Pei, J. C.; Fann, G. I.; Harrison, R. J.; Nazarewicz, W.; Shi, Yue; Thornton, S.
2014-08-01
Background: Complex many-body systems, such as triaxial and reflection-asymmetric nuclei, weakly bound halo states, cluster configurations, nuclear fragments produced in heavy-ion fusion reactions, cold Fermi gases, and pasta phases in neutron star crust, are all characterized by large sizes and complex topologies in which many geometrical symmetries characteristic of ground-state configurations are broken. A tool of choice to study such complex forms of matter is an adaptive multi-resolution wavelet analysis. This method has generated much excitement since it provides a common framework linking many diversified methodologies across different fields, including signal processing, data compression, harmonic analysis and operator theory, fractals, and quantum field theory. Purpose: To describe complex superfluid many-fermion systems, we introduce an adaptive pseudospectral method for solving self-consistent equations of nuclear density functional theory in three dimensions, without symmetry restrictions. Methods: The numerical method is based on the multi-resolution and computational harmonic analysis techniques with a multi-wavelet basis. The application of state-of-the-art parallel programming techniques include sophisticated object-oriented templates which parse the high-level code into distributed parallel tasks with a multi-thread task queue scheduler for each multi-core node. The internode communications are asynchronous. The algorithm is variational and is capable of solving coupled complex-geometric systems of equations adaptively, with functional and boundary constraints, in a finite spatial domain of very large size, limited by existing parallel computer memory. For smooth functions, user-defined finite precision is guaranteed. 
Results: The new adaptive multi-resolution Hartree-Fock-Bogoliubov (HFB) solver madness-hfb is benchmarked against a two-dimensional coordinate-space solver hfb-ax that is based on the B-spline technique and a three-dimensional solver hfodd that is based on the harmonic-oscillator basis expansion. Several examples are considered, including the self-consistent HFB problem for spin-polarized trapped cold fermions and the Skyrme-Hartree-Fock (+BCS) problem for triaxial deformed nuclei. Conclusions: The new madness-hfb framework has many attractive features when applied to nuclear and atomic problems involving many-particle superfluid systems. Of particular interest are weakly bound nuclear configurations close to particle drip lines, strongly elongated and dinuclear configurations such as those present in fission and heavy-ion fusion, and exotic pasta phases that appear in neutron star crust.
Data parallel sorting for particle simulation
NASA Technical Reports Server (NTRS)
Dagum, Leonardo
1992-01-01
Sorting on a parallel architecture is a communications intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
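The O(N) sequential bound mentioned in this abstract is the one a counting sort on integer cell indices attains. The sketch below shows that sequential baseline only, not the paper's data-parallel merge algorithm; the function name and the assumption that each particle carries one integer cell index are illustrative.

```python
def counting_sort_by_cell(cells, num_cells):
    """Return particle indices ordered by cell index in O(N + num_cells) time."""
    # Histogram: counts[c + 1] holds the number of particles in cell c.
    counts = [0] * (num_cells + 1)
    for c in cells:
        counts[c + 1] += 1
    # Prefix sum: counts[c] becomes the first output slot of cell c.
    for c in range(num_cells):
        counts[c + 1] += counts[c]
    # Placement: stable, so particles keep their relative order within a cell.
    order = [0] * len(cells)
    for particle, c in enumerate(cells):
        order[counts[c]] = particle
        counts[c] += 1
    return order


if __name__ == "__main__":
    print(counting_sort_by_cell([2, 0, 1, 0, 2], num_cells=3))
```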
PENTACLE: Parallelized particle-particle particle-tree code for planet formation
NASA Astrophysics Data System (ADS)
Iwasawa, Masaki; Oshino, Shoichi; Fujii, Michiko S.; Hori, Yasunori
2017-10-01
We have newly developed a parallelized particle-particle particle-tree code for planet formation, PENTACLE, which is a parallelized hybrid N-body integrator executed on a CPU-based (super)computer. PENTACLE uses a fourth-order Hermite algorithm to calculate gravitational interactions between particles within a cut-off radius and a Barnes-Hut tree method for gravity from particles beyond. It also implements an open-source library designed for full automatic parallelization of particle simulations, FDPS (Framework for Developing Particle Simulator), to parallelize a Barnes-Hut tree algorithm for a memory-distributed supercomputer. These allow us to handle 1-10 million particles in a high-resolution N-body simulation on CPU clusters for collisional dynamics, including physical collisions in a planetesimal disc. In this paper, we show the performance and the accuracy of PENTACLE in terms of $\tilde{R}_{\rm cut}$ and a time-step $\Delta t$. It turns out that the accuracy of a hybrid N-body simulation is controlled through $\Delta t / \tilde{R}_{\rm cut}$, and $\Delta t / \tilde{R}_{\rm cut} \sim 0.1$ is necessary to simulate accurately the accretion process of a planet for $\geq 10^{6}$ yr. For all those interested in large-scale particle simulations, PENTACLE, customized for planet formation, will be freely available from https://github.com/PENTACLE-Team/PENTACLE under the MIT licence.
Tempest: Accelerated MS/MS database search software for heterogeneous computing platforms
Adamo, Mark E.; Gerber, Scott A.
2017-01-01
MS/MS database search algorithms derive a set of candidate peptide sequences from in-silico digest of a protein sequence database, and compute theoretical fragmentation patterns to match these candidates against observed MS/MS spectra. The original Tempest publication described these operations mapped to a CPU-GPU model, in which the CPU generates peptide candidates that are asynchronously sent to a discrete GPU to be scored against experimental spectra in parallel (Milloy et al., 2012). The current version of Tempest expands this model, incorporating OpenCL to offer seamless parallelization across multicore CPUs, GPUs, integrated graphics chips, and general-purpose coprocessors. Three protocols describe how to configure and run a Tempest search, including discussion of how to leverage Tempest's unique feature set to produce optimal results. PMID:27603022
Pteros 2.0: Evolution of the fast parallel molecular analysis library for C++ and python.
Yesylevskyy, Semen O
2015-07-15
Pteros is the high-performance open-source library for molecular modeling and analysis of molecular dynamics trajectories. Starting from version 2.0 Pteros is available for C++ and Python programming languages with very similar interfaces. This makes it suitable for writing complex reusable programs in C++ and simple interactive scripts in Python alike. New version improves the facilities for asynchronous trajectory reading and parallel execution of analysis tasks by introducing analysis plugins which could be written in either C++ or Python in completely uniform way. The high level of abstraction provided by analysis plugins greatly simplifies prototyping and implementation of complex analysis algorithms. Pteros is available for free under Artistic License from http://sourceforge.net/projects/pteros/. © 2015 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Hasanov, Alemdar; Kawano, Alexandre
2016-05-01
Two types of inverse source problems of identifying asynchronously distributed spatial loads governed by the Euler-Bernoulli beam equation $\rho(x) w_{tt} + \mu(x) w_{t} + (EI(x) w_{xx})_{xx} - T_r u_{xx} = \sum_{m=1}^{M} g_m(t) f_m(x)$, $(x,t) \in \Omega_T := (0,l) \times (0,T)$, with hinged-clamped ends ($w(0,t) = w_{xx}(0,t) = 0$, $w(l,t) = w_x(l,t) = 0$, $t \in (0,T)$), are studied. Here $g_m(t)$ are linearly independent functions, describing an asynchronous temporal loading, and $f_m(x)$ are the spatial load distributions. In the first identification problem the values $\nu_k(t)$, $k = \overline{1,K}$, of the deflection $w(x,t)$, are assumed to be known, as measured output data, in a neighbourhood of the finite set of points $P := \{x_k \in (0,l),\ k = \overline{1,K}\} \subset (0,l)$, corresponding to the internal points of a continuous beam, for all $t \in\, ]0,T[$. In the second identification problem the values $\theta_k(t)$, $k = \overline{1,K}$, of the slope $w_x(x,t)$, are assumed to be known, as measured output data in a neighbourhood of the same set of points $P$ for all $t \in\, ]0,T[$. These inverse source problems will be defined subsequently as the problems ISP1 and ISP2. The general purpose of this study is to develop mathematical concepts and tools that are capable of providing effective numerical algorithms for the numerical solution of the considered class of inverse problems. Note that both measured output data $\nu_k(t)$ and $\theta_k(t)$ contain random noise. In the first part of the study we prove that each measured output data $\nu_k(t)$ and $\theta_k(t)$, $k = \overline{1,K}$, can uniquely determine the unknown functions $f_m \in H^{-1}(]0,l[)$, $m = \overline{1,M}$.
In the second part of the study we will introduce the input-output operators $\mathcal{K}_d : L^2(0,T) \mapsto L^2(0,T)$, $(\mathcal{K}_d f)(t) := w(x,t;f)$, $x \in P$, $f(x) := (f_1(x), \ldots, f_M(x))$, and $\mathcal{K}_s : L^2(0,T) \mapsto L^2(0,T)$, $(\mathcal{K}_s f)(t) := w_x(x,t;f)$, $x \in P$, corresponding to the problems ISP1 and ISP2, and then reformulate these problems as the operator equations $\mathcal{K}_d f = \nu$ and $\mathcal{K}_s f = \theta$, where $\nu(t) := (\nu_1(t), \ldots, \nu_K(t))$ and $\theta(t) := (\theta_1(t), \ldots, \theta_K(t))$. Since both measured output data contain random noise, we use the most prominent regularisation method, Tikhonov regularisation, introducing the regularised cost functionals $J_{1\alpha}(f) := \frac{1}{2} \| \mathcal{K}_d f - \nu \|^2_{L^2(0,T)} + \frac{1}{2} \alpha \| f \|^2_{L^2(0,T)}$ and $J_{2\alpha}(f) := \frac{1}{2} \| \mathcal{K}_s f - \theta \|^2_{L^2(0,T)} + \frac{1}{2} \alpha \| f \|^2_{L^2(0,T)}$. Using a priori estimates for the weak solution of the direct problem and the Tikhonov regularisation method combined with the adjoint problem approach, we prove that the Fréchet gradients $J_1'(f)$ and $J_2'(f)$ of both cost functionals can explicitly be derived via the corresponding weak solutions of adjoint problems and the known temporal loads $g_m(t)$. Moreover, we show that these gradients are Lipschitz continuous, which allows the use of gradient type iteration convergent algorithms. Two applications of the proposed theory are presented. It is shown that solvability results for inverse source problems related to the synchronous loading case, with a single interior measured data, are special cases of the obtained results for asynchronously distributed spatial load cases.
A surface phase transition of supported gold nanoparticles.
Plech, Anton; Cerna, Roland; Kotaidis, Vassilios; Hudert, Florian; Bartels, Albrecht; Dekorsy, Thomas
2007-04-01
A thermal phase transition has been resolved in gold nanoparticles supported on a surface. By use of asynchronous optical sampling with coupled femtosecond oscillators, the Lamb vibrational modes could be resolved as a function of annealing temperature. At a temperature of 104 degrees C the damping rate and phase change abruptly, indicating a structural transition in the particle, which is explained as the onset of surface melting.
Asynchronous beating of cilia enhances particle capture rate
NASA Astrophysics Data System (ADS)
Ding, Yang; Kanso, Eva
2014-11-01
Many aquatic micro-organisms use beating cilia to generate feeding currents and capture particles in surrounding fluids. One of the capture strategies is to "catch up" with particles when a cilium is beating towards the overall flow direction (effective stroke) and intercept particles on the downstream side of the cilium. Here, we developed a 3D computational model of a cilia band with prescribed motion in a viscous fluid and calculated the trajectories of the particles with different sizes in the fluid. We found an optimal particle diameter that maximizes the capture rate. The flow field and particle motion indicate that the low capture rate of smaller particles is due to the laminar flow in the neighborhood of the cilia, whereas larger particles have to move above the cilia tips to get advected downstream, which decreases their capture rate. We then analyzed the effect of beating coordination between neighboring cilia on the capture rate. Interestingly, we found that asynchrony of the beating of the cilia can enhance the relative motion between a cilium and the particles near it and hence increase the capture rate.
Dong, Suwei; Cahill, Katharine J.; Kang, Moon-Il; Colburn, Nancy H.; Henrich, Curtis J.; Wilson, Jennifer A.; Beutler, John A.; Johnson, Richard P.; Porco, John A.
2011-01-01
We have accomplished a parallel screen of cycloaddition partners for ortho-quinols utilizing a plate-based microwave system. Microwave irradiation improves the efficiency of retro-Diels-Alder/Diels-Alder cascades of ortho-quinol dimers which generally proceed in a diastereoselective fashion. Computational studies indicate that asynchronous transition states are favored in Diels-Alder cycloadditions of ortho-quinols. Subsequent biological evaluation of a collection of cycloadducts has identified an inhibitor of activator protein-1 (AP-1), an oncogenic transcription factor. PMID:21942286
What can neuromorphic event-driven precise timing add to spike-based pattern recognition?
Akolkar, Himanshu; Meyer, Cedric; Clady, Xavier; Marre, Olivier; Bartolozzi, Chiara; Panzeri, Stefano; Benosman, Ryad
2015-03-01
This letter introduces a study to precisely measure what an increase in spike timing precision can add to spike-driven pattern recognition algorithms. The concept of generating spikes from images by converting gray levels into spike timings is currently at the basis of almost every spike-based modeling of biological visual systems. The use of images naturally leads to generating incorrect artificial and redundant spike timings and, more important, also contradicts biological findings indicating that visual processing is massively parallel and asynchronous, with high temporal resolution. A new concept for acquiring visual information through pixel-individual asynchronous level-crossing sampling has been proposed in a recent generation of asynchronous neuromorphic visual sensors. Unlike conventional cameras, these sensors acquire data not at fixed points in time for the entire array but at fixed amplitude changes of their input, resulting in output that is optimally sparse in space and time: pixel-individual and precisely timed, produced only if new (previously unknown) information is available (event based). This letter uses the high temporal resolution spiking output of neuromorphic event-based visual sensors to show that lowering time precision degrades performance on several recognition tasks, specifically when reaching the conventional range of machine vision acquisition frequencies (30-60 Hz). The use of information theory to characterize separability between classes for each temporal resolution shows that high temporal acquisition provides up to 70% more information than conventional spikes generated from frame-based acquisition as used in standard artificial vision, thus drastically increasing the separability between classes of objects. Experiments on real data show that the amount of information loss is correlated with temporal precision. 
Our information-theoretic study highlights the potentials of neuromorphic asynchronous visual sensors for both practical applications and theoretical investigations. Moreover, it suggests that representing visual information as a precise sequence of spike times as reported in the retina offers considerable advantages for neuro-inspired visual computations.
Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent
De Sa, Christopher; Feldman, Matthew; Ré, Christopher; Olukotun, Kunle
2018-01-01
Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue for the foreseeable future, it is important to study techniques that can make it run fast on parallel hardware. In this paper, we provide the first analysis of a technique called Buckwild! that uses both asynchronous execution and low-precision computation. We introduce the DMGC model, the first conceptualization of the parameter space that exists when implementing low-precision SGD, and show that it provides a way to both classify these algorithms and model their performance. We leverage this insight to propose and analyze techniques to improve the speed of low-precision SGD. First, we propose software optimizations that can increase throughput on existing CPUs by up to 11×. Second, we propose architectural changes, including a new cache technique we call an obstinate cache, that increase throughput beyond the limits of current-generation hardware. We also implement and analyze low-precision SGD on the FPGA, which is a promising alternative to the CPU for future SGD systems. PMID:29391770
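To make the low-precision half of this abstract concrete, here is a small single-threaded sketch of SGD in which the weight is stored only on a fixed-point grid. It is not the Buckwild! implementation: the unbiased stochastic rounding rule, the one-parameter model y = w * x, and all names and constants are assumptions for illustration, and the lock-free asynchronous updates across threads are not modeled.

```python
import math
import random


def stochastic_round(value, step, rng):
    """Round `value` to a multiple of `step`, up or down at random (unbiased)."""
    scaled = value / step
    low = math.floor(scaled)
    round_up = rng.random() < (scaled - low)
    return (low + (1 if round_up else 0)) * step


def low_precision_sgd(true_w=0.75, lr=0.5, step=1.0 / 256, iters=2000, seed=0):
    """Fit y = w * x by SGD while keeping w on a grid with spacing `step`."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(iters):
        x = rng.uniform(-1.0, 1.0)
        y = true_w * x                  # noiseless synthetic sample
        grad = (w * x - y) * x          # gradient of 0.5 * (w * x - y) ** 2
        w = stochastic_round(w - lr * grad, step, rng)
    return w


if __name__ == "__main__":
    print(low_precision_sgd())
```

Rounding up or down at random keeps the expected update equal to the full-precision one, so small gradient steps are not systematically lost when they fall below the grid spacing.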
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clark, Haley; BC Cancer Agency, Surrey, B.C.; BC Cancer Agency, Vancouver, B.C.
2014-08-15
Many have speculated about the future of computational technology in clinical radiation oncology. It has been advocated that the next generation of computational infrastructure will improve on the current generation by incorporating richer aspects of automation, more heavily and seamlessly featuring distributed and parallel computation, and providing more flexibility toward aggregate data analysis. In this report we describe how a recently created (but currently existing) analysis framework (DICOMautomaton) incorporates these aspects. DICOMautomaton supports a variety of use cases but is especially suited for dosimetric outcomes correlation analysis, investigation and comparison of radiotherapy treatment efficacy, and dose-volume computation. We describe: how it overcomes computational bottlenecks by distributing workload across a network of machines; how modern, asynchronous computational techniques are used to reduce blocking and avoid unnecessary computation; and how issues of out-of-date data are addressed using reactive programming techniques and data dependency chains. We describe the internal architecture of the software and give a detailed demonstration of how DICOMautomaton could be used to search for correlations between dosimetric and outcomes data.
Evolution of a minimal parallel programming model
Lusk, Ewing; Butler, Ralph; Pieper, Steven C.
2017-04-30
Here, we take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how an extremely simple task model has evolved into a system, asynchronous dynamic load balancing (ADLB), and a scalable implementation capable of supporting sophisticated applications on today’s (and tomorrow’s) largest supercomputers; and we illustrate the use of ADLB with a Green’s function Monte Carlo application, a modern, mature nuclear physics code in production use. Our lesson is that by surrendering a certain amount of generality and thus applicability, a minimal programming model (in terms of its basic concepts and the size of its application programmer interface) can achieve extreme scalability without introducing complexity.
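Self-scheduled task parallelism, as described above, means that idle workers pull the next available task from a shared pool and may add new tasks as they go. The sketch below shows that pattern with Python threads and a shared queue; it is not the ADLB interface, and `run_self_scheduled`, `split_or_sum`, and the range-splitting workload are hypothetical names chosen for illustration.

```python
import queue
import threading


def run_self_scheduled(initial_tasks, handle, num_workers=4):
    """Run tasks from a shared pool; `handle(task)` returns (result, new_tasks)."""
    work = queue.Queue()
    for task in initial_tasks:
        work.put(task)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            task = work.get()
            if task is None:              # sentinel: no more work will arrive
                work.task_done()
                return
            result, new_tasks = handle(task)
            for t in new_tasks:           # spawned tasks go back into the pool
                work.put(t)
            with lock:
                results.append(result)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for th in threads:
        th.start()
    work.join()                           # waits for spawned tasks as well
    for _ in threads:
        work.put(None)
    for th in threads:
        th.join()
    return results


def split_or_sum(task):
    """Example handler: split large ranges, sum small ones."""
    lo, hi = task
    if hi - lo <= 4:
        return sum(range(lo, hi)), []
    mid = (lo + hi) // 2
    return 0, [(lo, mid), (mid, hi)]


if __name__ == "__main__":
    print(sum(run_self_scheduled([(0, 100)], split_or_sum)))
```

No worker is assigned work in advance, so load balancing falls out of the pull model: a worker that finishes early simply takes another task.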
An Analysis of Performance Enhancement Techniques for Overset Grid Applications
NASA Technical Reports Server (NTRS)
Djomehri, J. J.; Biswas, R.; Potsdam, M.; Strawn, R. C.; Biegel, Bryan (Technical Monitor)
2002-01-01
The overset grid methodology has significantly reduced time-to-solution of high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement techniques on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machine. Specifically, the roles of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crozier, Paul; Howard, Micah; Rider, William J.
The SPARC (Sandia Parallel Aerodynamics and Reentry Code) will provide nuclear weapon qualification evidence for the random vibration and thermal environments created by re-entry of a warhead into the earth’s atmosphere. SPARC incorporates the innovative approaches of ATDM projects on several fronts including: effective harnessing of heterogeneous compute nodes using Kokkos, exascale-ready parallel scalability through asynchronous multi-tasking, uncertainty quantification through Sacado integration, implementation of state-of-the-art reentry physics and multiscale models, use of advanced verification and validation methods, and enabling of improved workflows for users. SPARC is being developed primarily for the Department of Energy nuclear weapon program, with additional development and use of the code being supported by the Department of Defense for conventional weapons programs.
Human Exposure to Electromagnetic Fields from Parallel Wireless Power Transfer Systems.
Wen, Feng; Huang, Xueliang
2017-02-08
The scenario of multiple wireless power transfer (WPT) systems working closely, synchronously or asynchronously with phase difference often occurs in power supply for household appliances and electric vehicles in parking lots. Magnetic field leakage from the WPT systems is also varied due to unpredictable asynchronous working conditions. In this study, the magnetic field leakage from parallel WPT systems working with phase difference is predicted, and the induced electric field and specific absorption rate (SAR) in a human body standing in the vicinity are also evaluated. Computational results are compared with the restrictions prescribed in the regulations established to limit human exposure to time-varying electromagnetic fields (EMFs). The results show that the middle region between the two WPT coils is safer for the two WPT systems working in-phase, and the peripheral regions are safer around the WPT systems working anti-phase. Thin metallic plates larger than the WPT coils can shield the magnetic field leakage well, while smaller ones may worsen the situation. The orientation of the human body will influence the maximum magnitude of induced electric field and its distribution within the human body. The induced electric field centralizes in the trunk, groin, and genitals with only one exception: when the human body is standing right at the middle of the two WPT coils working in-phase, the induced electric field focuses on lower limbs. The SAR value in the lungs always seems to be greater than in other organs, while the value in the liver is minimal. Human exposure to EMFs meets the guidelines of the International Commission on Non-Ionizing Radiation Protection (ICNIRP), specifically reference levels with respect to magnetic field and basic restrictions on induced electric fields and SAR, as the charging power is lower than 3.1 kW and 55.5 kW, respectively. 
These results are positive with respect to the safe applications of parallel WPT systems working simultaneously.
Human Exposure to Electromagnetic Fields from Parallel Wireless Power Transfer Systems
Wen, Feng; Huang, Xueliang
2017-01-01
The scenario of multiple wireless power transfer (WPT) systems working closely, synchronously or asynchronously with phase difference, often occurs in power supply for household appliances and electric vehicles in parking lots. Magnetic field leakage from the WPT systems also varies due to unpredictable asynchronous working conditions. In this study, the magnetic field leakage from parallel WPT systems working with phase difference is predicted, and the induced electric field and specific absorption rate (SAR) in a human body standing in the vicinity are also evaluated. Computational results are compared with the restrictions prescribed in the regulations established to limit human exposure to time-varying electromagnetic fields (EMFs). The results show that the middle region between the two WPT coils is safer for the two WPT systems working in-phase, and the peripheral regions are safer around the WPT systems working anti-phase. Thin metallic plates larger than the WPT coils can shield the magnetic field leakage well, while smaller ones may worsen the situation. The orientation of the human body influences the maximum magnitude of the induced electric field and its distribution within the human body. The induced electric field concentrates in the trunk, groin, and genitals with only one exception: when the human body is standing right in the middle of the two WPT coils working in-phase, the induced electric field concentrates in the lower limbs. The SAR value in the lungs always seems to be greater than in other organs, while the value in the liver is minimal. Human exposure to EMFs meets the guidelines of the International Commission on Non-Ionizing Radiation Protection (ICNIRP), specifically reference levels with respect to magnetic field and basic restrictions on induced electric fields and SAR, as long as the charging power is lower than 3.1 kW and 55.5 kW, respectively. These results are positive with respect to the safe applications of parallel WPT systems working simultaneously. PMID:28208709
Parallel design of JPEG-LS encoder on graphics processing units
NASA Astrophysics Data System (ADS)
Duan, Hao; Fang, Yong; Huang, Bormin
2012-01-01
With recent technical advances in graphic processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth. Many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on a NVIDIA GPU using the computer unified device architecture (CUDA) programming technology. We use the block parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with 26.3x speedup over its original CPU code.
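The parallel prefix sum named among the CUDA techniques above can be illustrated on the CPU. The following is a minimal sketch, not the authors' kernel: it assumes each image block has already been compressed independently and that the (hypothetical) per-block output sizes must be turned into write offsets. The scan is written in the Hillis-Steele style, where every pass updates all elements independently.

```python
def inclusive_scan(values):
    """Hillis-Steele inclusive prefix sum.

    Runs about log2(n) passes; within a pass each element update is
    independent of the others, which is what makes the scheme GPU-friendly.
    """
    out = list(values)
    shift = 1
    while shift < len(out):
        out = [out[i] + (out[i - shift] if i >= shift else 0)
               for i in range(len(out))]
        shift *= 2
    return out


def block_offsets(block_sizes):
    """Exclusive scan: the offset at which each block's output starts."""
    scanned = inclusive_scan(block_sizes)
    return [total - size for total, size in zip(scanned, block_sizes)]


# Hypothetical compressed sizes (in bytes) of five independently coded blocks.
block_sizes = [3, 1, 4, 1, 5]
offsets = block_offsets(block_sizes)
```

With the offsets known, each block can write its compressed output into a shared buffer without waiting for its neighbours.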
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadowski, Greg
In one form, a logic circuit includes an asynchronous logic circuit, a synchronous logic circuit, and an interface circuit coupled between the asynchronous logic circuit and the synchronous logic circuit. The asynchronous logic circuit has a plurality of asynchronous outputs for providing a corresponding plurality of asynchronous signals. The synchronous logic circuit has a plurality of synchronous inputs corresponding to the plurality of asynchronous outputs, a stretch input for receiving a stretch signal, and a clock output for providing a clock signal. The synchronous logic circuit provides the clock signal as a periodic signal but prolongs a predetermined state of the clock signal while the stretch signal is active. The asynchronous interface detects whether metastability could occur when latching any of the plurality of the asynchronous outputs of the asynchronous logic circuit using said clock signal, and activates the stretch signal while the metastability could occur.
On the suitability of the connection machine for direct particle simulation
NASA Technical Reports Server (NTRS)
Dagum, Leonard
1990-01-01
The algorithmic structure of the vectorizable Stanford particle simulation (SPS) method was examined, and the structure is reformulated in data parallel form. Some of the SPS algorithms can be directly translated to data parallel form, but several of the vectorizable algorithms have no direct data parallel equivalent. This requires the development of new, strictly data parallel algorithms. In particular, a new sorting algorithm is developed to identify collision candidates in the simulation, and a master/slave algorithm is developed to minimize communication cost in large table look-up. Validation of the method is undertaken through test calculations for thermal relaxation of a gas, shock wave profiles, and shock reflection from a stationary wall. A qualitative measure is provided of the performance of the Connection Machine for direct particle simulation. The massively parallel architecture of the Connection Machine is found to be quite suitable for this type of calculation. However, there are difficulties in taking full advantage of this architecture because of the lack of a broad-based tradition of data parallel programming. An important outcome of this work has been new data parallel algorithms that are specifically of use for direct particle simulation but that also expand the data parallel diction.
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)
1993-01-01
This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.
CUBE: Information-optimized parallel cosmological N-body simulation code
NASA Astrophysics Data System (ADS)
Yu, Hao-Ran; Pen, Ue-Li; Wang, Xin
2018-05-01
CUBE, written in Coarray Fortran, is a particle-mesh based parallel cosmological N-body simulation code. The memory usage of CUBE can approach as low as 6 bytes per particle. Particle pairwise (PP) force, cosmological neutrinos, spherical overdensity (SO) halofinder are included.
Bimodal and multimodal plant biomass particle mixtures
Dooley, James H.
2013-07-09
An industrial feedstock of plant biomass particles having fibers aligned in a grain, wherein the particles are individually characterized by a length dimension (L) aligned substantially parallel to the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L, wherein the L.times.H dimensions define a pair of substantially parallel side surfaces characterized by substantially intact longitudinally arrayed fibers, the W.times.H dimensions define a pair of substantially parallel end surfaces characterized by crosscut fibers and end checking between fibers, and the L.times.W dimensions define a pair of substantially parallel top and bottom surfaces, and wherein the particles in the feedstock are collectively characterized by having a bimodal or multimodal size distribution.
NASA Technical Reports Server (NTRS)
Dagum, Leonardo
1989-01-01
The data parallel implementation of a particle simulation for hypersonic rarefied flow described by Dagum associates a single parallel data element with each particle in the simulation. The simulated space is divided into discrete regions called cells containing a variable and constantly changing number of particles. The implementation requires a global sort of the parallel data elements so as to arrange them in an order that allows immediate access to the information associated with cells in the simulation. Described here is a very fast algorithm for performing the necessary ranking of the parallel data elements. The performance of the new algorithm is compared with that of the microcoded instruction for ranking on the Connection Machine.
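The ranking step described above can be made concrete with a small serial example. This is only a sketch of what "ranking particles by cell" produces, written as a counting sort; it is not the fast data parallel algorithm of the report, and the function and variable names are illustrative assumptions.

```python
def rank_by_cell(cell_of_particle, num_cells):
    """Return (rank, start).

    rank[p]  : position of particle p once particles are ordered by cell
    start[c] : first position occupied by cell c in that ordering
    """
    # Count how many particles sit in each cell.
    counts = [0] * num_cells
    for cell in cell_of_particle:
        counts[cell] += 1

    # Exclusive prefix sum of the counts gives each cell's first slot.
    start = [0] * num_cells
    for cell in range(1, num_cells):
        start[cell] = start[cell - 1] + counts[cell - 1]

    # Hand out consecutive slots inside each cell.
    next_slot = list(start)
    rank = [0] * len(cell_of_particle)
    for particle, cell in enumerate(cell_of_particle):
        rank[particle] = next_slot[cell]
        next_slot[cell] += 1
    return rank, start


# Five particles scattered over three cells (made-up data).
rank, start = rank_by_cell([2, 0, 1, 2, 0], num_cells=3)
```

After the ranking, the particles of cell `c` occupy a contiguous range beginning at `start[c]`, which is the "immediate access to the information associated with cells" that the abstract refers to.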
Building Hybrid Rover Models for NASA: Lessons Learned
NASA Technical Reports Server (NTRS)
Willeke, Thomas; Dearden, Richard
2004-01-01
Particle filters have recently become popular for diagnosis and monitoring of hybrid systems. In this paper we describe our experiences using particle filters on a real diagnosis problem, the NASA Ames Research Center's K-9 rover. As well as the challenge of modelling the dynamics of the system, there are two major issues in applying a particle filter to such a model. The first is the asynchronous nature of the system-observations from different subsystems arrive at different rates, and occasionally out of order, leading to large amounts of uncertainty in the state of the system. The second issue is data interpretation. The particle filter produces a probability distribution over the state of the system, from which summary statistics that can be used for control or higher-level diagnosis must be extracted. We describe our approaches to both these problems, as well as other modelling issues that arose in this domain.
On the dimensionally correct kinetic theory of turbulence for parallel propagation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gaelzer, R.; Ziebell, L. F.; Yoon, P. H. (E-mail: rudi.gaelzer@ufrgs.br, yoonp@umd.edu, 007gasun@khu.ac.kr, luiz.ziebell@ufrgs.br)
2015-03-15
Yoon and Fang [Phys. Plasmas 15, 122312 (2008)] formulated a second-order nonlinear kinetic theory that describes the turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field. Their theory also includes discrete-particle effects, or the effects due to spontaneously emitted thermal fluctuations. However, terms associated with the spontaneous fluctuations in particle and wave kinetic equations in their theory contain proper dimensionality only for an artificial one-dimensional situation. The present paper extends the analysis and re-derives the dimensionally correct kinetic equations for the three-dimensional case. The new formalism properly describes the effects of spontaneous fluctuations emitted in three-dimensional space, while the collectively emitted turbulence propagates predominantly in directions parallel/anti-parallel to the ambient magnetic field. As a first step, the present investigation focuses on linear wave-particle interaction terms only. A subsequent paper will include the dimensionally correct nonlinear wave-particle interaction terms.
NASA Technical Reports Server (NTRS)
Lyster, P. M.; Liewer, P. C.; Decyk, V. K.; Ferraro, R. D.
1995-01-01
A three-dimensional electrostatic particle-in-cell (PIC) plasma simulation code has been developed on coarse-grain distributed-memory massively parallel computers with message passing communications. Our implementation is the generalization to three dimensions of the general concurrent particle-in-cell (GCPIC) algorithm. In the GCPIC algorithm, the particle computation is divided among the processors using a domain decomposition of the simulation domain. In a three-dimensional simulation, the domain can be partitioned into one-, two-, or three-dimensional subdomains ("slabs," "rods," or "cubes") and we investigate the efficiency of the parallel implementation of the push for all three choices. The present implementation runs on the Intel Touchstone Delta machine at Caltech, a multiple-instruction-multiple-data (MIMD) parallel computer with 512 nodes. We find that the parallel efficiency of the push is very high, with the ratio of communication to computation time in the range 0.3%-10.0%. The highest efficiency (> 99%) occurs for a large, scaled problem with 64^3 particles per processing node (approximately 134 million particles on 512 nodes), which has a push time of about 250 ns per particle per time step. We have also developed expressions for the timing of the code which are a function of both code parameters (number of grid points, particles, etc.) and machine-dependent parameters (effective FLOP rate, and the effective interprocessor bandwidths for the communication of particles and grid points). These expressions can be used to estimate the performance of scaled problems, including those with inhomogeneous plasmas, on other parallel machines once the machine-dependent parameters are known.
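The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The particle count follows directly from the stated numbers. The efficiency formula is an assumption made here for illustration (communication is not overlapped with computation, so efficiency is computation time divided by total time); the abstract does not state how the authors define it.

```python
# Particle count quoted in the abstract: 64^3 particles on each of 512 nodes.
particles_per_node = 64 ** 3
nodes = 512
total_particles = particles_per_node * nodes


def efficiency(comm_to_comp_ratio):
    """Assumed model: efficiency = T_comp / (T_comp + T_comm)."""
    return 1.0 / (1.0 + comm_to_comp_ratio)


best_case = efficiency(0.003)   # ratio of 0.3%
worst_case = efficiency(0.10)   # ratio of 10.0%
```

Under this assumed model the quoted range of ratios corresponds to efficiencies between roughly 91% and 99.7%, which is consistent with the "> 99%" figure reported for the largest scaled problem.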
The monophasic action potential upstroke: a means of characterizing local conduction.
Levine, J H; Moore, E N; Kadish, A H; Guarnieri, T; Spear, J F
1986-11-01
The upstrokes of monophasic action potentials (MAPs) recorded with an extracellular pressure electrode were characterized in isolated canine tissue preparations in vitro. The characteristics of the MAP upstroke were compared with those of the local action potential foot as well as with the characteristics of approaching electrical activation during uniform and asynchronous conduction. The upstroke of the MAP was exponential during uniform conduction. The time constant of rise of the MAP upstroke (TMAP) correlated with that of the action potential foot (Tfoot): TMAP = 1.01 Tfoot + 0.50; r^2 = .80. Furthermore, changes in Tfoot with alterations in cycle length were associated with similar changes in TMAP: Tfoot = 1.06 TMAP - 0.11; r^2 = .78. In addition, TMAP and Tfoot both deviated from exponential during asynchronous activation; the inflections that developed in the MAP upstroke correlated in time with intracellular action potential upstrokes that were asynchronous in onset in these tissues. Finally, the field of view of the MAP was determined and was found to be dependent in part on tissue architecture and the space constant. Specifically, the field of view of the MAP was found to be greater parallel to fiber orientation than transverse to it (6.02 ± 1.74 vs 3.03 ± 1.10 mm; p < .01). These data suggest that the MAP upstroke may be used to define and characterize local electrical activation. The relatively large field of view of the MAP suggests that this technique may be a sensitive means to record focal membrane phenomena in vivo.
NASA Astrophysics Data System (ADS)
Iwasawa, Masaki; Tanikawa, Ataru; Hosono, Natsuki; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro
2016-08-01
We present the basic idea, implementation, measured performance, and performance model of FDPS (Framework for Developing Particle Simulators). FDPS is an application-development framework which helps researchers to develop simulation programs using particle methods for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, exchange of particles which are not in the domain of each computing node, and gathering of the particle information in other nodes which is necessary for interaction calculation. Also, even if distributed-memory parallel computers are not used, in order to reduce the amount of computation, algorithms such as the Barnes-Hut tree algorithm or the Fast Multipole Method should be used in the case of long-range interactions. For short-range interactions, some methods to limit the calculation to neighbor particles are required. FDPS provides all of these functions which are necessary for efficient parallel execution of particle-based simulations as "templates," which are independent of the actual data structure of particles and the functional form of the particle-particle interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N^2) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak scaling performance is very good, and almost linear speed-up was obtained for up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N = 10^7) to 300 ms (N = 10^9). These are currently limited by the time for the calculation of the domain decomposition and communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
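For readers unfamiliar with what a "simple, sequential and unoptimized program of O(N^2) calculation cost" looks like, the sketch below computes gravitational accelerations by direct summation over all pairs. It is plain Python written for illustration and does not use or imitate the FDPS interface; the softening length `eps` and the unit choice `G = 1` are assumptions.

```python
import math


def accelerations(pos, mass, G=1.0, eps=0.0):
    """Direct-summation gravity: every particle interacts with every other.

    pos  : list of [x, y, z] positions
    mass : list of masses
    eps  : softening length (assumed parameter) that avoids division by
           zero when two particles come very close
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):            # N iterations ...
        for j in range(n):        # ... times N iterations: O(N^2) work
            if i == j:
                continue
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_r3
    return acc


# Two unit masses one length unit apart attract each other equally.
acc = accelerations([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 1.0])
```

The double loop is the part a tree algorithm or the Fast Multipole Method replaces for long-range forces, and the part a framework must distribute across nodes.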
NASA Astrophysics Data System (ADS)
Trakumas, S.; Salter, E.
2009-02-01
Adverse health effects due to exposure to airborne particles are associated with particle deposition within the human respiratory tract. Particle size, shape, chemical composition, and the individual physiological characteristics of each person determine to what depth inhaled particles may penetrate and deposit within the respiratory tract. Various particle inertial classification devices are available to fractionate airborne particles according to their aerodynamic size to approximate particle penetration through the human respiratory tract. Cyclones are most often used to sample thoracic or respirable fractions of inhaled particles. Extensive studies of different cyclonic samplers have shown, however, that the sampling characteristics of cyclones do not follow the entire selected convention accurately. In the search for a more accurate way to assess worker exposure to different fractions of inhaled dust, a novel sampler comprising several inertial impactors arranged in parallel was designed and tested. The new design includes a number of separated impactors arranged in parallel. Prototypes of respirable and thoracic samplers each comprising four impactors arranged in parallel were manufactured and tested. Results indicated that the prototype samplers followed closely the penetration characteristics for which they were designed. The new samplers were found to perform similarly for liquid and solid test particles; penetration characteristics remained unchanged even after prolonged exposure to coal mine dust at high concentration. The new parallel impactor design can be applied to approximate any monotonically decreasing penetration curve at a selected flow rate. Personal-size samplers that operate at a few L/min as well as area samplers that operate at higher flow rates can be made based on the suggested design. Performance of such samplers can be predicted with high accuracy employing well-established impaction theory.
UPC++ Programmer’s Guide, v1.0-2018.3.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bachan, J.; Baden, S.; Bonachea, Dan
UPC++ is a C++11 library that provides Partitioned Global Address Space (PGAS) programming. It is designed for writing parallel programs that run efficiently and scale well on distributed-memory parallel computers. The PGAS model is single program, multiple-data (SPMD), with each separate thread of execution (referred to as a rank, a term borrowed from MPI) having access to local memory as it would in C++. However, PGAS also provides access to a global address space, which is allocated in shared segments that are distributed over the ranks. UPC++ provides numerous methods for accessing and using global memory. In UPC++, all operations that access remote memory are explicit, which encourages programmers to be aware of the cost of communication and data movement. Moreover, all remote-memory access operations are by default asynchronous, to enable programmers to write code that scales well even on hundreds of thousands of cores.
More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server
Ho, Qirong; Cipar, James; Cui, Henggang; Kim, Jin Kyu; Lee, Seunghak; Gibbons, Phillip B.; Gibson, Garth A.; Ganger, Gregory R.; Xing, Eric P.
2014-01-01
We propose a parameter server system for distributed ML, which follows a Stale Synchronous Parallel (SSP) model of computation that maximizes the time computational workers spend doing useful work on ML algorithms, while still providing correctness guarantees. The parameter server provides an easy-to-use shared interface for read/write access to an ML model’s values (parameters and variables), and the SSP model allows distributed workers to read older, stale versions of these values from a local cache, instead of waiting to get them from a central storage. This significantly increases the proportion of time workers spend computing, as opposed to waiting. Furthermore, the SSP model ensures ML algorithm correctness by limiting the maximum age of the stale values. We provide a proof of correctness under SSP, as well as empirical results demonstrating that the SSP model achieves faster algorithm convergence on several different ML problems, compared to fully-synchronous and asynchronous schemes. PMID:25400488
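The bounded-staleness rule at the heart of SSP can be stated in a few lines. The sketch below is a simplified reading of that rule, not the authors' parameter server: it assumes each worker's clock counts its completed iterations and that a worker may start another iteration only while it is at most `staleness` iterations ahead of the slowest worker. All names are hypothetical.

```python
def may_start_next_iteration(clocks, worker, staleness):
    """Assumed SSP gate: a worker may run ahead of the slowest worker
    by at most `staleness` completed iterations."""
    return clocks[worker] - min(clocks) <= staleness


# Completed-iteration counts of three workers (made-up values).
clocks = [5, 3, 4]

fast_ok_s2 = may_start_next_iteration(clocks, worker=0, staleness=2)
fast_ok_s1 = may_start_next_iteration(clocks, worker=0, staleness=1)
slow_ok_s0 = may_start_next_iteration(clocks, worker=1, staleness=0)
```

Under this reading, `staleness = 0` reduces to a fully synchronous scheme in which only workers level with the slowest one may proceed, while a very large value approaches a fully asynchronous scheme; intermediate values trade waiting time against the age of the parameters a worker may read.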
Real time software tools and methodologies
NASA Technical Reports Server (NTRS)
Christofferson, M. J.
1981-01-01
Real time systems are characterized by high speed processing and throughput as well as asynchronous event processing requirements. These requirements give rise to particular implementations of parallel or pipeline multitasking structures, of intertask or interprocess communications mechanisms, and finally of message (buffer) routing or switching mechanisms. These mechanisms or structures, along with the data structure, describe the essential character of the system. These common structural elements and mechanisms are identified, and their implementation in the form of routines, tasks, or macros (in other words, tools) is formalized. The tools developed support or make available the following: reentrant task creation, generalized message routing techniques, generalized task structures/task families, standardized intertask communications mechanisms, and pipeline and parallel processing architectures in a multitasking environment. Tools development raises some interesting prospects in the areas of software instrumentation and software portability. These issues are discussed following the description of the tools themselves.
Performance Enhancement Strategies for Multi-Block Overset Grid CFD Applications
NASA Technical Reports Server (NTRS)
Djomehri, M. Jahed; Biswas, Rupak
2003-01-01
The overset grid methodology has significantly reduced time-to-solution of high-fidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement strategies on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machine. Specifically, the roles of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Details of a sophisticated graph partitioning technique for grid grouping are also provided. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.
Tempest: Accelerated MS/MS Database Search Software for Heterogeneous Computing Platforms.
Adamo, Mark E; Gerber, Scott A
2016-09-07
MS/MS database search algorithms derive a set of candidate peptide sequences from in silico digest of a protein sequence database, and compute theoretical fragmentation patterns to match these candidates against observed MS/MS spectra. The original Tempest publication described these operations mapped to a CPU-GPU model, in which the CPU (central processing unit) generates peptide candidates that are asynchronously sent to a discrete GPU (graphics processing unit) to be scored against experimental spectra in parallel. The current version of Tempest expands this model, incorporating OpenCL to offer seamless parallelization across multicore CPUs, GPUs, integrated graphics chips, and general-purpose coprocessors. Three protocols describe how to configure and run a Tempest search, including discussion of how to leverage Tempest's unique feature set to produce optimal results. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.
Energetic particle diffusion coefficients upstream of quasi-parallel interplanetary shocks
NASA Technical Reports Server (NTRS)
Tan, L. C.; Mason, G. M.; Gloeckler, G.; Ipavich, F. M.
1989-01-01
The properties of about 30 to 130-keV/e protons and alpha particles upstream of six quasi-parallel interplanetary shocks that passed by the ISEE 3 spacecraft during 1978-1979 were analyzed, and the values for the upstream energetic particle diffusion coefficient, kappa, in these six events were deduced for a number of energies and upstream positions. These observations were compared with predictions of Lee's (1983) theory of shock acceleration. It was found that the observations verified the prediction of the A/Q dependence (where A and Q are the particle atomic mass and ionization state, respectively) of kappa for alpha and proton particles upstream of the quasi-parallel shocks.
NASA Astrophysics Data System (ADS)
Wu, Kaihua; Shao, Zhencheng; Chen, Nian; Wang, Wenjie
2018-01-01
The degree of wear of the wheel set tread is one of the main factors that influence the safety and stability of a running train. The geometrical parameters mainly include flange thickness and flange height. Line-structured laser light was projected on the wheel tread surface, and the geometrical parameters can be deduced from the profile image. An online image acquisition system was designed based on asynchronous reset of the CCD and a CUDA parallel processing unit. Image acquisition was carried out in hardware interrupt mode. A high-efficiency parallel segmentation algorithm based on CUDA was proposed. The algorithm first divides the image into smaller squares and then extracts the squares of the target by fusing the k-means and STING clustering image segmentation algorithms. Segmentation time is less than 0.97 ms. A considerable acceleration ratio compared with the CPU serial calculation was obtained, which greatly improved the real-time image processing capacity. When the wheel set is running at a limited speed, the system placed along the railway line can measure the geometrical parameters automatically. The maximum measuring speed is 120 km/h.
NASA Astrophysics Data System (ADS)
Liu, Jiping; Kang, Xiaochen; Dong, Chun; Xu, Shenghua
2017-12-01
Surface area estimation is a widely used tool for resource evaluation in the physical world. When processing large scale spatial data, the input/output (I/O) can easily become the bottleneck in parallelizing the algorithm due to the limited physical memory resources and the very slow disk transfer rate. In this paper, we propose a stream tilling approach to surface area estimation that first decomposes a spatial data set into tiles with topological expansions. With these tiles, the one-to-one mapping relationship between the input and the computing process is broken. Then, we realize a streaming framework for scheduling the I/O processes and computing units. Each computing unit encapsulates an identical copy of the estimation algorithm, and multiple asynchronous computing units can work independently in parallel. Finally, the experiment demonstrates that our stream tilling estimation can efficiently alleviate the heavy pressure from the I/O-bound work, and the measured speedups after optimization greatly outperform those of the directly parallel versions in shared memory systems with multi-core processors.
Maintaining High Assurance in Asynchronous Messaging
2015-10-24
Assurance in Asynchronous Messaging Kevin E. Foltz and William R. Simpson Abstract—Asynchronous messaging is the delivery of a message without... integrity , and confidentiality guarantees. End-to-end security for asynchronous messaging must be provided by the asynchronous messaging layer itself... continuing its processing. At the completion of message transmission, the sender does not know when or whether the receiver received it. The message
NASA Astrophysics Data System (ADS)
Furuichi, M.; Nishiura, D.
2015-12-01
Fully Lagrangian methods such as Smoothed Particle Hydrodynamics (SPH) and Discrete Element Method (DEM) have been widely used to solve the continuum and particles motions in the computational geodynamics field. These mesh-free methods are suitable for the problems with the complex geometry and boundary. In addition, their Lagrangian nature allows non-diffusive advection useful for tracking history dependent properties (e.g. rheology) of the material. These potential advantages over the mesh-based methods offer effective numerical applications to the geophysical flow and tectonic processes, which are for example, tsunami with free surface and floating body, magma intrusion with fracture of rock, and shear zone pattern generation of granular deformation. In order to investigate such geodynamical problems with the particle based methods, over millions to billion particles are required for the realistic simulation. Parallel computing is therefore important for handling such huge computational cost. An efficient parallel implementation of SPH and DEM methods is however known to be difficult especially for the distributed-memory architecture. Lagrangian methods inherently show workload imbalance problem for parallelization with the fixed domain in space, because particles move around and workloads change during the simulation. Therefore dynamic load balance is key technique to perform the large scale SPH and DEM simulation. In this work, we present the parallel implementation technique of SPH and DEM method utilizing dynamic load balancing algorithms toward the high resolution simulation over large domain using the massively parallel super computer system. Our method utilizes the imbalances of the executed time of each MPI process as the nonlinear term of parallel domain decomposition and minimizes them with the Newton like iteration method. In order to perform flexible domain decomposition in space, the slice-grid algorithm is used. 
Numerical tests show that our approach is suitable for solving the particles with different calculation costs (e.g. boundary particles) as well as the heterogeneous computer architecture. We analyze the parallel efficiency and scalability on the super computer systems (K-computer, Earth simulator 3, etc.).
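The load-balancing idea in the abstract above can be illustrated with a minimal one-dimensional sketch. This is not the authors' method: it replaces their Newton-like iteration on measured MPI execution times with a direct cut on cumulative per-particle cost, and the names `balanced_cuts` and `slice_loads`, the particle layout, and the cost values are all hypothetical.

```python
from bisect import bisect_left

def balanced_cuts(xs, costs, n_slices):
    """Choose slice boundaries along x so each slice carries about the same cost.

    Assumption: a direct cumulative-cost cut, standing in for the
    Newton-like iteration described in the abstract.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    target = sum(costs) / n_slices
    cuts, running = [], 0.0
    for i in order:
        running += costs[i]
        # Place a cut each time the running cost crosses the next multiple of target.
        while len(cuts) < n_slices - 1 and running >= target * (len(cuts) + 1):
            cuts.append(xs[i])
    return cuts

def slice_loads(xs, costs, cuts):
    """Sum the cost per slice; a particle with x <= cuts[k] belongs to slice k or lower."""
    loads = [0.0] * (len(cuts) + 1)
    for x, c in zip(xs, costs):
        loads[bisect_left(cuts, x)] += c
    return loads

# Hypothetical data: particles clustered near x = 0, and the first 100
# (think of boundary particles) four times as expensive as the rest.
xs = [(i / 1000) ** 2 for i in range(1000)]
costs = [4.0 if i < 100 else 1.0 for i in range(1000)]
cuts = balanced_cuts(xs, costs, 4)
loads = slice_loads(xs, costs, cuts)
```

With equal-width slices the first slice would hold most of the work; the cost-weighted cuts keep every slice within a few percent of the mean load. In a real run the per-particle cost would not be known in advance, which is why the abstract works from measured execution times instead.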
INSTABILITIES DRIVEN BY THE DRIFT AND TEMPERATURE ANISOTROPY OF ALPHA PARTICLES IN THE SOLAR WIND
DOE Office of Scientific and Technical Information (OSTI.GOV)
Verscharen, Daniel; Bourouaine, Sofiane; Chandran, Benjamin D. G., E-mail: daniel.verscharen@unh.edu, E-mail: s.bourouaine@unh.edu, E-mail: benjamin.chandran@unh.edu
2013-08-20
We investigate the conditions under which parallel-propagating Alfven/ion-cyclotron (A/IC) waves and fast-magnetosonic/whistler (FM/W) waves are driven unstable by the differential flow and temperature anisotropy of alpha particles in the solar wind. We focus on the limit in which $w_{\parallel\alpha} \gtrsim 0.25\,v_{\mathrm{A}}$, where $w_{\parallel\alpha}$ is the parallel alpha-particle thermal speed and $v_{\mathrm{A}}$ is the Alfven speed. We derive analytic expressions for the instability thresholds of these waves, which show, e.g., how the minimum unstable alpha-particle beam speed depends upon $w_{\parallel\alpha}/v_{\mathrm{A}}$, the degree of alpha-particle temperature anisotropy, and the alpha-to-proton temperature ratio. We validate our analytical results using numerical solutions to the full hot-plasma dispersion relation. Consistent with previous work, we find that temperature anisotropy allows A/IC waves and FM/W waves to become unstable at significantly lower values of the alpha-particle beam speed $U_{\alpha}$ than in the isotropic-temperature case. Likewise, differential flow lowers the minimum temperature anisotropy needed to excite A/IC or FM/W waves relative to the case in which $U_{\alpha} = 0$. We discuss the relevance of our results to alpha particles in the solar wind near 1 AU.
NASA Astrophysics Data System (ADS)
Gassmöller, Rene; Bangerth, Wolfgang
2016-04-01
Particle-in-cell methods have a long history and many applications in geodynamic modelling of mantle convection, lithospheric deformation and crustal dynamics. They are primarily used to track material information, the strain a material has undergone, the pressure-temperature history a certain material region has experienced, or the amount of volatiles or partial melt present in a region. However, their efficient parallel implementation - in particular combined with adaptive finite-element meshes - is complicated due to the complex communication patterns and frequent reassignment of particles to cells. Consequently, many current scientific software packages accomplish this efficient implementation by specifically designing particle methods for a single purpose, like the advection of scalar material properties that do not evolve over time (e.g., for chemical heterogeneities). Design choices for particle integration, data storage, and parallel communication are then optimized for this single purpose, making the code relatively rigid to changing requirements. Here, we present the implementation of a flexible, scalable and efficient particle-in-cell method for massively parallel finite-element codes with adaptively changing meshes. Using a modular plugin structure, we allow maximum flexibility of the generation of particles, the carried tracer properties, the advection and output algorithms, and the projection of properties to the finite-element mesh. We present scaling tests ranging up to tens of thousands of cores and tens of billions of particles. Additionally, we discuss efficient load-balancing strategies for particles in adaptive meshes with their strengths and weaknesses, local particle-transfer between parallel subdomains utilizing existing communication patterns from the finite element mesh, and the use of established parallel output algorithms like the HDF5 library. 
Finally, we show some relevant particle application cases, compare our implementation to a modern advection-field approach, and demonstrate under which conditions which method is more efficient. We implemented the presented methods in ASPECT (aspect.dealii.org), a freely available open-source community code for geodynamic simulations. The structure of the particle code is highly modular, and segregated from the PDE solver, and can thus be easily transferred to other programs, or adapted for various application cases.
Pietersen, Alexander N.J.; Cheong, Soon Keen; Munn, Brandon; Gong, Pulin; Solomon, Samuel G.
2017-01-01
Key points: How parallel are the primate visual pathways? In the present study, we demonstrate that parallel visual pathways in the dorsal lateral geniculate nucleus (LGN) show distinct patterns of interaction with rhythmic activity in the primary visual cortex (V1). In the V1 of anaesthetized marmosets, the EEG frequency spectrum undergoes transient changes that are characterized by fluctuations in delta‐band EEG power. We show that, on multisecond timescales, spiking activity in an evolutionary primitive (koniocellular) LGN pathway is specifically linked to these slow EEG spectrum changes. By contrast, on subsecond (delta frequency) timescales, cortical oscillations can entrain spiking activity throughout the entire LGN. Our results are consistent with the hypothesis that, in waking animals, the koniocellular pathway selectively participates in brain circuits controlling vigilance and attention. Abstract: The major afferent cortical pathway in the visual system passes through the dorsal lateral geniculate nucleus (LGN), where nerve signals originating in the eye can first interact with brain circuits regulating visual processing, vigilance and attention. In the present study, we investigated how ongoing and visually driven activity in magnocellular (M), parvocellular (P) and koniocellular (K) layers of the LGN are related to cortical state. We recorded extracellular spiking activity in the LGN simultaneously with local field potentials (LFP) in primary visual cortex, in sufentanil‐anaesthetized marmoset monkeys. We found that asynchronous cortical states (marked by low power in delta‐band LFPs) are linked to high spike rates in K cells (but not P cells or M cells), on multisecond timescales. Cortical asynchrony precedes the increases in K cell spike rates by 1–3 s, implying causality. 
At subsecond timescales, the spiking activity in many cells of all (M, P and K) classes is phase‐locked to delta waves in the cortical LFP, and more cells are phase‐locked during synchronous cortical states than during asynchronous cortical states. The switch from low‐to‐high spike rates in K cells does not degrade their visual signalling capacity. By contrast, during asynchronous cortical states, the fidelity of visual signals transmitted by K cells is improved, probably because K cell responses become less rectified. Overall, the data show that slow fluctuations in cortical state are selectively linked to K pathway spiking activity, whereas delta‐frequency cortical oscillations entrain spiking activity throughout the entire LGN, in anaesthetized marmosets. PMID:28116750
de Vreede, Gert-Jan; Briggs, Robert O; Reiter-Palmon, Roni
2010-04-01
The aim of this study was to compare the results of two different modes of using multiple groups (instead of one large group) to identify problems and develop solutions. Many of the complex problems facing organizations today require the use of very large groups or collaborations of groups from multiple organizations. There are many logistical problems associated with the use of such large groups, including the ability to bring everyone together at the same time and location. A field study involved two different organizations and compared the productivity and satisfaction of the groups. The approaches included (a) multiple small groups, each completing the entire process from start to end and combining the results at the end (parallel mode); and (b) multiple subgroups, each building on the work provided by previous subgroups (serial mode). Groups using the serial mode produced more elaborations compared with parallel groups, whereas parallel groups produced more unique ideas compared with serial groups. No significant differences were found related to satisfaction with process and outcomes between the two modes. The preferred mode depends on the type of task facing the group. Parallel groups are more suited for tasks for which a variety of new ideas are needed, whereas serial groups are best suited when elaboration and in-depth thinking on the solution are required. Results of this research can guide the development of facilitated sessions of large groups or "teams of teams."
NASA Astrophysics Data System (ADS)
Nakhostin, M.; Baba, M.
2014-06-01
Parallel-plate avalanche counters have long been recognized as timing detectors for heavily ionizing particles. However, these detectors suffer from a poor pulse-height resolution which limits their capability to discriminate between different ionizing particles. In this paper, a new approach for discriminating between charged particles of different specific energy-loss with avalanche counters is demonstrated. We show that the effect of the self-induced space-charge in parallel-plate avalanche counters leads to a strong correlation between the shape of output current pulses and the amount of primary ionization created by the incident charged particles. The correlation is then exploited for the discrimination of charged particles with different energy-losses in the detector. The experimental results obtained with α-particles from an 241Am α-source demonstrate a discrimination capability far beyond that achievable with the standard pulse-height discrimination method.
Fast Particle Methods for Multiscale Phenomena Simulations
NASA Technical Reports Server (NTRS)
Koumoutsakos, P.; Wray, A.; Shariff, K.; Pohorille, Andrew
2000-01-01
We are developing particle methods oriented at improving computational modeling capabilities of multiscale physical phenomena in: (i) high Reynolds number unsteady vortical flows, (ii) particle-laden and interfacial flows, (iii) molecular dynamics studies of nanoscale droplets and studies of the structure, functions, and evolution of the earliest living cell. The unifying computational approach involves particle methods implemented in parallel computer architectures. The inherent adaptivity, robustness and efficiency of particle methods make them a multidisciplinary computational tool capable of bridging the gap between micro-scale and continuum flow simulations. Using efficient tree data structures, multipole expansion algorithms, and improved particle-grid interpolation, particle methods allow for simulations using millions of computational elements, making possible the resolution of a wide range of length and time scales of these important physical phenomena. The current challenges in these simulations are in: [i] the proper formulation of particle methods at the molecular and continuous level for the discretization of the governing equations; [ii] the resolution of the wide range of time and length scales governing the phenomena under investigation; [iii] the minimization of numerical artifacts that may interfere with the physics of the systems under consideration; [iv] the parallelization of processes such as tree traversal and grid-particle interpolations. We are conducting simulations using vortex methods, molecular dynamics and smooth particle hydrodynamics, exploiting their unifying concepts such as: the solution of the N-body problem in parallel computers, highly accurate particle-particle and grid-particle interpolations, parallel FFTs and the formulation of processes such as diffusion in the context of particle methods. This approach enables us to transcend among seemingly unrelated areas of research.
Heinrich events simulated across the glacial
NASA Astrophysics Data System (ADS)
Ziemen, F. A.; Mikolajewicz, U.
2015-12-01
Heinrich events are among the most prominent climate change events recorded in proxies across the northern hemisphere. They are the archetype of ice sheet — climate interactions on millennial time scales. Nevertheless, the exact mechanisms that cause Heinrich events are still under discussion, and their climatic consequences are far from being fully understood. We contribute to answering the open questions by studying Heinrich events in a coupled ice sheet model (ISM) atmosphere-ocean-vegetation general circulation model (AOVGCM) framework, where this variability occurs as part of the model generated internal variability. The setup consists of a northern hemisphere setup of the modified Parallel Ice Sheet Model (mPISM) coupled to the global AOVGCM ECHAM5/MPIOM/LPJ. The simulations were performed fully coupled and with transient orbital and greenhouse gas forcing. They span from several millennia before the last glacial maximum into the deglaciation. We analyze simulations where the ISM is coupled asynchronously to the AOVGCM and simulations where the ISM and the ocean model are coupled synchronously and the atmosphere model is coupled asynchronously to them. The modeled Heinrich events show a marked influence of the ice discharge on the Atlantic circulation and heat transport.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Procassini, R.J.
1997-12-31
The fine-scale, multi-space resolution that is envisioned for accurate simulations of complex weapons systems in three spatial dimensions implies flop-rate and memory-storage requirements that will only be obtained in the near future through the use of parallel computational techniques. Since the Monte Carlo transport models in these simulations usually stress both of these computational resources, they are prime candidates for parallelization. The MONACO Monte Carlo transport package, which is currently under development at LLNL, will utilize two types of parallelism within the context of a multi-physics design code: decomposition of the spatial domain across processors (spatial parallelism) and distribution of particles in a given spatial subdomain across additional processors (particle parallelism). This implementation of the package will utilize explicit data communication between domains (message passing). Such a parallel implementation of a Monte Carlo transport model will result in non-deterministic communication patterns. The communication of particles between subdomains during a Monte Carlo time step may require a significant level of effort to achieve a high parallel efficiency.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Satake, Shin-ichi; Kanamori, Hiroyuki; Kunugi, Tomoaki
2007-02-01
We have developed a parallel algorithm for microdigital-holographic particle-tracking velocimetry. The algorithm is used in (1) numerical reconstruction of a particle image computer using a digital hologram, and (2) searching for particles. The numerical reconstruction from the digital hologram makes use of the Fresnel diffraction equation and the FFT (fast Fourier transform), whereas the particle search algorithm looks for local maximum graduation in a reconstruction field represented by a 3D matrix. To achieve high performance computing for both calculations (reconstruction and particle search), two memory partitions are allocated to the 3D matrix. In this matrix, the reconstruction part consists of horizontally placed 2D memory partitions on the x-y plane for the FFT, whereas the particle search part consists of vertically placed 2D memory partitions set along the z axes. Consequently, the scalability can be obtained for the proportion of processor elements, where the benchmarks are carried out for parallel computation by a SGI Altix machine.
On the relationship between collisionless shock structure and energetic particle acceleration
NASA Technical Reports Server (NTRS)
Kennel, C. F.
1983-01-01
Recent experimental research on bow shock structure and theoretical studies of quasi-parallel shock structure and shock acceleration of energetic particles were reviewed, to point out the relationship between structure and particle acceleration. The phenomenological distinction between quasi-parallel and quasi-perpendicular shocks that has emerged from bow shock research; present efforts to extend this work to interplanetary shocks; theories of particle acceleration by shocks; and particle acceleration to shock structures using multiple fluid models were discussed.
ASC-ATDM Performance Portability Requirements for 2015-2019
DOE Office of Scientific and Technical Information (OSTI.GOV)
Edwards, Harold C.; Trott, Christian Robert
This report outlines the research, development, and support requirements for the Advanced Simulation and Computing (ASC) Advanced Technology, Development, and Mitigation (ATDM) Performance Portability (a.k.a., Kokkos) project for 2015-2019. The research and development (R&D) goal for Kokkos (v2) has been to create and demonstrate a thread-parallel programming model and standard C++ library-based implementation that enables performance portability across diverse manycore architectures such as multicore CPU, Intel Xeon Phi, and NVIDIA Kepler GPU. This R&D goal has been achieved for algorithms that use data parallel patterns including parallel-for, parallel-reduce, and parallel-scan. Current R&D is focusing on hierarchical parallel patterns such as a directed acyclic graph (DAG) of asynchronous tasks where each task contains nested data parallel algorithms. This five-year plan includes R&D required to fully and performance portably exploit thread parallelism across current and anticipated next generation platforms (NGP). The Kokkos library is being evaluated by many projects exploring algorithms and code design for NGP. Some production libraries and applications such as Trilinos and LAMMPS have already committed to Kokkos as their foundation for manycore parallelism and performance portability. These five-year requirements include support required for current and anticipated ASC projects to be effective and productive in their use of Kokkos on NGP. The greatest risk to the success of Kokkos and ASC projects relying upon Kokkos is a lack of staffing resources to support Kokkos to the degree needed by these ASC projects. This support includes up-to-date tutorials, documentation, multi-platform (hardware and software stack) testing, minor feature enhancements, thread-scalable algorithm consulting, and managing collaborative R&D.
Engineered plant biomass feedstock particles
Dooley, James H [Federal Way, WA; Lanning, David N [Federal Way, WA; Broderick, Thomas F [Lake Forest Park, WA
2012-04-17
A new class of plant biomass feedstock particles characterized by consistent piece size and shape uniformity, high skeletal surface area, and good flow properties. The particles of plant biomass material having fibers aligned in a grain are characterized by a length dimension (L) aligned substantially parallel to the grain and defining a substantially uniform distance along the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L. In particular, the L × H dimensions define a pair of substantially parallel side surfaces characterized by substantially intact longitudinally arrayed fibers, the W × H dimensions define a pair of substantially parallel end surfaces characterized by crosscut fibers and end checking between fibers, and the L × W dimensions define a pair of substantially parallel top and bottom surfaces. The L × W surfaces of particles with L/H dimension ratios of 4:1 or less are further elaborated by surface checking between longitudinally arrayed fibers. The length dimension L is preferably aligned within 30° parallel to the grain, and more preferably within 10° parallel to the grain. The plant biomass material is preferably selected from among wood, agricultural crop residues, plantation grasses, hemp, bagasse, and bamboo.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Small, E.; Desimone, D.
Deglaciation of the Hoosic River drainage basin in southwestern Vermont was more complex than previously described. Detailed surficial mapping, stratigraphic relationships, and terrace levels/delta elevations reveal new details in the chronology of glacial Lake Bascom: (1) a pre-Wisconsinan proglacial lake was present in a similar position to Lake Bascom as ice advanced; (2) the northern margin of 275m (900 ft) glacial Lake Bascom extended 10 km up the Vermont Valley; (3) the 215m (705 ft) Bascom level was stable and long lived; (4) intermediate water planes existed between 215m and 190m (625 ft) levels; and (5) a separate ice tongue existed in Shaftsbury Hollow damming a small glacial lake, here named glacial Lake Emmons. This information is used to correlate ice margins to different lake levels. Distance of ice margin retreat during a lake level can be measured. Lake levels are then used as control points on a Lake Bascom relative time line to compare rate of retreat of different ice tongues. Correlation of ice margins to Bascom levels indicates ice retreat was asynchronous between nearby tongues in southwestern Vermont. The Vermont Valley ice tongue retreated between two and four times faster than the Hoosic Valley tongue during the Bascom 275m level. Rate of retreat of the Vermont Valley tongue slowed to one-half of the Hoosic tongue during the 215m--190m lake levels. Factors responsible for varying rates of retreat are subglacial bedrock gradient, proximity to the Hudson-Champlain lobe, and the presence or absence of a calving margin. Asynchronous retreat produced splayed ice margins in southwestern Vermont. Findings from this study do not support the model of parallel, synchronous retreat proposed by many workers for this region.
Scalable asynchronous execution of cellular automata
NASA Astrophysics Data System (ADS)
Folino, Gianluigi; Giordano, Andrea; Mastroianni, Carlo
2016-10-01
The performance and scalability of cellular automata, when executed on parallel/distributed machines, are limited by the necessity of synchronizing all the nodes at each time step, i.e., a node can execute only after the execution of the previous step at all the other nodes. However, these synchronization requirements can be relaxed: a node can execute one step after synchronizing only with the adjacent nodes. In this fashion, different nodes can execute different time steps. This can be notably advantageous in many novel and increasingly popular applications of cellular automata, such as smart city applications, simulation of natural phenomena, etc., in which the execution times can be different and variable, due to the heterogeneity of machines and/or data and/or executed functions. Indeed, a longer execution time at a node does not slow down the execution at all the other nodes but only at the neighboring nodes. This is particularly advantageous when the nodes that act as bottlenecks vary during the application execution. The goal of the paper is to analyze the benefits that can be achieved with the described asynchronous implementation of cellular automata, when compared to the classical all-to-all synchronization pattern. The performance and scalability have been evaluated through a Petri net model, as this model is very useful to represent the synchronization barrier among nodes. We examined the usual case in which the territory is partitioned into a number of regions, and the computation associated with a region is assigned to a computing node. We considered both the cases of mono-dimensional and two-dimensional partitioning. The results show that the advantage obtained through the asynchronous execution, when compared to the all-to-all synchronous approach, is notable, and it can be as large as 90% in terms of speedup.
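The difference between the two synchronization patterns can be sketched with a simple timing recurrence. This is only an illustration under stated assumptions, not the Petri net model used in the paper: the function `finish_times`, the one-dimensional non-periodic partitioning, and the per-step durations are hypothetical.

```python
def finish_times(durations, neighbor_sync):
    """Return the time at which each node finishes its last step.

    durations[t][i] is the (assumed known) time node i needs for step t.
    With neighbor_sync=False every node waits for a global barrier;
    with neighbor_sync=True a node waits only for itself and its
    adjacent nodes in a 1D, non-periodic partitioning.
    """
    n = len(durations[0])
    done = [0.0] * n
    for step in durations:
        if neighbor_sync:
            start = [max(done[max(i - 1, 0): i + 2]) for i in range(n)]
        else:
            start = [max(done)] * n
        done = [s + d for s, d in zip(start, step)]
    return done

# Hypothetical workload in which the bottleneck moves from node 0 to node 4.
durations = [[5, 1, 1, 1, 1],
             [1, 1, 1, 1, 5]]
barrier_makespan = max(finish_times(durations, neighbor_sync=False))
neighbor_makespan = max(finish_times(durations, neighbor_sync=True))
```

In this toy case the global barrier forces node 4 to wait for the slow first step of node 0, so the run takes 10 time units, whereas neighbor-only synchronization finishes in 6 because the two bottlenecks are far apart.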
Acceptability of an Asynchronous Learning Forum on Mobile Devices
ERIC Educational Resources Information Center
Chang, Chih-Kai
2010-01-01
Mobile learning has recently become noteworthy because mobile devices have become popular. To construct an asynchronous learning forum on mobile devices is important because an asynchronous learning forum is always an essential part of networked asynchronous distance learning. However, the input interface in handheld learning devices, which is…
ERIC Educational Resources Information Center
Borup, Jered; West, Richard E.; Graham, Charles R.
2013-01-01
Online courses are increasingly using asynchronous video communication. However, little is known about how asynchronous video communication influences students' communication patterns. This study presents four narratives of students with varying characteristics who engaged in asynchronous video communication. The extrovert valued the efficiency of…
Parallelization Issues and Particle-In-Cell Codes.
NASA Astrophysics Data System (ADS)
Elster, Anne Cathrine
1994-01-01
"Everything should be made as simple as possible, but not simpler." Albert Einstein. The field of parallel scientific computing has concentrated on parallelization of individual modules such as matrix solvers and factorizers. However, many applications involve several interacting modules. Our analyses of a particle-in-cell code modeling charged particles in an electric field, show that these accompanying dependencies affect data partitioning and lead to new parallelization strategies concerning processor, memory and cache utilization. Our test-bed, a KSR1, is a distributed memory machine with a globally shared addressing space. However, most of the new methods presented hold generally for hierarchical and/or distributed memory systems. We introduce a novel approach that uses dual pointers on the local particle arrays to keep the particle locations automatically partially sorted. Complexity and performance analyses with accompanying KSR benchmarks, have been included for both this scheme and for the traditional replicated grids approach. The latter approach maintains load-balance with respect to particles. However, our results demonstrate it fails to scale properly for problems with large grids (say, greater than 128-by-128) running on as few as 15 KSR nodes, since the extra storage and computation time associated with adding the grid copies, becomes significant. Our grid partitioning scheme, although harder to implement, does not need to replicate the whole grid. Consequently, it scales well for large problems on highly parallel systems. It may, however, require load balancing schemes for non-uniform particle distributions. Our dual pointer approach may facilitate this through dynamically partitioned grids. We also introduce hierarchical data structures that store neighboring grid-points within the same cache -line by reordering the grid indexing. This alignment produces a 25% savings in cache-hits for a 4-by-4 cache. 
A consideration of the input data's effect on the simulation may lead to further improvements. For example, in the case of mean particle drift, it is often advantageous to partition the grid primarily along the direction of the drift. The particle-in-cell codes for this study were tested using physical parameters, which lead to predictable phenomena including plasma oscillations and two-stream instabilities. An overview of the most central references related to parallel particle codes is also given.
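The grid reordering mentioned in the abstract above, which keeps neighboring grid points in the same cache line, can be sketched as a tiled index mapping. The thesis does not give its exact layout here, so the 4-by-4 tile, the 16-element cache line, the grid width `NX`, and the function names below are all assumptions made for illustration.

```python
NX = 128    # hypothetical grid width, assumed divisible by TILE
TILE = 4    # hypothetical tile edge; one tile = 16 grid points = one cache line

def row_major_index(ix, iy, nx=NX):
    """Conventional layout: consecutive ix values are adjacent in memory."""
    return iy * nx + ix

def tiled_index(ix, iy, nx=NX, tile=TILE):
    """Reordered layout: each tile-by-tile block of grid points is contiguous."""
    tile_id = (iy // tile) * (nx // tile) + (ix // tile)
    within = (iy % tile) * tile + (ix % tile)
    return tile_id * tile * tile + within

def cache_lines_touched(index_fn, ix, iy, line_elems=TILE * TILE):
    """Count distinct cache lines holding the four corners of grid cell (ix, iy)."""
    corners = [(ix, iy), (ix + 1, iy), (ix, iy + 1), (ix + 1, iy + 1)]
    return len({index_fn(cx, cy) // line_elems for cx, cy in corners})
```

For a particle in cell (0, 0), the row-major layout spreads the four surrounding grid points over two cache lines, while the tiled layout keeps them in one. Cells that straddle a tile boundary still touch more than one line, so the benefit depends on the tile size and on how particles are distributed.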
Holzinger, Dennis; Koch, Iris; Burgard, Stefan; Ehresmann, Arno
2015-07-28
An approach for a remotely controllable transport of magnetic micro- and/or nanoparticles above a topographically flat exchange-bias (EB) thin film system, magnetically patterned into parallel stripe domains, is presented where the particle manipulation is achieved by sub-mT external magnetic field pulses. Superparamagnetic core-shell particles are moved stepwise by the dynamic transformation of the particles' magnetic potential energy landscape due to the external magnetic field pulses without affecting the magnetic state of the thin film system. The magnetic particle velocity is adjustable in the range of 1-100 μm/s by the design of the substrate's magnetic field landscape (MFL), the particle-substrate distance, and the magnitude of the applied external magnetic field pulses. The agglomeration of magnetic particles is avoided by the intrinsic magnetostatic repulsion of particles due to the parallel alignment of the particles' magnetic moments perpendicular to the transport direction and parallel to the surface normal of the substrate during the particle motion. The transport mechanism is modeled by a quantitative theory based on the precise knowledge of the sample's MFL and the particle-substrate distance.
Fast quantum Monte Carlo on a GPU
NASA Astrophysics Data System (ADS)
Lutsyshyn, Y.
2015-02-01
We present a scheme for the parallelization of quantum Monte Carlo method on graphical processing units, focusing on variational Monte Carlo simulation of bosonic systems. We use asynchronous execution schemes with shared memory persistence, and obtain an excellent utilization of the accelerator. The CUDA code is provided along with a package that simulates liquid helium-4. The program was benchmarked on several models of Nvidia GPU, including Fermi GTX560 and M2090, and the Kepler architecture K20 GPU. Special optimization was developed for the Kepler cards, including placement of data structures in the register space of the Kepler GPUs. Kepler-specific optimization is discussed.
Rectangular Array Of Digital Processors For Planning Paths
NASA Technical Reports Server (NTRS)
Kemeny, Sabrina E.; Fossum, Eric R.; Nixon, Robert H.
1993-01-01
Prototype 24 x 25 rectangular array of asynchronous parallel digital processors rapidly finds best path across two-dimensional field, which could be patch of terrain traversed by robotic or military vehicle. Implemented as single-chip very-large-scale integrated circuit. Excepting processors on edges, each processor communicates with four nearest neighbors along paths representing travel to north, south, east, and west. Each processor contains delay generator in form of 8-bit ripple counter, preset to 1 of 256 possible values. Operation begins with choice of processor representing starting point. Transmits signals to nearest neighbor processors, which retransmits to other neighboring processors, and process repeats until signals propagated across entire field.
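The signal propagation described above can be mimicked in software. The following is a minimal sketch, assuming that each processor's preset delay acts as the cost of entering that cell and that the best path is the one along which the signal first arrives; the brief does not spell out either point, and the function names and the small delay grid are hypothetical.

```python
import heapq

def first_arrival(delays, start):
    """Simulate the wavefront: return first-arrival times and the neighbor each signal came from."""
    rows, cols = len(delays), len(delays[0])
    arrival = {start: 0}
    came_from = {}
    heap = [(0, start)]
    while heap:
        t, (r, c) = heapq.heappop(heap)
        if t > arrival[(r, c)]:
            continue  # a faster signal already reached this cell
        for dr, dc in ((-1, 0), (1, 0), (0, 1), (0, -1)):  # north, south, east, west
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nt = t + delays[nr][nc]
                if nt < arrival.get((nr, nc), float("inf")):
                    arrival[(nr, nc)] = nt
                    came_from[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nt, (nr, nc)))
    return arrival, came_from

def trace_path(came_from, start, goal):
    """Walk back from the goal to the start along first-arrival links."""
    path = [goal]
    while path[-1] != start:
        path.append(came_from[path[-1]])
    return path[::-1]

# Hypothetical 3 x 3 terrain: the middle column is slow except at the bottom row.
delays = [[1, 9, 1],
          [1, 9, 1],
          [1, 1, 1]]
arrival, came_from = first_arrival(delays, (0, 0))
path = trace_path(came_from, (0, 0), (0, 2))
```

Here the signal reaches the top-right cell first by going around the slow column, so the recovered path detours through the bottom row. The hardware does this concurrently in every cell; the priority queue only reproduces the order in which the signals would arrive.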
Buxton, Eric C
2014-02-12
To evaluate and compare pharmacists' satisfaction with the content and learning environment of a continuing education program series offered as either synchronous or asynchronous webinars. An 8-lecture series of online presentations on the topic of new drug therapies was offered to pharmacists in synchronous and asynchronous webinar formats. Participants completed a 50-question online survey at the end of the program series to evaluate their perceptions of the distance learning experience. Eighty-two participants completed the survey instrument (41 participants from the live webinar series and 41 participants from the asynchronous webinar series.) Responses indicated that while both groups were satisfied with the program content, the asynchronous group showed greater satisfaction with many aspects of the learning environment. The synchronous and asynchronous webinar participants responded positively regarding the quality of the programming and the method of delivery, but asynchronous participants rated their experience more positively overall.
2014-01-01
Objective. To evaluate and compare pharmacists’ satisfaction with the content and learning environment of a continuing education program series offered as either synchronous or asynchronous webinars. Methods. An 8-lecture series of online presentations on the topic of new drug therapies was offered to pharmacists in synchronous and asynchronous webinar formats. Participants completed a 50-question online survey at the end of the program series to evaluate their perceptions of the distance learning experience. Results. Eighty-two participants completed the survey instrument (41 participants from the live webinar series and 41 participants from the asynchronous webinar series.) Responses indicated that while both groups were satisfied with the program content, the asynchronous group showed greater satisfaction with many aspects of the learning environment. Conclusion. The synchronous and asynchronous webinar participants responded positively regarding the quality of the programming and the method of delivery, but asynchronous participants rated their experience more positively overall. PMID:24558276
Comparing the force ripple during asynchronous and conventional stimulation.
Downey, Ryan J; Tate, Mark; Kawai, Hiroyuki; Dixon, Warren E
2014-10-01
Asynchronous stimulation has been shown to reduce fatigue during electrical stimulation; however, it may also exhibit a force ripple. We quantified the ripple during asynchronous and conventional single-channel transcutaneous stimulation across a range of stimulation frequencies. The ripple was measured during 5 asynchronous stimulation protocols, 2 conventional stimulation protocols, and 3 volitional contractions in 12 healthy individuals. Conventional 40 Hz and asynchronous 16 Hz stimulation were found to induce contractions that were as smooth as volitional contractions. Asynchronous 8, 10, and 12 Hz stimulation induced contractions with significant ripple. Lower stimulation frequencies can reduce fatigue; however, they may also lead to increased ripple. Future efforts should study the relationship between force ripple and the smoothness of the evoked movements in addition to the relationship between stimulation frequency and NMES-induced fatigue to elucidate an optimal stimulation frequency for asynchronous stimulation. © 2014 Wiley Periodicals, Inc.
Particle simulation of plasmas on the massively parallel processor
NASA Technical Reports Server (NTRS)
Gledhill, I. M. A.; Storey, L. R. O.
1987-01-01
Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
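The Eulerian storage scheme described above can be sketched in a few lines. This is only an illustrative serial Python model, not the authors' SIMD implementation: the names `cell_of` and `step`, the tuple layout `(x, y, vx, vy)`, and the use of a dictionary to stand in for per-processor local memories are all assumptions made for the example. Only the 128 x 128 grid and the periodic boundaries come from the abstract.

```python
from collections import defaultdict

GRID = 128  # grid size taken from the abstract (128 x 128)

def cell_of(x, y):
    """Return the grid cell that owns position (x, y), with periodic wrap."""
    return (int(x) % GRID, int(y) % GRID)

def step(cells, dt):
    """Advance every particle and re-file it under the cell it now occupies.

    `cells` maps a cell index to a list of (x, y, vx, vy) tuples; each list
    stands in for the local memory of one processor (a hypothetical layout).
    """
    moved = defaultdict(list)
    for particles in cells.values():
        for (x, y, vx, vy) in particles:
            # Periodic boundary conditions: positions wrap around the grid.
            x = (x + vx * dt) % GRID
            y = (y + vy * dt) % GRID
            # Eulerian storage: the particle's data follows it to the new cell.
            moved[cell_of(x, y)].append((x, y, vx, vy))
    return moved
```

For instance, a particle at x = 127.5 moving with unit velocity for one time unit wraps to x = 0.5 and is stored under cell (0, 0) afterwards.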
Synchronous Office Hours in an Asynchronous Course: Making the Connection
ERIC Educational Resources Information Center
Gibbons-Kunka, Beatrice
2017-01-01
The notion of synchronous office hours in an asynchronous course seems counterintuitive. After all, one of the tenets of asynchronous education is to not require students to be online and participating at any time during the course. Having taught higher education online asynchronous courses for twenty years, the researcher experimented with online…
Parallelization of a Monte Carlo particle transport simulation code
NASA Astrophysics Data System (ADS)
Hadjidoukas, P.; Bousis, C.; Emfietzoglou, D.
2010-05-01
We have developed a high performance version of the Monte Carlo particle transport simulation code MC4. The original application code, developed in Visual Basic for Applications (VBA) for Microsoft Excel, was first rewritten in the C programming language for improving code portability. Several pseudo-random number generators have also been integrated and studied. The new MC4 version was then parallelized for shared and distributed-memory multiprocessor systems using the Message Passing Interface. Two parallel pseudo-random number generator libraries (SPRNG and DCMT) have been seamlessly integrated. The performance speedup of parallel MC4 has been studied on a variety of parallel computing architectures including an Intel Xeon server with 4 dual-core processors, a Sun cluster consisting of 16 nodes of 2 dual-core AMD Opteron processors and a 200 dual-processor HP cluster. For large problem size, which is limited only by the physical memory of the multiprocessor server, the speedup results are almost linear on all systems. We have validated the parallel implementation against the serial VBA and C implementations using the same random number generator. Our experimental results on the transport and energy loss of electrons in a water medium show that the serial and parallel codes are equivalent in accuracy. The present improvements allow for studying of higher particle energies with the use of more accurate physical models, and improve statistics as more particle tracks can be simulated in low response time.
Scalable Domain Decomposed Monte Carlo Particle Transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Brien, Matthew Joseph
2013-12-05
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.
An epigenetic state associated with areas of gene duplication
Gimelbrant, Alexander A.; Chess, Andrew
2006-01-01
Asynchronous DNA replication is an epigenetically determined feature found in all cases of monoallelic expression, including genomic imprinting, X-inactivation, and random monoallelic expression of autosomal genes such as immunoglobulins and olfactory receptor genes. Most genes of the latter class were identified in experiments focused on genes functioning in the chemosensory and immune systems. We performed an unbiased survey of asynchronous replication in the mouse genome, excluding known asynchronously replicated genes. Fully 10% (eight of 80) of the genes tested exhibited asynchronous replication. A common feature of the newly identified asynchronously replicated areas is their proximity to areas of tandem gene duplication. Testing of other clustered areas supported the idea that such regions are enriched with asynchronously replicated genes. PMID:16687731
Scalable Domain Decomposed Monte Carlo Particle Transport
NASA Astrophysics Data System (ADS)
O'Brien, Matthew Joseph
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are:
• Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node.
• Load Balancing: keeps the workload per processor as even as possible so the calculation runs efficiently.
• Global Particle Find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain.
• Visualizing constructive solid geometry, sourcing particles, deciding that particle streaming communication is completed and spatial redecomposition.
These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms up to 2 million MPI processes on the Sequoia supercomputer.
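The idea behind the Global Particle Find step can be illustrated with a toy, single-process sketch. It is not the dissertation's algorithm: the one-dimensional slab decomposition, the `boundaries` list, and the function names `owner` and `global_find` are assumptions chosen for brevity, and the "ranks" are plain dictionary keys rather than MPI processes.

```python
import bisect

def owner(x, boundaries):
    """Return the rank whose slab [boundaries[r], boundaries[r + 1]) contains x.

    Assumes boundaries is sorted and boundaries[0] <= x < boundaries[-1].
    """
    return bisect.bisect_right(boundaries, x) - 1

def global_find(particles_by_rank, boundaries):
    """Re-file every particle under the rank that owns its coordinate.

    `particles_by_rank` maps a rank to the particle coordinates it currently
    holds, some of which may belong to another rank's domain.
    """
    n_ranks = len(boundaries) - 1
    resolved = {rank: [] for rank in range(n_ranks)}
    for xs in particles_by_rank.values():
        for x in xs:
            resolved[owner(x, boundaries)].append(x)
    return resolved
```

With `boundaries = [0.0, 1.0, 2.0, 3.0]`, a particle at 2.5 that starts on rank 0 ends up on rank 2. A scalable version would have to avoid any single process seeing all particles, which this sketch does not attempt.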
Leveraging human oversight and intervention in large-scale parallel processing of open-source data
NASA Astrophysics Data System (ADS)
Casini, Enrico; Suri, Niranjan; Bradshaw, Jeffrey M.
2015-05-01
The popularity of cloud computing along with the increased availability of cheap storage have led to the necessity of elaboration and transformation of large volumes of open-source data, all in parallel. One way to handle such extensive volumes of information properly is to take advantage of distributed computing frameworks like Map-Reduce. Unfortunately, an entirely automated approach that excludes human intervention is often unpredictable and error prone. Highly accurate data processing and decision-making can be achieved by supporting an automatic process through human collaboration, in a variety of environments such as warfare, cyber security and threat monitoring. Although this mutual participation seems easily exploitable, human-machine collaboration in the field of data analysis presents several challenges. First, due to the asynchronous nature of human intervention, it is necessary to verify that once a correction is made, all the necessary reprocessing is done in chain. Second, it is often needed to minimize the amount of reprocessing in order to optimize the usage of resources due to limited availability. In order to improve on these strict requirements, this paper introduces improvements to an innovative approach for human-machine collaboration in the processing of large amounts of open-source data in parallel.
A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TETRAHEDRAL DOMAINS
Fu, Zhisong; Kirby, Robert M.; Whitaker, Ross T.
2014-01-01
Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers. PMID:25221418
A Phenomenological Synapse Model for Asynchronous Neurotransmitter Release
Wang, Tao; Yin, Luping; Zou, Xiaolong; Shu, Yousheng; Rasch, Malte J.; Wu, Si
2016-01-01
Neurons communicate with each other via synapses. Action potentials cause release of neurotransmitters at the axon terminal. Typically, this neurotransmitter release is tightly time-locked to the arrival of an action potential and is thus called synchronous release. However, neurotransmitter release is stochastic and the rate of release of small quanta of neurotransmitters can be considerably elevated even long after the ceasing of spiking activity, leading to asynchronous release of neurotransmitters. Such asynchronous release varies for tissue and neuron types and has been shown recently to be pronounced in fast-spiking neurons. Notably, it was found that asynchronous release is enhanced in human epileptic tissue implicating a possibly important role in generating abnormal neural activity. Current neural network models for simulating and studying neural activity virtually only consider synchronous release and ignore asynchronous transmitter release. Here, we develop a phenomenological model for asynchronous neurotransmitter release, which, on one hand, captures the fundamental features of the asynchronous release process, and, on the other hand, is simple enough to be incorporated in large-size network simulations. Our proposed model is based on the well-known equations for short-term dynamical synaptic interactions and includes an additional stochastic term for modeling asynchronous release. We use experimental data obtained from inhibitory fast-spiking synapses of human epileptic tissue to fit the model parameters, and demonstrate that our model reproduces the characteristics of realistic asynchronous transmitter release. PMID:26834617
Stochastic bifurcations in the nonlinear parallel Ising model.
Bagnoli, Franco; Rechtman, Raúl
2016-11-01
We investigate the phase transitions of a nonlinear, parallel version of the Ising model, characterized by an antiferromagnetic linear coupling and ferromagnetic nonlinear one. This model arises in problems of opinion formation. The mean-field approximation shows chaotic oscillations, by changing the couplings or the connectivity. The spatial model shows bifurcations in the average magnetization, similar to that seen in the mean-field approximation, induced by the change of the topology, after rewiring short-range to long-range connection, as predicted by the small-world effect. These coherent periodic and chaotic oscillations of the magnetization reflect a certain degree of synchronization of the spins, induced by long-range couplings. Similar bifurcations may be induced in the randomly connected model by changing the couplings or the connectivity and also the dilution (degree of asynchronism) of the updating. We also examined the effects of inhomogeneity, mixing ferromagnetic and antiferromagnetic coupling, which induces an unexpected bifurcation diagram with a "bubbling" behavior, as also happens for dilution.
Concurrent simulation of a parallel jaw end effector
NASA Technical Reports Server (NTRS)
Bynum, Bill
1985-01-01
A system of programs developed to aid in the design and development of the command/response protocol between a parallel jaw end effector and the strategic planner program controlling it is presented. The system executes concurrently with the LISP controlling program to generate a graphical image of the end effector that moves in approximately real time in response to commands sent from the controlling program. Concurrent execution of the simulation program is useful for revealing flaws in the communication command structure arising from the asynchronous nature of the message traffic between the end effector and the strategic planner. Software simulation helps to minimize the number of hardware changes necessary to the microprocessor driving the end effector because of changes in the communication protocol. The simulation of other actuator devices can be easily incorporated into the system of programs by using the underlying support that was developed for the concurrent execution of the simulation process and the communication between it and the controlling program.
Magnetophoretic circuits for digital control of single particles and cells
NASA Astrophysics Data System (ADS)
Lim, Byeonghwa; Reddy, Venu; Hu, Xinghao; Kim, Kunwoo; Jadhav, Mital; Abedini-Nassab, Roozbeh; Noh, Young-Woock; Lim, Yong Taik; Yellen, Benjamin B.; Kim, Cheolgi
2014-05-01
The ability to manipulate small fluid droplets, colloidal particles and single cells with the precision and parallelization of modern-day computer hardware has profound applications for biochemical detection, gene sequencing, chemical synthesis and highly parallel analysis of single cells. Drawing inspiration from general circuit theory and magnetic bubble technology, here we demonstrate a class of integrated circuits for executing sequential and parallel, timed operations on an ensemble of single particles and cells. The integrated circuits are constructed from lithographically defined, overlaid patterns of magnetic film and current lines. The magnetic patterns passively control particles similar to electrical conductors, diodes and capacitors. The current lines actively switch particles between different tracks similar to gated electrical transistors. When combined into arrays and driven by a rotating magnetic field clock, these integrated circuits have general multiplexing properties and enable the precise control of magnetizable objects.
Simulation of Hypervelocity Impact on Aluminum-Nextel-Kevlar Orbital Debris Shields
NASA Technical Reports Server (NTRS)
Fahrenthold, Eric P.
2000-01-01
An improved hybrid particle-finite element method has been developed for hypervelocity impact simulation. The method combines the general contact-impact capabilities of particle codes with the true Lagrangian kinematics of large strain finite element formulations. Unlike some alternative schemes which couple Lagrangian finite element models with smooth particle hydrodynamics, the present formulation makes no use of slidelines or penalty forces. The method has been implemented in a parallel, three dimensional computer code. Simulations of three dimensional orbital debris impact problems using this parallel hybrid particle-finite element code show good agreement with experiment and good speedup in parallel computation. The simulations included single and multi-plate shields as well as aluminum and composite shielding materials, at an impact velocity of eleven kilometers per second.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Shaojie, E-mail: wangsj@ustc.edu.cn
It is found that the Lorentz force generated by the magnetic drift drives a generic plasma pinch flux of particle, energy and momentum through the Stokes-Einstein relation. The proposed theoretical model applies for both electrons and ions, trapped particles, and passing particles. An anomalous parallel current pinch due to the electrostatic turbulence with long parallel wave-length is predicted.
Distributed spatial information integration based on web service
NASA Astrophysics Data System (ADS)
Tong, Hengjian; Zhang, Yun; Shao, Zhenfeng
2008-10-01
Spatial information systems and spatial information in different geographic locations usually belong to different organizations. They are distributed and often heterogeneous and independent from each other. This leads to the fact that many isolated spatial information islands are formed, reducing the efficiency of information utilization. In order to address this issue, we present a method for effective spatial information integration based on web service. The method applies asynchronous invocation of web service and dynamic invocation of web service to implement distributed, parallel execution of web map services. All isolated information islands are connected by the dispatcher of web service and its registration database to form a uniform collaborative system. According to the web service registration database, the dispatcher of web services can dynamically invoke each web map service through an asynchronous delegating mechanism. All of the web map services can be executed at the same time. When each web map service is done, an image will be returned to the dispatcher. After all of the web services are done, all images are transparently overlaid together in the dispatcher. Thus, users can browse and analyze the integrated spatial information. Experiments demonstrate that the utilization rate of spatial information resources is significantly raised through the proposed method of distributed spatial information integration.
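The dispatcher pattern in this abstract (invoke every registered map service concurrently, wait for all images, then overlay them) can be sketched with Python's `asyncio`. Everything concrete here is assumed for illustration: `fetch_layer` is a stand-in that sleeps instead of calling a real web map service, the `registry` dictionary plays the role of the registration database, and string concatenation replaces the transparent image overlay.

```python
import asyncio

async def fetch_layer(name, delay):
    """Stand-in for one web map service call; returns a fake image layer."""
    await asyncio.sleep(delay)  # simulates network and rendering latency
    return f"image:{name}"

async def dispatch(registry):
    """Invoke every registered service concurrently, then overlay the results."""
    calls = [fetch_layer(name, delay) for name, delay in registry.items()]
    # gather() runs the calls concurrently and returns results in call order
    # once all of them have finished.
    layers = await asyncio.gather(*calls)
    return " + ".join(layers)  # placeholder for the transparent overlay

def run(registry):
    """Synchronous entry point for the hypothetical dispatcher."""
    return asyncio.run(dispatch(registry))
```

Calling `run({"roads": 0.02, "rivers": 0.01})` returns `"image:roads + image:rivers"`; the total wait is governed by the slowest service rather than by the sum of all delays.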
Distributed spatial information integration based on web service
NASA Astrophysics Data System (ADS)
Tong, Hengjian; Zhang, Yun; Shao, Zhenfeng
2009-10-01
Spatial information systems and spatial information in different geographic locations usually belong to different organizations. They are distributed and often heterogeneous and independent from each other. This leads to the fact that many isolated spatial information islands are formed, reducing the efficiency of information utilization. In order to address this issue, we present a method for effective spatial information integration based on web service. The method applies asynchronous invocation of web service and dynamic invocation of web service to implement distributed, parallel execution of web map services. All isolated information islands are connected by the dispatcher of web service and its registration database to form a uniform collaborative system. According to the web service registration database, the dispatcher of web services can dynamically invoke each web map service through an asynchronous delegating mechanism. All of the web map services can be executed at the same time. When each web map service is done, an image will be returned to the dispatcher. After all of the web services are done, all images are transparently overlaid together in the dispatcher. Thus, users can browse and analyze the integrated spatial information. Experiments demonstrate that the utilization rate of spatial information resources is significantly raised through the proposed method of distributed spatial information integration.
Designing Asynchronous Communication Tools for Optimization of Patient-Clinician Coordination
Eschler, Jordan; Liu, Leslie S.; Vizer, Lisa M.; McClure, Jennifer B.; Lozano, Paula; Pratt, Wanda; Ralston, James D.
2015-01-01
Asynchronous communication outside the clinical setting has both enriched and complicated patient-clinician interactions. Many patients can now interact with a patient portal 24 hours a day, asking questions of their clinicians via secure message, checking lab results, ordering medication refills, or making appointments. However, the mode of communication (asynchronous) and the nature of the interaction (lacking tone or body language) strip valuable information from each side of patient-clinician asynchronous communication. Using interviews with 34 individuals who actively manage a chronic illness of their own, or for a child or partner, we elicited narratives about patients’ experiences and expectations for using asynchronous communication to address medical issues with their clinicians. Based on these perspectives, we present opportunities for designing asynchronous communication tools to better facilitate understanding of and coordination around care activities between patients and clinicians. PMID:26958188
2007-01-01
Objectives. To compare students' performance and course evaluations for a pharmacogenetic pharmacotherapy course taught by synchronous videoconferencing method via the Internet and for the same course taught via asynchronous video streaming via the Internet. Methods. In spring 2005, a pharmacogenetic therapy course was taught to 73 students located on Amarillo, Lubbock, and Dallas campuses using synchronous videoconferencing, and in spring 2006, to 78 students located on the same 3 campuses using asynchronous video streaming. A course evaluation was administered to each group at the end of the courses. Results. Students in the asynchronous setting had final course grades of 89% ± 7% compared to the mean final course grade of 87% ± 7% in the synchronous group (p = 0.05). Regardless of which technology was used, average course grades did not differ significantly among the 3 campus sites. Significantly more of the students in the asynchronous setting agreed (57%) with the statement that they could read the lecture notes and absorb the content on their own without attending the class than students in the synchronous class (23%; chi-square test; p < 0.001). Conclusions. Students in both asynchronous and synchronous settings performed well. However, students taught using asynchronous videotaped lectures had lower satisfaction with the method of content delivery, and preferred live interactive sessions or a mix of interactive sessions and asynchronous videos over delivery of content using the synchronous or asynchronous method alone. PMID:17429516
NASA Astrophysics Data System (ADS)
Fukushige, Toshiyuki; Taiji, Makoto; Makino, Junichiro; Ebisuzaki, Toshikazu; Sugimoto, Daiichiro
1996-09-01
We have developed a parallel, pipelined special-purpose computer for N-body simulations, MD-GRAPE (for "GRAvity PipE"). In gravitational N-body simulations, almost all computing time is spent on the calculation of interactions between particles. GRAPE is specialized hardware to calculate these interactions. It is used with a general-purpose front-end computer that performs all calculations other than the force calculation. MD-GRAPE is the first parallel GRAPE that can calculate an arbitrary central force. A force different from a pure 1/r potential is necessary for N-body simulations with periodic boundary conditions using the Ewald or particle-particle/particle-mesh (P^3M) method. MD-GRAPE accelerates the calculation of particle-particle force for these algorithms. An MD-GRAPE board has four MD chips and its peak performance is 4.2 GFLOPS. On an MD-GRAPE board, a cosmological N-body simulation takes 600 (N/10^6)^(3/2) s per step for the Ewald method, where N is the number of particles, and would take 240 (N/10^6) s per step for the P^3M method, in a uniform distribution of particles.
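The two per-step timing estimates quoted above are easy to compare numerically. The short Python sketch below simply evaluates them, reading the coefficients as 600 s and 240 s at N = 10^6; the function names are made up for this example and nothing here models the hardware itself.

```python
def ewald_step_seconds(n):
    """Quoted per-step cost for the Ewald method: 600 (N / 10^6)^(3/2) s."""
    return 600.0 * (n / 1e6) ** 1.5

def p3m_step_seconds(n):
    """Quoted per-step cost for the P^3M method: 240 (N / 10^6) s."""
    return 240.0 * (n / 1e6)

def ewald_to_p3m_ratio(n):
    """How many times slower the Ewald estimate is than the P^3M estimate."""
    return ewald_step_seconds(n) / p3m_step_seconds(n)
```

At N = 10^6 the estimates are 600 s and 240 s per step. Because the Ewald cost grows as N^(3/2) while the P^3M cost grows linearly, the ratio scales as the square root of N: quadrupling N to 4 x 10^6 gives 4800 s against 960 s.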
Building asynchronous geospatial processing workflows with web services
NASA Astrophysics Data System (ADS)
Zhao, Peisheng; Di, Liping; Yu, Genong
2012-02-01
Geoscience research and applications often involve a geospatial processing workflow. This workflow includes a sequence of operations that use a variety of tools to collect, translate, and analyze distributed heterogeneous geospatial data. Asynchronous mechanisms, by which clients initiate a request and then resume their processing without waiting for a response, are very useful for complicated workflows that take a long time to run. Geospatial contents and capabilities are increasingly becoming available online as interoperable Web services. This online availability significantly enhances the ability to use Web service chains to build distributed geospatial processing workflows. This paper focuses on how to orchestrate Web services for implementing asynchronous geospatial processing workflows. The theoretical bases for asynchronous Web services and workflows, including asynchrony patterns and message transmission, are examined to explore different asynchronous approaches to and architecture of workflow code for the support of asynchronous behavior. A sample geospatial processing workflow, issued by the Open Geospatial Consortium (OGC) Web Service, Phase 6 (OWS-6), is provided to illustrate the implementation of asynchronous geospatial processing workflows and the challenges in using Web Services Business Process Execution Language (WS-BPEL) to develop them.
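The asynchronous mechanism described here, in which a client initiates a request and resumes its own processing without waiting for the response, can be illustrated with a small Python sketch. It is not WS-BPEL and does not call any real OGC service: `long_running_service` is a hypothetical stand-in that sleeps briefly, and a thread pool plays the role of the remote endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def long_running_service(x):
    """Hypothetical stand-in for a slow geospatial processing service."""
    time.sleep(0.05)
    return x * 2

def client(x):
    """Send a request, keep working, and collect the response afterwards."""
    log = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(long_running_service, x)  # initiate the request
        log.append("request sent")
        log.append("client keeps working")  # not blocked by the service call
        result = future.result()  # pick up the response when it is needed
        log.append(f"response: {result}")
    return log
```

Calling `client(21)` returns `["request sent", "client keeps working", "response: 42"]`. In a real workflow the response would more likely arrive through a callback or a polling endpoint than through an in-process future.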
Spike train generation and current-to-frequency conversion in silicon diodes
NASA Technical Reports Server (NTRS)
Coon, D. D.; Perera, A. G. U.
1989-01-01
A device physics model is developed to analyze spontaneous neuron-like spike train generation in current driven silicon p(+)-n-n(+) devices in cryogenic environments. The model is shown to explain the very high dynamic range (10 to the 7th) current-to-frequency conversion and experimental features of the spike train frequency as a function of input current. The devices are interesting components for implementation of parallel asynchronous processing adjacent to cryogenically cooled focal planes because of their extremely low current and power requirements, their electronic simplicity, and their pulse coding capability, and could be used to form the hardware basis for neural networks which employ biologically plausible means of information coding.
Real-Time Cognitive Computing Architecture for Data Fusion in a Dynamic Environment
NASA Technical Reports Server (NTRS)
Duong, Tuan A.; Duong, Vu A.
2012-01-01
A novel cognitive computing architecture is conceptualized for processing multiple channels of multi-modal sensory data streams simultaneously, and fusing the information in real time to generate intelligent reaction sequences. This unique architecture is capable of assimilating parallel data streams that could be analog, digital, synchronous/asynchronous, and could be programmed to act as a knowledge synthesizer and/or an "intelligent perception" processor. In this architecture, the bio-inspired models of visual pathway and olfactory receptor processing are combined as processing components, to achieve the composite function of "searching for a source of food while avoiding the predator." The architecture is particularly suited for scene analysis from visual data and odorant.
Pteros: fast and easy to use open-source C++ library for molecular analysis.
Yesylevskyy, Semen O
2012-07-15
An open-source Pteros library for molecular modeling and analysis of molecular dynamics trajectories for C++ programming language is introduced. Pteros provides a number of routine analysis operations ranging from reading and writing trajectory files and geometry transformations to structural alignment and computation of nonbonded interaction energies. The library features asynchronous trajectory reading and parallel execution of several analysis routines, which greatly simplifies development of computationally intensive trajectory analysis algorithms. Pteros programming interface is very simple and intuitive while the source code is well documented and easily extendible. Pteros is available for free under open-source Artistic License from http://sourceforge.net/projects/pteros/. Copyright © 2012 Wiley Periodicals, Inc.
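Asynchronous trajectory reading generally means that one thread loads frames while another analyzes frames already in memory. The sketch below shows that producer and consumer pattern in plain Python. It does not use or imitate the Pteros API: `read_frames`, `analyze`, and the `buffer_size` parameter are hypothetical names, and a frame is just any Python object other than `None`.

```python
import queue
import threading

def read_frames(source, q):
    """Producer: push frames into the queue, then a None sentinel."""
    for frame in source:
        q.put(frame)  # blocks while the buffer is full
    q.put(None)  # signals that no more frames will arrive

def analyze(source, fn, buffer_size=4):
    """Consumer: apply fn to each frame while the reader thread keeps loading."""
    q = queue.Queue(maxsize=buffer_size)
    reader = threading.Thread(target=read_frames, args=(source, q))
    reader.start()
    results = []
    while (frame := q.get()) is not None:
        results.append(fn(frame))
    reader.join()
    return results
```

For example, `analyze([[1, 2], [3, 4]], sum)` returns `[3, 7]`. The bounded queue keeps the reader from running arbitrarily far ahead of the analysis, which limits memory use for long trajectories.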
Plasma and energetic particle structure of a collisionless quasi-parallel shock
NASA Technical Reports Server (NTRS)
Kennel, C. F.; Scarf, F. L.; Coroniti, F. V.; Russell, C. T.; Smith, E. J.; Wenzel, K. P.; Reinhard, R.; Sanderson, T. R.; Feldman, W. C.; Parks, G. K.
1983-01-01
The quasi-parallel interplanetary shock of November 11-12, 1978 was studied from both the collisionless shock and energetic particle points of view using measurements of the interplanetary magnetic and electric fields, solar wind electrons, plasma and MHD waves, and intermediate and high energy ions obtained on ISEE-1, -2, and -3. The interplanetary environment through which the shock was propagating when it encountered the three spacecraft was characterized; the observations of this shock are documented and current theories of quasi-parallel shock structure and particle acceleration are tested. These observations tend to confirm present self consistent theories of first order Fermi acceleration by shocks and of collisionless shock dissipation involving firehose instability.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Zuwei; Zhao, Haibo, E-mail: klinsmannzhb@163.com; Zheng, Chuguang
2015-01-15
This paper proposes a comprehensive framework for accelerating population balance-Monte Carlo (PBMC) simulation of particle coagulation dynamics. By combining a Markov jump model, a weighted majorant kernel, and GPU (graphics processing unit) parallel computing, a significant gain in computational efficiency is achieved. The Markov jump model constructs a coagulation-rule matrix of differentially-weighted simulation particles, so as to capture the time evolution of the particle size distribution with low statistical noise over the full size range and to reduce the number of time loops as far as possible. Here three coagulation rules are highlighted, and it is found that constructing an appropriate coagulation rule provides a route to a compromise between the accuracy and cost of PBMC methods. Further, in order to avoid double looping over all simulation particles when considering two-particle events (typically, particle coagulation), the weighted majorant kernel is introduced to estimate the maximum coagulation rates used for acceptance–rejection processes by single-looping over all particles, and meanwhile the mean time-step of a coagulation event is estimated by summing the coagulation kernels of rejected and accepted particle pairs. The computational load of these fast differentially-weighted PBMC simulations (based on the Markov jump model) is reduced greatly, becoming proportional to the number of simulation particles in a zero-dimensional system (single cell). Finally, for a spatially inhomogeneous multi-dimensional (multi-cell) simulation, the proposed fast PBMC is performed in each cell, and multiple cells are processed in parallel by the multiple cores of a GPU, which can run massively threaded data-parallel tasks to obtain a remarkable speedup ratio (compared with CPU computation, the speedup ratio of GPU parallel computing is as high as 200 in a case of 100 cells with 10 000 simulation particles per cell). These acceleration approaches for PBMC are demonstrated in a physically realistic Brownian coagulation case. The computational accuracy is validated against the benchmark solution of the discrete-sectional method. The simulation results show that the comprehensive approach can attain a very favorable improvement in cost without sacrificing computational accuracy.
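The acceptance-rejection step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' differentially-weighted scheme: particles are equally weighted, the kernel is the continuum-regime Brownian kernel up to a constant factor, and the names `brownian_kernel`, `majorant`, and `pick_coagulation_pair` are hypothetical. What it shows is that an upper bound on the pair rate can be obtained from a single pass over the particles (here, from the largest and smallest volumes), so no double loop over all pairs is needed.

```python
import random


def brownian_kernel(vi, vj):
    """Continuum-regime Brownian kernel, up to a constant prefactor."""
    a, b = vi ** (1.0 / 3.0), vj ** (1.0 / 3.0)
    return (a + b) * (1.0 / a + 1.0 / b)


def majorant(volumes):
    """Upper bound on the kernel over all pairs, found in one pass.

    The kernel equals 2 + x + 1/x with x = (v_big / v_small)^(1/3) >= 1,
    which grows with x, so the extreme volume ratio bounds every pair.
    """
    r = (max(volumes) / min(volumes)) ** (1.0 / 3.0)
    return 2.0 + r + 1.0 / r


def pick_coagulation_pair(volumes, rng):
    """Draw candidate pairs uniformly; accept with probability K / K_max."""
    k_max = majorant(volumes)
    n = len(volumes)
    while True:
        i, j = rng.sample(range(n), 2)  # two distinct particle indices
        if rng.random() * k_max <= brownian_kernel(volumes[i], volumes[j]):
            return i, j
```

A call such as `pick_coagulation_pair([1.0, 2.0, 8.0, 27.0], random.Random(0))` returns one accepted pair of indices. A looser bound raises the rejection rate without changing the sampled distribution.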
Refractive multiple optical tweezers for parallel biochemical analysis in micro-fluidics
NASA Astrophysics Data System (ADS)
Merenda, Fabrice; Rohner, Johann; Pascoal, Pedro; Fournier, Jean-Marc; Vogel, Horst; Salathé, René-Paul
2007-02-01
We present a multiple laser tweezers system based on refractive optics. The system produces an array of 100 optical traps thanks to a refractive microlens array, whose focal plane is imaged into the focal plane of a high-NA microscope objective. This refractive multi-tweezers system is combined with micro-fluidics, with the aim of performing simultaneous biochemical reactions on ensembles of free floating objects. Micro-fluidics allows both transporting the particles to the trapping area and conveying biochemical reagents to the trapped particles. Parallel trapping in micro-fluidics is achieved with polystyrene beads as well as with native vesicles produced from mammalian cells. The traps can hold objects against fluid flows exceeding 100 micrometers per second. Parallel fluorescence excitation and detection on the ensemble of trapped particles is also demonstrated. Additionally, the system is capable of selectively and individually releasing particles from the tweezers array using a complementary steerable laser beam. Strategies for high-yield particle capture and individual particle release in a micro-fluidic environment are discussed. A comparison with diffractive optical tweezers highlights the pros and cons of refractive systems.
Cooled particle accelerator target
Degtiarenko, Pavel V.
2005-06-14
A novel particle beam target comprising: a rotating target disc mounted on a retainer and thermally coupled to a first array of spaced-apart parallel plate fins that extend radially inwardly from the retainer and mesh without physical contact with a second array of spaced-apart parallel plate fins that extend radially outwardly from and are thermally coupled to a cooling mechanism capable of removing heat from said second array of spaced-apart fins and located within the first array of spaced-apart parallel fins. Radiant thermal exchange between the two arrays of parallel plate fins provides removal of heat from the rotating disc. A method of cooling the rotating target is also described.
Collisions between quasi-parallel shocks
NASA Technical Reports Server (NTRS)
Cargill, Peter J.
1991-01-01
The collision between pairs of quasi-parallel shocks is examined using hybrid numerical simulations. In the interaction, the two shocks are transmitted through each other leaving behind a hot plasma with a population of particles with energies in excess of 40 E0, where E0 is the kinetic energy of particles in the shock frame prior to the collision. The energization is more efficient for quasi-parallel shocks than parallel shocks. Collisions between shocks of equal strengths are more efficient than those that are unequal. The results are of importance for phenomena during the impulsive phase of solar flares, in the distant solar wind and at planetary bow shocks.
A distributed, dynamic, parallel computational model: the role of noise in velocity storage
Merfeld, Daniel M.
2012-01-01
Networks of neurons perform complex calculations using distributed, parallel computation, including dynamic “real-time” calculations required for motion control. The brain must combine sensory signals to estimate the motion of body parts using imperfect information from noisy neurons. Models and experiments suggest that the brain sometimes optimally minimizes the influence of noise, although it remains unclear when and precisely how neurons perform such optimal computations. To investigate, we created a model of velocity storage based on a relatively new technique–“particle filtering”–that is both distributed and parallel. It extends existing observer and Kalman filter models of vestibular processing by simulating the observer model many times in parallel with noise added. During simulation, the variance of the particles defining the estimator state is used to compute the particle filter gain. We applied our model to estimate one-dimensional angular velocity during yaw rotation, which yielded estimates for the velocity storage time constant, afferent noise, and perceptual noise that matched experimental data. We also found that the velocity storage time constant was Bayesian optimal by comparing the estimate of our particle filter with the estimate of the Kalman filter, which is optimal. The particle filter demonstrated a reduced velocity storage time constant when afferent noise increased, which mimics what is known about aminoglycoside ablation of semicircular canal hair cells. This model helps bridge the gap between parallel distributed neural computation and systems-level behavioral responses like the vestibuloocular response and perception. PMID:22514288
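The idea that the spread of the particles sets the filter gain can be illustrated with a scalar update step. The sketch below is an assumption-laden simplification and not the model from the paper: it uses a Kalman-style gain formed from the particle variance and a measurement-noise variance, and the function name `particle_gain_update` and its parameters are hypothetical.

```python
import statistics


def particle_gain_update(particles, measurement, meas_var):
    """One scalar update: the gain comes from the particle variance.

    gain = P / (P + R), where P is the population variance of the
    particles and R is the assumed measurement-noise variance.
    """
    prior_var = statistics.pvariance(particles)
    gain = prior_var / (prior_var + meas_var)
    # Move every particle toward the measurement by the same gain.
    updated = [x + gain * (measurement - x) for x in particles]
    return gain, updated
```

With widely spread particles the gain approaches 1 and the measurement dominates; with tightly clustered particles the gain approaches 0 and the prior estimate dominates.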
NASA Astrophysics Data System (ADS)
Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng
2018-02-01
De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPUs). Then, we designed an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopted an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.
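Asynchronous double buffering, as mentioned in this abstract, overlaps the transfer of the next data block with the computation on the current one. The following Python sketch is only a CPU-thread analogue of that pattern, not the authors' CUDA multi-stream code; `process_double_buffered`, `load`, and `compute` are hypothetical names introduced for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def process_double_buffered(chunks, load, compute):
    """Overlap loading of chunk k+1 with computation on chunk k."""
    results = []
    if not chunks:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(load, chunks[0])      # fill the first buffer
        for nxt in chunks[1:]:
            current = pending.result()              # wait for the loaded buffer
            pending = pool.submit(load, nxt)        # start loading the next one
            results.append(compute(current))        # compute while it loads
        results.append(compute(pending.result()))   # drain the last buffer
    return results
```

Only two buffers are alive at any time (the one being computed on and the one being loaded), which is why the scheme is called double buffering.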
Using the Statecharts paradigm for simulation of patient flow in surgical care.
Sobolev, Boris; Harel, David; Vasilakis, Christos; Levy, Adrian
2008-03-01
Computer simulation of patient flow has been used extensively to assess the impacts of changes in the management of surgical care. However, little research is available on the utility of existing modeling techniques. The purpose of this paper is to examine the capacity of Statecharts, a system of graphical specification, for constructing a discrete-event simulation model of the perioperative process. The Statecharts specification paradigm was originally developed for representing reactive systems by extending the formalism of finite-state machines through notions of hierarchy, parallelism, and event broadcasting. Hierarchy permits subordination between states so that one state may contain other states. Parallelism permits more than one state to be active at any given time. Broadcasting of events allows one state to detect changes in another state. In the context of the peri-operative process, hierarchy provides the means to describe steps within activities and to cluster related activities, parallelism provides the means to specify concurrent activities, and event broadcasting provides the means to trigger a series of actions in one activity according to transitions that occur in another activity. Combined with hierarchy and parallelism, event broadcasting offers a convenient way to describe the interaction of concurrent activities. We applied the Statecharts formalism to describe the progress of individual patients through surgical care as a series of asynchronous updates in patient records generated in reaction to events produced by parallel finite-state machines representing concurrent clinical and managerial activities. We conclude that Statecharts capture successfully the behavioral aspects of surgical care delivery by specifying permissible chronology of events, conditions, and actions.
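The interaction of parallelism and event broadcasting described in this abstract can be sketched with two small finite-state machines that run side by side and react to each other's events. This is a toy illustration only: the states, events, and the `Machine` and `broadcast` names are hypothetical and are not taken from the paper's perioperative model, and hierarchy is omitted for brevity.

```python
from collections import deque


class Machine:
    """A flat finite-state machine that may emit an event on a transition."""

    def __init__(self, initial, transitions):
        # transitions: {(state, event): (next_state, emitted_event_or_None)}
        self.state = initial
        self.transitions = transitions

    def handle(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            return None  # this machine ignores the event in its current state
        self.state, emitted = self.transitions[key]
        return emitted


def broadcast(machines, event):
    """Deliver an event to every machine; emitted events are broadcast too."""
    queue = deque([event])
    while queue:
        current = queue.popleft()
        for machine in machines:
            emitted = machine.handle(current)
            if emitted is not None:
                queue.append(emitted)


# Two concurrent activities: a clinical one and a managerial one.
patient = Machine("waiting", {
    ("waiting", "start_surgery"): ("in_surgery", None),
    ("in_surgery", "end_surgery"): ("recovering", "surgery_done"),
})
bed = Machine("free", {
    ("free", "start_surgery"): ("reserved", None),
    ("reserved", "surgery_done"): ("occupied", None),
})
```

Calling `broadcast([patient, bed], "start_surgery")` and then `broadcast([patient, bed], "end_surgery")` moves both machines forward; the second machine never sees `end_surgery` as a trigger but reacts to the `surgery_done` event emitted by the first.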
Xiong, Wenjun; Patel, Ragini; Cao, Jinde; Zheng, Wei Xing
In this brief, our purpose is to apply asynchronous and intermittent sampled-data control methods to achieve the synchronization of hierarchical time-varying neural networks. The asynchronous and intermittent sampled-data controllers are proposed for two reasons: 1) the controllers may not transmit the control information simultaneously and 2) the controllers cannot always exist at any time. The synchronization is then discussed for a kind of hierarchical time-varying neural networks based on the asynchronous and intermittent sampled-data controllers. Finally, the simulation results are given to illustrate the usefulness of the developed criteria.
Digital Synchronizer without Metastability
NASA Technical Reports Server (NTRS)
Simle, Robert M.; Cavazos, Jose A.
2009-01-01
A proposed design for a digital synchronizing circuit would eliminate metastability that plagues flip-flop circuits in digital input/output interfaces. This metastability is associated with sampling, by use of flip-flops, of an external signal that is asynchronous with a clock signal that drives the flip-flops: it is a temporary flip-flop failure that can occur when a rising or falling edge of an asynchronous signal occurs during the setup and/or hold time of a flip-flop. The proposed design calls for (1) use of a clock frequency greater than the frequency of the asynchronous signal, (2) use of flip-flop asynchronous preset or clear signals for the asynchronous input, (3) use of a clock asynchronous recovery delay with pulse width discriminator, and (4) tying the data inputs to constant logic levels to obtain (5) two half-rate synchronous partial signals - one for the falling and one for the rising edge. Inasmuch as the flip-flop data inputs would be permanently tied to constant logic levels, setup and hold times would not be violated. The half-rate partial signals would be recombined to construct a signal that would replicate the original asynchronous signal at its original rate but would be synchronous with the clock signal.
Computational Particle Dynamic Simulations on Multicore Processors (CPDMu) Final Report Phase I
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schmalz, Mark S
2011-07-24
Statement of Problem - Department of Energy has many legacy codes for simulation of computational particle dynamics and computational fluid dynamics applications that are designed to run on sequential processors and are not easily parallelized. Emerging high-performance computing architectures employ massively parallel multicore architectures (e.g., graphics processing units) to increase throughput. Parallelization of legacy simulation codes is a high priority, to achieve compatibility, efficiency, accuracy, and extensibility. General Statement of Solution - A legacy simulation application designed for implementation on mainly-sequential processors has been represented as a graph $G$. Mathematical transformations, applied to $G$, produce a graph representation $\underline{G}$ for a high-performance architecture. Key computational and data movement kernels of the application were analyzed/optimized for parallel execution using the mapping $G \to \underline{G}$, which can be performed semi-automatically. This approach is widely applicable to many types of high-performance computing systems, such as graphics processing units or clusters comprised of nodes that contain one or more such units. Phase I Accomplishments - Phase I research decomposed/profiled computational particle dynamics simulation code for rocket fuel combustion into low and high computational cost regions (respectively, mainly sequential and mainly parallel kernels), with analysis of space and time complexity. Using the research team's expertise in algorithm-to-architecture mappings, the high-cost kernels were transformed, parallelized, and implemented on Nvidia Fermi GPUs. Measured speedups (GPU with respect to single-core CPU) were approximately 20-32X for realistic model parameters, without final optimization. Error analysis showed no loss of computational accuracy. Commercial Applications and Other Benefits - The proposed research will constitute a breakthrough in solution of problems related to efficient parallel computation of particle and fluid dynamics simulations. These problems occur throughout DOE, military and commercial sectors: the potential payoff is high. We plan to license or sell the solution to contractors for military and domestic applications such as disaster simulation (aerodynamic and hydrodynamic), Government agencies (hydrological and environmental simulations), and medical applications (e.g., in tomographic image reconstruction). Keywords - High-performance Computing, Graphic Processing Unit, Fluid/Particle Simulation. Summary for Members of Congress - Department of Energy has many simulation codes that must compute faster, to be effective. The Phase I research parallelized particle/fluid simulations for rocket combustion, for high-performance computing systems.
1986-12-26
Naval Training Systems Center, Orlando, Florida. The Effects of Asynchronous Visual Delays on Simulator Flight Performance and the Development of Simulator Sickness Symptomatology. K. C. Uliano, E. Y. Lambert, R. S. Kennedy...
Parallel grid library for rapid and flexible simulation development
NASA Astrophysics Data System (ADS)
Honkonen, I.; von Alfthan, S.; Sandroos, A.; Janhunen, P.; Palmroth, M.
2013-04-01
We present an easy to use and flexible grid library for developing highly scalable parallel simulations. The distributed cartesian cell-refinable grid (dccrg) supports adaptive mesh refinement and allows an arbitrary C++ class to be used as cell data. The amount of data in grid cells can vary both in space and time allowing dccrg to be used in very different types of simulations, for example in fluid and particle codes. Dccrg transfers the data between neighboring cells on different processes transparently and asynchronously allowing one to overlap computation and communication. This enables excellent scalability at least up to 32 k cores in magnetohydrodynamic tests depending on the problem and hardware. In the version of dccrg presented here part of the mesh metadata is replicated between MPI processes reducing the scalability of adaptive mesh refinement (AMR) to between 200 and 600 processes. Dccrg is free software that anyone can use, study and modify and is available at https://gitorious.org/dccrg. Users are also kindly requested to cite this work when publishing results obtained with dccrg. Catalogue identifier: AEOM_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEOM_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: GNU Lesser General Public License version 3 No. of lines in distributed program, including test data, etc.: 54975 No. of bytes in distributed program, including test data, etc.: 974015 Distribution format: tar.gz Programming language: C++. Computer: PC, cluster, supercomputer. Operating system: POSIX. The code has been parallelized using MPI and tested with 1-32768 processes RAM: 10 MB-10 GB per process Classification: 4.12, 4.14, 6.5, 19.3, 19.10, 20. 
External routines: MPI-2 [1], boost [2], Zoltan [3], sfc++ [4] Nature of problem: Grid library supporting arbitrary data in grid cells, parallel adaptive mesh refinement, transparent remote neighbor data updates and load balancing. Solution method: The simulation grid is represented by an adjacency list (graph) with vertices stored into a hash table and edges into contiguous arrays. Message Passing Interface standard is used for parallelization. Cell data is given as a template parameter when instantiating the grid. Restrictions: Logically cartesian grid. Running time: Running time depends on the hardware, problem and the solution method. Small problems can be solved in under a minute and very large problems can take weeks. The examples and tests provided with the package take less than about one minute using default options. In the version of dccrg presented here the speed of adaptive mesh refinement is at most of the order of 10^6 total created cells per second. References: [1] http://www.mpi-forum.org/. [2] http://www.boost.org/. [3] K. Devine, E. Boman, R. Heaphy, B. Hendrickson, C. Vaughan, Zoltan data management services for parallel dynamic applications, Comput. Sci. Eng. 4 (2002) 90-97. http://dx.doi.org/10.1109/5992.988653. [4] https://gitorious.org/sfc++.
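The storage layout named under "Solution method" (vertices in a hash table, edges in contiguous arrays) can be sketched as follows. This is a Python illustration of the general data structure only; it is not the dccrg C++ API, and `build_grid_graph` and `neighbors_of` are hypothetical names.

```python
def build_grid_graph(neighbors_by_cell):
    """Store cells in a hash table and all edges in one contiguous list.

    index maps a cell id to (offset, count) into the flat edge list,
    so a neighbor lookup is one hash probe plus one slice.
    """
    index = {}
    edges = []
    for cell, neighbors in neighbors_by_cell.items():
        index[cell] = (len(edges), len(neighbors))
        edges.extend(neighbors)
    return index, edges


def neighbors_of(index, edges, cell):
    """Return the neighbor ids of a cell from the flat edge list."""
    start, count = index[cell]
    return edges[start:start + count]
```

Because cell ids are hash keys rather than array positions, the ids need not be dense or ordered, which suits a grid whose cells are created and removed by refinement.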
Configuration of twins in glass-embedded silver nanoparticles of various origin
NASA Astrophysics Data System (ADS)
Hofmeister, H.; Dubiel, M.; Tan, G. L.; Schicke, K.-D.
2005-09-01
Structural characterization using high resolution electron microscopy and diffractogram analysis of silver nanoparticles embedded in glass by various routes of fabrication was aimed at revealing the characteristic features of twin faults occurring in such particles. Nearly spherical silver nanoparticles well below 10 nm size embedded in commercial soda-lime silicate float glass have been fabricated either by silver/sodium ion exchange or by Ag+ ion implantation. Twinned nanoparticles, besides single crystalline species, have frequently been observed for both fabrication routes, mainly at sizes above 5 nm, but also at smaller sizes, even around 1 nm. The variety of particle forms comprises single crystalline particles of nearly cuboctahedron shape, particles containing single twin faults, and multiply twinned particles containing parallel twin lamellae, or cyclic twinned segments arranged around axes of fivefold symmetry. Parallel twinning is distinctly favoured by ion implantation whereas cyclic twinning preferably occurs upon ion exchange processing. Regardless of single or repeated twinning, parallel or cyclic twin arrangement, one may classify simple twin faults of regular atomic configuration and compound twin faults whose irregular configuration consists of additional planar defects like associated stacking faults or secondary twin faults. Besides, a particular superstructure composed of parallel twin lamellae of only three atomic layers thickness is observed.
Turbulence dissipation challenge: particle-in-cell simulations
NASA Astrophysics Data System (ADS)
Roytershteyn, V.; Karimabadi, H.; Omelchenko, Y.; Germaschewski, K.
2015-12-01
We discuss application of three particle in cell (PIC) codes to the problems relevant to turbulence dissipation challenge. VPIC is a fully kinetic code extensively used to study a variety of diverse problems ranging from laboratory plasmas to astrophysics. PSC is a flexible fully kinetic code offering a variety of algorithms that can be advantageous to turbulence simulations, including high order particle shapes, dynamic load balancing, and ability to efficiently run on Graphics Processing Units (GPUs). Finally, HYPERS is a novel hybrid (kinetic ions+fluid electrons) code, which utilizes asynchronous time advance and a number of other advanced algorithms. We present examples drawn both from large-scale turbulence simulations and from the test problems outlined by the turbulence dissipation challenge. Special attention is paid to such issues as the small-scale intermittency of inertial range turbulence, mode content of the sub-proton range of scales, the formation of electron-scale current sheets and the role of magnetic reconnection, as well as numerical challenges of applying PIC codes to simulations of astrophysical turbulence.
Sliding states of a soft-colloid cluster crystal: Cluster versus single-particle hopping
NASA Astrophysics Data System (ADS)
Rossini, Mirko; Consonni, Lorenzo; Stenco, Andrea; Reatto, Luciano; Manini, Nicola
2018-05-01
We study a two-dimensional model for interacting colloidal particles which displays spontaneous clustering. Within this model we investigate the competition between the pinning to a periodic corrugation potential and a sideways constant pulling force which would promote a sliding state. For a few sample particle densities and amplitudes of the periodic corrugation potential we investigate the depinning from the statically pinned to the dynamically sliding regime. This sliding state exhibits the competition between a dynamics where entire clusters are pulled from a minimum to the next and a dynamics where single colloids or smaller groups leave a cluster and move across the corrugation energy barrier to join the next cluster downstream in the force direction. Both kinds of sliding states can occur either coherently across the entire sample or asynchronously: the two regimes result in different average mobilities. Finite temperature tends to destroy separate sliding regimes, generating a smoother dependence of the mobility on the driving force.
Modeling and Analysis of Mixed Synchronous/Asynchronous Systems
NASA Technical Reports Server (NTRS)
Driscoll, Kevin R.; Madl, Gabor; Hall, Brendan
2012-01-01
Practical safety-critical distributed systems must integrate safety critical and non-critical data in a common platform. Safety critical systems almost always consist of isochronous components that have synchronous or asynchronous interfaces with other components. Many of these systems also support a mix of synchronous and asynchronous interfaces. This report presents a study on the modeling and analysis of asynchronous, synchronous, and mixed synchronous/asynchronous systems. We build on the SAE Architecture Analysis and Design Language (AADL) to capture architectures for analysis. We present preliminary work targeted to capture mixed low- and high-criticality data, as well as real-time properties in a common Model of Computation (MoC). An abstract, but representative, test specimen system was created as the system to be modeled.
A novel comparator featured with input data characteristic
NASA Astrophysics Data System (ADS)
Jiang, Xiaobo; Ye, Desheng; Xu, Xiangmin; Zheng, Shuai
2016-03-01
Two types of low-power asynchronous comparators that exploit the statistical characteristics of the input data are proposed in this article. The asynchronous ripple comparator stops comparing at the first unequal bit but delivers the result to the least significant bit. The pre-stop asynchronous comparator can completely stop comparing and obtain results immediately. The proposed comparators and the comparators used for contrast were implemented in the SMIC 0.18 μm process with different bit widths. Simulation shows that the proposed pre-stop asynchronous comparator features the lowest power consumption, shortest average propagation delay and highest area efficiency among the comparators. The data path of a low-density parity check decoder using the proposed pre-stop asynchronous comparators is the most power efficient compared with other data paths using synthesised, clock gating and bitwise competition logic comparators.
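The early-stopping behaviour that these comparators rely on can be shown in software. The sketch below is a behavioural model only, not the circuit from the article: it compares two unsigned values from the most significant bit downward and reports how many bit positions were examined before the result was known. The function name `compare_msb_first` is hypothetical.

```python
def compare_msb_first(a, b, width):
    """Compare two unsigned integers bit by bit, most significant first.

    Returns (result, examined): result is 1 if a > b, -1 if a < b,
    0 if equal; examined is the number of bit positions inspected
    before the comparison could stop.
    """
    for examined, bit in enumerate(range(width - 1, -1, -1), start=1):
        bit_a = (a >> bit) & 1
        bit_b = (b >> bit) & 1
        if bit_a != bit_b:
            # The first unequal bit decides the outcome; lower bits are skipped.
            return (1 if bit_a else -1), examined
    return 0, width
```

For inputs that usually differ in their high-order bits, most comparisons finish after examining only a few positions, which is the statistical property a data-dependent comparator can exploit to save switching activity.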
Multiple grid problems on concurrent-processing computers
NASA Technical Reports Server (NTRS)
Eberhardt, D. S.; Baganoff, D.
1986-01-01
Three computer codes were studied which make use of concurrent processing computer architectures in computational fluid dynamics (CFD). The three parallel codes were tested on a two processor multiple-instruction/multiple-data (MIMD) facility at NASA Ames Research Center, and are suggested for efficient parallel computations. The first code is a well-known program which makes use of the Beam and Warming, implicit, approximate factored algorithm. This study demonstrates the parallelism found in a well-known scheme and it achieved speedups exceeding 1.9 on the two processor MIMD test facility. The second code studied made use of an embedded grid scheme which is used to solve problems having complex geometries. The particular application for this study considered an airfoil/flap geometry in an incompressible flow. The scheme eliminates some of the inherent difficulties found in adapting approximate factorization techniques onto MIMD machines and allows the use of chaotic relaxation and asynchronous iteration techniques. The third code studied is an application of overset grids to a supersonic blunt body problem. The code addresses the difficulties encountered when using embedded grids on a compressible, and therefore nonlinear, problem. The complex numerical boundary system associated with overset grids is discussed and several boundary schemes are suggested. A boundary scheme based on the method of characteristics achieved the best results.
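The reported speedup of more than 1.9 on two processors can be put in context with two standard quantities. The sketch below computes parallel efficiency (speedup divided by processor count) and, as an added assumption that is not part of the abstract, the serial fraction that Amdahl's law would imply for a given speedup. Both function names are hypothetical.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / processors


def amdahl_serial_fraction(speedup, processors):
    """Serial fraction f implied by Amdahl's law for a measured speedup.

    From S = 1 / (f + (1 - f) / p) it follows that
    f = (p / S - 1) / (p - 1).
    """
    return (processors / speedup - 1.0) / (processors - 1.0)
```

For a speedup of 1.9 on 2 processors, the efficiency is 0.95 and the implied serial fraction is roughly 5%, assuming Amdahl's model applies to that code.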
DOE Office of Scientific and Technical Information (OSTI.GOV)
Amadio, G.; et al.
An intensive R&D and programming effort is required to accomplish new challenges posed by future experimental high-energy particle physics (HEP) programs. The GeantV project aims to narrow the gap between the performance of the existing HEP detector simulation software and the ideal performance achievable, exploiting the latest advances in computing technology. The project has developed a particle detector simulation prototype capable of transporting particles in parallel through complex geometries, exploiting instruction level microparallelism (SIMD and SIMT), task-level parallelism (multithreading) and high-level parallelism (MPI), leveraging both the multi-core and the many-core opportunities. We present preliminary verification results concerning the electromagnetic (EM) physics models developed for parallel computing architectures within the GeantV project. In order to exploit the potential of vectorization and accelerators and to make the physics model effectively parallelizable, advanced sampling techniques have been implemented and tested. In this paper we introduce a set of automated statistical tests in order to verify the vectorized models by checking their consistency with the corresponding Geant4 models and to validate them against experimental data.
Dust Dynamics in Protoplanetary Disks: Parallel Computing with PVM
NASA Astrophysics Data System (ADS)
de La Fuente Marcos, Carlos; Barge, Pierre; de La Fuente Marcos, Raúl
2002-03-01
We describe a parallel version of our high-order-accuracy particle-mesh code for the simulation of collisionless protoplanetary disks. We use this code to carry out a massively parallel, two-dimensional, time-dependent, numerical simulation, which includes dust particles, to study the potential role of large-scale, gaseous vortices in protoplanetary disks. This noncollisional problem is easy to parallelize on message-passing multicomputer architectures. We performed the simulations on a cache-coherent nonuniform memory access Origin 2000 machine, using both the parallel virtual machine (PVM) and message-passing interface (MPI) message-passing libraries. Our performance analysis suggests that, for our problem, PVM is about 25% faster than MPI. Using PVM and MPI made it possible to reduce CPU time and increase code performance. This allows for simulations with a large number of particles (N ~ 10^5-10^6) in reasonable CPU times. The performance of our implementation of the parallel code on an Origin 2000 supercomputer is presented and discussed. It exhibits very good speedup behavior and low load imbalance. Our results confirm that giant gaseous vortices can play a dominant role in giant planet formation.
Electroosmotic velocity in an array of parallel soft cylinders in a salt-free medium.
Ohshima, Hiroyuki
2004-11-15
A theory of electroosmosis in an array of parallel soft cylinders (i.e. polyelectrolyte-coated cylinders) in a salt-free medium is presented. It is shown that there is a certain critical value of the particle charge and that if the particle charge is greater than the critical value, then the electroosmotic velocity becomes constant independent of the particle charge due to the counterion condensation effects, as in the case of other electrokinetic phenomena in salt-free media.
Interactional Coherence in Asynchronous Learning Networks: A Rhetorical Approach
ERIC Educational Resources Information Center
Potter, Andrew
2008-01-01
Numerous studies have affirmed the value of asynchronous online communication as a learning resource. Several investigations, however, have indicated that discussions in asynchronous environments are often neither interactive nor coherent. The research reported sought to develop an enhanced understanding of interactional coherence, argumentation,…
Kinetic theory of turbulence for parallel propagation revisited: Formal results
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoon, Peter H., E-mail: yoonp@umd.edu
2015-08-15
In a recent paper, Gaelzer et al. [Phys. Plasmas 22, 032310 (2015)] revisited the second-order nonlinear kinetic theory for turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field. The original theory was due to Yoon and Fang [Phys. Plasmas 15, 122312 (2008)], but Gaelzer et al. noted that the terms pertaining to discrete-particle effects in Yoon and Fang's theory did not have proper dimensionality. The purpose of Gaelzer et al. was to restore the dimensional consistency associated with such terms. However, Gaelzer et al. were concerned only with linear wave-particle interaction terms. The present paper completes the analysis by considering the dimensional correction to nonlinear wave-particle interaction terms in the wave kinetic equation.
Exact simulation of polarized light reflectance by particle deposits
NASA Astrophysics Data System (ADS)
Ramezan Pour, B.; Mackowski, D. W.
2015-12-01
The use of polarimetric light reflection measurements as a means of identifying the physical and chemical characteristics of particulate materials obviously relies on an accurate model of predicting the effects of particle size, shape, concentration, and refractive index on polarized reflection. The research examines two methods for prediction of reflection from plane parallel layers of wavelength-sized particles. The first method is based on an exact superposition solution to Maxwell's time harmonic wave equations for a deposit of spherical particles that are exposed to a plane incident wave. We use a FORTRAN-90 implementation of this solution (the Multiple Sphere T Matrix (MSTM) code), coupled with parallel computational platforms, to directly simulate the reflection from particle layers. The second method examined is based upon the vector radiative transport equation (RTE). Mie theory is used in our RTE model to predict the extinction coefficient, albedo, and scattering phase function of the particles, and the solution of the RTE is obtained from the adding-doubling method applied to a plane-parallel configuration. Our results show that the MSTM and RTE predictions of the Mueller matrix elements converge when particle volume fraction in the particle layer decreases below around five percent. At higher volume fractions the RTE can yield results that, depending on the particle size and refractive index, significantly depart from the exact predictions. The particle regimes which lead to dependent scattering effects, and the application of methods to correct the vector RTE for particle interaction, will be discussed.
Labeled Postings for Asynchronous Interaction
ERIC Educational Resources Information Center
ChanLin, Lih-Juan; Chen, Yong-Ting; Chan, Kung-Chi
2009-01-01
The Internet promotes computer-mediated communications, and so asynchronous learning network systems permit more flexibility in time, space, and interaction than synchronous modes of learning. The key point of asynchronous learning is the materials for web-aided teaching and the flow of knowledge. This research focuses on improving online…
An Asynchronous Augmentation to Traditional Course Delivery.
ERIC Educational Resources Information Center
Wolverton, Marvin L.; Wolverton, Mimi
Asynchronous augmentation facilitates distributed learning, which relies heavily on technology and self-learning. This paper reports the results of delivering a real estate principles course using an asynchronous course delivery format. It highlights one of many ways to enhance learning using technology, and it provides information concerning how…
A Taxonomy of Learning through Asynchronous Discussion
ERIC Educational Resources Information Center
Knowlton, Dave S.
2005-01-01
This article presents a five-tiered taxonomy that describes the nature of participation in, and learning through, asynchronous discussion. The taxonomy is framed by a constructivist view of asynchronous discussion. The five tiers of the taxonomy include the following: (a) passive participation, (b) developmental participation, (c) generative…
ERIC Educational Resources Information Center
Gao, Fei; Zhang, Tianyi; Franklin, Teresa
2013-01-01
Asynchronous online discussion environments are important platforms to support learning. Research suggests, however, threaded forums, one of the most popular asynchronous discussion environments, do not often foster productive online discussions naturally. This paper explores how certain properties of threaded forums have affected or constrained…
Integrating Asynchronous Digital Design Into the Computer Engineering Curriculum
ERIC Educational Resources Information Center
Smith, S. C.; Al-Assadi, W. K.; Di, J.
2010-01-01
As demand increases for circuits with higher performance, higher complexity, and decreased feature size, asynchronous (clockless) paradigms will become more widely used in the semiconductor industry, as evidenced by the International Technology Roadmap for Semiconductors' (ITRS) prediction of a likely shift from synchronous to asynchronous design…
Xu, Jingxiang; Higuchi, Yuji; Ozawa, Nobuki; Sato, Kazuhisa; Hashida, Toshiyuki; Kubo, Momoji
2017-09-20
Ni sintering in the Ni/YSZ porous anode of a solid oxide fuel cell changes the porous structure, leading to degradation. Preventing sintering and degradation during operation is a great challenge. Usually, a sintering molecular dynamics (MD) simulation model consisting of two particles on a substrate is used; however, the model cannot reflect the porous structure effect on sintering. In our previous study, a multi-nanoparticle sintering modeling method with tens of thousands of atoms revealed the effect of the particle framework and porosity on sintering. However, the method cannot reveal the effect of the particle size on sintering and the effect of sintering on the change in the porous structure. In the present study, we report a strategy to reveal them in the porous structure by using our multi-nanoparticle modeling method and a parallel large-scale multimillion-atom MD simulator. We used this method to investigate the effect of YSZ particle size and tortuosity on sintering and degradation in the Ni/YSZ anodes. Our parallel large-scale MD simulation showed that the sintering degree decreased as the YSZ particle size decreased. The gas fuel diffusion path, which reflects the overpotential, was blocked by pore coalescence during sintering. The degradation of gas diffusion performance increased as the YSZ particle size increased. Furthermore, the gas diffusion performance was quantified by a tortuosity parameter and an optimal YSZ particle size, which is equal to that of Ni, was found for good diffusion after sintering. These findings cannot be obtained by previous MD sintering studies with tens of thousands of atoms. The present parallel large-scale multimillion-atom MD simulation makes it possible to clarify the effects of the particle size and tortuosity on sintering and degradation.
Engineered plant biomass particles coated with biological agents
Dooley, James H.; Lanning, David N.
2014-06-24
Plant biomass particles coated with a biological agent such as a bacterium or seed, characterized by a length dimension (L) aligned substantially parallel to a grain direction and defining a substantially uniform distance along the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L. In particular, the L × H dimensions define a pair of substantially parallel side surfaces characterized by substantially intact longitudinally arrayed fibers, the W × H dimensions define a pair of substantially parallel end surfaces characterized by crosscut fibers and end checking between fibers, and the L × W dimensions define a pair of substantially parallel top and bottom surfaces.
NASA Astrophysics Data System (ADS)
Lu, San; Artemyev, A. V.; Angelopoulos, V.
2017-11-01
Magnetotail current sheet thinning is a distinctive feature of substorm growth phase, during which magnetic energy is stored in the magnetospheric lobes. Investigation of charged particle dynamics in such thinning current sheets is believed to be important for understanding the substorm energy storage and the current sheet destabilization responsible for substorm expansion phase onset. We use Time History of Events and Macroscale Interactions during Substorms (THEMIS) B and C observations in 2008 and 2009 at 18 - 25 RE to show that during magnetotail current sheet thinning, the electron temperature decreases (cooling), and the parallel temperature decreases faster than the perpendicular temperature, leading to a decrease of the initially strong electron temperature anisotropy (isotropization). This isotropization cannot be explained by pure adiabatic cooling or by pitch angle scattering. We use test particle simulations to explore the mechanism responsible for the cooling and isotropization. We find that during the thinning, a fast decrease of a parallel electric field (directed toward the Earth) can speed up the electron parallel cooling, causing it to exceed the rate of perpendicular cooling, and thus lead to isotropization, consistent with observation. If the parallel electric field is too small or does not change fast enough, the electron parallel cooling is slower than the perpendicular cooling, so the parallel electron anisotropy grows, contrary to observation. The same isotropization can also be accomplished by an increasing parallel electric field directed toward the equatorial plane. Our study reveals the existence of a large-scale parallel electric field, which plays an important role in magnetotail particle dynamics during the current sheet thinning process.
Actively Engaging Students in Asynchronous Online Classes. IDEA Paper #64
ERIC Educational Resources Information Center
Riggs, Shannon A.; Linder, Kathryn E.
2016-01-01
Active learning activities and pedagogical strategies can look different in online learning environments, particularly in asynchronous courses when students are not interacting with the instructor, or with each other, in real time. This paper suggests a three-pronged approach for conceptualizing active learning in the online asynchronous class:…
Exploring Asynchronous and Synchronous Tool Use in Online Courses
ERIC Educational Resources Information Center
Oztok, Murat; Zingaro, Daniel; Brett, Clare; Hewitt, Jim
2013-01-01
While the independent contributions of synchronous and asynchronous interaction in online learning are clear, comparatively less is known about the pedagogical consequences of using both modes in the same environment. In this study, we examine relationships between students' use of asynchronous discussion forums and synchronous private messages…
Teaching Presence and Communication Timeliness in Asynchronous Online Courses
ERIC Educational Resources Information Center
Skramstad, Erik; Schlosser, Charles; Orellana, Anymir
2012-01-01
This study examined student perceptions of teaching presence and communication timeliness in asynchronous online courses. Garrison, Anderson, and Archer's (2000) community of inquiry model provided the framework for the survey research methodology used. Participants were 59 student volunteers taking 1 or more asynchronous online graduate courses.…
Two Studies Examining Argumentation in Asynchronous Computer Mediated Communication
ERIC Educational Resources Information Center
Joiner, Richard; Jones, Sarah; Doherty, John
2008-01-01
Asynchronous computer mediated communication (CMC) would seem to be an ideal medium for supporting development in student argumentation. This paper investigates this assumption through two studies. The first study compared asynchronous CMC with face-to-face discussions. The transactional and strategic level of the argumentation (i.e. measures of…
Using Television Sitcoms to Facilitate Asynchronous Discussions in the Online Communication Course
ERIC Educational Resources Information Center
Tolman, Elizabeth; Asbury, Bryan
2012-01-01
Asynchronous discussions are a useful instructional resource in the online communication course. In discussion groups students have the opportunity to actively participate and interact with students and the instructor. Asynchronous communication allows for flexibility because "participants can interact with significant amounts of time between…
Asynchronous Discussion Board Facilitation and Rubric Use in a Blended Learning Environment
ERIC Educational Resources Information Center
Giacumo, Lisa
2012-01-01
The purpose of this study was to investigate the effects of instructor response prompts and rubrics on students' performance in an asynchronous discussion-board assignment, their learning achievement on an objective-type posttest, and their reported satisfaction levels. Researchers who have studied asynchronous computer-mediated student…
Designing Asynchronous, Text-Based Computer Conferences: Ten Research-Based Suggestions
ERIC Educational Resources Information Center
Choitz, Paul; Lee, Doris
2006-01-01
Asynchronous computer conferencing refers to the use of computer software and a network enabling participants to post messages that allow discourse to continue even though interactions may be extended over days and weeks. Asynchronous conferences are time-independent, adapting to multiple time zones and learner schedules. Such activities as…
Asynchronous Learning Forums for Business Acculturation
ERIC Educational Resources Information Center
Pence, Christine Cope; Wulf, Catharina
2009-01-01
The use of IT as a facilitator for student collaboration in higher business education has grown rapidly since 2000. Asynchronous discussion forums are used abundantly for collaborative training purposes and for teaching students business-relevant tools for their future careers. This article presents an analysis of the asynchronous discussion forum…
Kunin, Marc; Julliard, Kell N; Rodriguez, Tobias E
2014-06-01
The Department of Dental Medicine of Lutheran Medical Center has developed an asynchronous online curriculum consisting of prerecorded PowerPoint presentations with audio explanations. The focus of this study was to evaluate if the new asynchronous format satisfied the educational needs of the residents compared to traditional lecture (face-to-face) and synchronous (distance learning) formats. Lectures were delivered to 219 dental residents employing face-to-face and synchronous formats, as well as the new asynchronous format; 169 (77 percent) participated in the study. Outcomes were assessed with pretests, posttests, and individual lecture surveys. Results found the residents preferred face-to-face and asynchronous formats to the synchronous format in terms of effectiveness and clarity of presentations. This preference was directly related to the residents' perception of how well the technology worked in each format. The residents also rated the quality of student-instructor and student-student interactions in the synchronous and asynchronous formats significantly higher after taking the lecture series than they did before taking it. However, they rated the face-to-face format as significantly more conducive to student-instructor and student-student interaction. While the study found technology had a major impact on the efficacy of this curricular model, the results suggest that the asynchronous format can be an effective way to teach a postgraduate course.
Asynchronous glimpsing of speech: Spread of masking and task set-size
Ozmeral, Erol J.; Buss, Emily; Hall, Joseph W.
2012-01-01
Howard-Jones and Rosen [(1993). J. Acoust. Soc. Am. 93, 2915–2922] investigated the ability to integrate glimpses of speech that are separated in time and frequency using a “checkerboard” masker, with asynchronous amplitude modulation (AM) across frequency. Asynchronous glimpsing was demonstrated only for spectrally wide frequency bands. It is possible that the reduced evidence of spectro-temporal integration with narrower bands was due to spread of masking at the periphery. The present study tested this hypothesis with a dichotic condition, in which the even- and odd-numbered bands of the target speech and asynchronous AM masker were presented to opposite ears, minimizing the deleterious effects of masking spread. For closed-set consonant recognition, thresholds were 5.1–8.5 dB better for dichotic than for monotic asynchronous AM conditions. Results were similar for closed-set word recognition, but for open-set word recognition the benefit of dichotic presentation was more modest and level dependent, consistent with the effects of spread of masking being level dependent. There was greater evidence of asynchronous glimpsing in the open-set than closed-set tasks. Presenting stimuli dichotically supported asynchronous glimpsing with narrower frequency bands than previously shown, though the magnitude of glimpsing was reduced for narrower bandwidths even in some dichotic conditions. PMID:22894234
Asynchronous vs didactic education: it's too early to throw in the towel on tradition.
Jordan, Jaime; Jalali, Azadeh; Clarke, Samuel; Dyne, Pamela; Spector, Tahlia; Coates, Wendy
2013-08-08
Asynchronous, computer based instruction is cost effective, allows self-directed pacing and review, and addresses preferences of millennial learners. Current research suggests there is no significant difference in learning compared to traditional classroom instruction. Data are limited for novice learners in emergency medicine. The objective of this study was to compare asynchronous, computer-based instruction with traditional didactics for senior medical students during a week-long intensive course in acute care. We hypothesized both modalities would be equivalent. This was a prospective observational quasi-experimental study of 4th year medical students who were novice learners with minimal prior exposure to curricular elements. We assessed baseline knowledge with an objective pre-test. The curriculum was delivered in either traditional lecture format (shock, acute abdomen, dyspnea, field trauma) or via asynchronous, computer-based modules (chest pain, EKG interpretation, pain management, trauma). An interactive review covering all topics was followed by a post-test. Knowledge retention was measured after 10 weeks. Pre and post-test items were written by a panel of medical educators and validated with a reference group of learners. Mean scores were analyzed using dependent t-test and attitudes were assessed by a 5-point Likert scale. 44 of 48 students completed the protocol. Students initially acquired more knowledge from didactic education as demonstrated by mean gain scores (didactic: 28.39% ± 18.06; asynchronous 9.93% ± 23.22). Mean difference between didactic and asynchronous = 18.45% with 95% CI [10.40 to 26.50]; p = 0.0001. Retention testing demonstrated similar knowledge attrition: mean gain scores -14.94% (didactic); -17.61% (asynchronous), which was not significantly different: 2.68% ± 20.85, 95% CI [-3.66 to 9.02], p = 0.399. 
The attitudinal survey revealed that 60.4% of students believed the asynchronous modules were educational and 95.8% enjoyed the flexibility of the method. 39.6% of students preferred asynchronous education for required didactics; 37.5% were neutral; 23% preferred traditional lectures. Asynchronous, computer-based instruction was not equivalent to traditional didactics for novice learners of acute care topics. Interactive, standard didactic education was valuable. Retention rates were similar between instructional methods. Students had mixed attitudes toward asynchronous learning but enjoyed the flexibility. We urge caution in trading in traditional didactic lectures in favor of asynchronous education for novice learners in acute care.
Dynamic Load Balancing Based on Constrained K-D Tree Decomposition for Parallel Particle Tracing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Jiang; Guo, Hanqi; Yuan, Xiaoru
Particle tracing is a fundamental technique in flow field data visualization. In this work, we present a novel dynamic load balancing method for parallel particle tracing. Specifically, we employ a constrained k-d tree decomposition approach to dynamically redistribute tasks among processes. Each process is initially assigned a regularly partitioned block along with a duplicated ghost layer under the memory limit. During particle tracing, the k-d tree decomposition is dynamically performed by constraining the cutting planes to the overlap range of the duplicated data. This ensures that particles are reassigned to processes as evenly as possible while the newly assigned particles for a process are always located in its block. Results show good load balance and high efficiency of our method.
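The constrained cut described in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the function names `constrained_split` and `partition`, the sample coordinates, and the overlap bounds are all hypothetical, and a real k-d tree would repeat this step recursively along alternating axes. The idea shown is only that the balancing cut (here the median particle coordinate) is clamped to the range of data duplicated between two neighbouring blocks.

```python
from statistics import median


def constrained_split(positions, overlap_lo, overlap_hi):
    """Return a cutting coordinate: the median of the particle
    coordinates, clamped to the overlap (ghost) range [overlap_lo, overlap_hi]."""
    if overlap_lo > overlap_hi:
        raise ValueError("overlap range is empty")
    cut = median(positions)
    # Clamp so the cut never leaves the region both processes hold in memory.
    return min(max(cut, overlap_lo), overlap_hi)


def partition(positions, cut):
    """Split particles into those left of the cut and those at or right of it."""
    left = [p for p in positions if p < cut]
    right = [p for p in positions if p >= cut]
    return left, right


# Hypothetical particle coordinates along one axis; the two blocks are
# assumed to share duplicated data on the interval [0.8, 1.2].
particles = [0.1, 0.2, 0.25, 0.3, 0.9, 1.4, 1.6]
cut = constrained_split(particles, overlap_lo=0.8, overlap_hi=1.2)
left, right = partition(particles, cut)
print(cut, len(left), len(right))
```

In this example the unconstrained median (0.3) lies outside the overlap range, so the cut is moved to the nearest allowed position and the split is as even as the constraint permits.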
Downstream energetic proton and alpha particles during quasi-parallel interplanetary shock events
NASA Technical Reports Server (NTRS)
Tan, L. C.; Mason, G. M.; Gloeckler, G.; Ipavich, F. M.
1988-01-01
This paper considers the energetic particle populations in the downstream region of three quasi-parallel interplanetary shock events, explored using the ISEE 3 Ultra Low Energy Charge Analyzer sensor, which unambiguously identifies protons and alpha particles using the electrostatic deflection versus residual energy technique. The downstream particles were found to exhibit anisotropies due largely to convection in the solar wind. The spectral indices of the proton and the alpha-particle distribution functions were found to be remarkably constant during the downstream period, being generally insensitive to changes in particle flux levels, magnetic field direction, and solar wind densities. In two of the three events, the proton and the alpha spectra were the same throughout the entire downstream period, supporting the prediction of diffusive shock acceleration theory.
The 2nd Symposium on the Frontiers of Massively Parallel Computations
NASA Technical Reports Server (NTRS)
Mills, Ronnie (Editor)
1988-01-01
Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.
SU-F-SPS-09: Parallel MC Kernel Calculations for VMAT Plan Improvement
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chamberlain, S; Roswell Park Cancer Institute, Buffalo, NY; French, S
Purpose: Adding kernels (small perturbations in leaf positions) to the existing apertures of VMAT control points may improve plan quality. We investigate the calculation of kernel doses using a parallelized Monte Carlo (MC) method. Methods: A clinical prostate VMAT DICOM plan was exported from Eclipse. An arbitrary control point and leaf were chosen, and a modified MLC file was created, corresponding to the leaf position offset by 0.5 cm. The additional dose produced by this 0.5 cm × 0.5 cm kernel was calculated using the DOSXYZnrc component module of BEAMnrc. A range of particle history counts were run (varying from 3 × 10^6 to 3 × 10^7); each job was split among 1, 10, or 100 parallel processes. A particle count of 3 × 10^6 was established as the lower range because it provided the minimal accuracy level. Results: As expected, an increase in particle counts linearly increases run time. For the lowest particle count, the time varied from 30 hours for the single-processor run, to 0.30 hours for the 100-processor run. Conclusion: Parallel processing of MC calculations in the EGS framework significantly decreases time necessary for each kernel dose calculation. Particle counts lower than 1 × 10^6 have too large of an error to output accurate dose for a Monte Carlo kernel calculation. Future work will investigate increasing the number of parallel processes and optimizing run times for multiple kernel calculations.
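The timings quoted in this abstract can be turned into a parallel speedup and efficiency with a short calculation. The sketch below uses only the two run times reported for the lowest particle count (30 hours on 1 process, 0.30 hours on 100 processes); the function names are hypothetical. The last line adds one general property of Monte Carlo estimates that the abstract does not state, namely that statistical uncertainty scales roughly as one over the square root of the number of histories.

```python
import math


def speedup(t_serial, t_parallel):
    """Ratio of single-process run time to parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Speedup divided by the number of processes (1.0 is ideal)."""
    return speedup(t_serial, t_parallel) / n_procs


# Run times reported in the abstract for the lowest particle count.
s = speedup(30.0, 0.30)
e = efficiency(30.0, 0.30, 100)

# General Monte Carlo property (assumption, not from the abstract):
# going from 3 million to 30 million histories shrinks the statistical
# uncertainty by about sqrt(10).
noise_ratio = math.sqrt(3e7 / 3e6)

print(f"speedup ~ {s:.0f}x, efficiency ~ {e:.2f}, noise ratio ~ {noise_ratio:.2f}")
```

On these figures the 100-process run is close to ideal scaling, which is what one would expect when independent particle histories are split across processes with little communication.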
Observations of the larval stages of Diceroprocta apache Davis (Homoptera: Tibicinidae)
Ellingson, A.R.; Andersen, D.C.; Kondratieff, B.C.
2002-01-01
Diceroprocta apache Davis is a locally abundant cicada in the riparian woodlands of the southwestern United States. While its ecological importance has often been hypothesized, very little is known of its specific life history. This paper presents preliminary information on life history of D. apache from larvae collected in the field at seasonal intervals as well as a smaller number of reared specimens. Morphological development of the fore-femoral comb closely parallels growth through distinct size classes. The data indicate the presence of five larval instars in D. apache. Development times from greenhouse-reared specimens suggest a 3-4 year life span and overlapping broods were present in the field. Sex ratios among pre-emergent larvae suggest the asynchronous emergence of sexes.
Method and apparatus for offloading compute resources to a flash co-processing appliance
Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing -bung
2015-10-13
Solid-State Drive (SSD) burst buffer nodes are interposed into a parallel supercomputing cluster to enable fast burst checkpoint of cluster memory to or from nearby interconnected solid-state storage with asynchronous migration between the burst buffer nodes and slower more distant disk storage. The SSD nodes also perform tasks offloaded from the compute nodes or associated with the checkpoint data. For example, the data for the next job is preloaded in the SSD node and uploaded very quickly to the respective compute node just before the next job starts. During a job, the SSD nodes perform fast visualization and statistical analysis upon the checkpoint data. The SSD nodes can also perform data reduction and encryption of the checkpoint data.
Bonsai: an event-based framework for processing and controlling data streams
Lopes, Gonçalo; Bonacchi, Niccolò; Frazão, João; Neto, Joana P.; Atallah, Bassam V.; Soares, Sofia; Moreira, Luís; Matias, Sara; Itskov, Pavel M.; Correia, Patrícia A.; Medina, Roberto E.; Calcaterra, Lorenza; Dreosti, Elena; Paton, Joseph J.; Kampff, Adam R.
2015-01-01
The design of modern scientific experiments requires the control and monitoring of many different data streams. However, the serial execution of programming instructions in a computer makes it a challenge to develop software that can deal with the asynchronous, parallel nature of scientific data. Here we present Bonsai, a modular, high-performance, open-source visual programming framework for the acquisition and online processing of data streams. We describe Bonsai's core principles and architecture and demonstrate how it allows for the rapid and flexible prototyping of integrated experimental designs in neuroscience. We specifically highlight some applications that require the combination of many different hardware and software components, including video tracking of behavior, electrophysiology and closed-loop control of stimulation. PMID:25904861
FleCSPH - a parallel and distributed SPH implementation based on the FleCSI framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Junghans, Christoph; Loiseau, Julien
2017-06-20
FleCSPH is a multi-physics compact application that exercises FleCSI parallel data structures for tree-based particle methods. In particular, FleCSPH implements a smoothed-particle hydrodynamics (SPH) solver for the solution of Lagrangian problems in astrophysics and cosmology. FleCSPH includes support for gravitational forces using the fast multipole method (FMM).
Elliptically polarizing adjustable phase insertion device
Carr, Roger
1995-01-01
An insertion device for extracting polarized electromagnetic energy from a beam of particles is disclosed. The insertion device includes four linear arrays of magnets which are aligned with the particle beam. The magnetic field strength to which the particles are subjected is adjusted by altering the relative alignment of the arrays in a direction parallel to that of the particle beam. Both the energy and polarization of the extracted energy may be varied by moving the relevant arrays parallel to the beam direction. The present invention requires a substantially simpler and more economical superstructure than insertion devices in which the magnetic field strength is altered by changing the gap between arrays of magnets.
Means and method for the focusing and acceleration of parallel beams of charged particles
Maschke, Alfred W.
1983-07-05
A novel apparatus and method for focussing beams of charged particles comprising planar arrays of electrostatic quadrupoles. The quadrupole arrays may comprise electrodes which are shared by two or more quadrupoles. Such quadrupole arrays are particularly adapted to providing strong focussing forces for high current, high brightness, beams of charged particles, said beams further comprising a plurality of parallel beams, or beamlets, each such beamlet being focussed by one quadrupole of the array. Such arrays may be incorporated in various devices wherein beams of charged particles are accelerated or transported, such as linear accelerators, klystron tubes, beam transport lines, etc.
Preliminary design for a standard 10^7 bit Solid State Memory (SSM)
NASA Technical Reports Server (NTRS)
Hayes, P. J.; Howle, W. M., Jr.; Stermer, R. L., Jr.
1978-01-01
A modular concept with three separate modules roughly separating bubble domain technology, control logic technology, and power supply technology was employed. These modules were respectively the standard memory module (SMM), the data control unit (DCU), and the power supply module (PSM). The storage medium was provided by bubble domain chips organized into memory cells. These cells and the circuitry for parallel data access to the cells make up the SMM. The DCU provides a flexible serial data interface to the SMM. The PSM provides adequate power to enable one DCU and one SMM to operate simultaneously at the maximum data rate. The SSM was designed to handle asynchronous data rates from dc to 1.024 Mb/s with a bit error rate of less than 1 error in 10^8 bits. Two versions of the SSM, a serial data memory and a dual parallel data memory, were specified using the standard modules. The SSM specification includes requirements for radiation hardness, temperature and mechanical environments, dc magnetic field emission and susceptibility, electromagnetic compatibility, and reliability.
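The data-rate and error-rate figures in this abstract imply a few simple timing numbers, worked out in the sketch below. It assumes the quoted maximum rate means 1.024 million bits per second and that the memory capacity is the ten million bits given in the record title; the variable names are illustrative only.

```python
# Assumed reading of the abstract's figures (see lead-in).
DATA_RATE_BPS = 1.024e6   # maximum data rate, bits per second
CAPACITY_BITS = 10**7     # memory capacity from the record title
MAX_BER = 1e-8            # upper bound: 1 error in 100 million bits

# Time to write (or read) the whole memory once at the maximum rate.
fill_time_s = CAPACITY_BITS / DATA_RATE_BPS

# Expected number of bit errors in one full pass, at the BER bound.
errors_per_fill = CAPACITY_BITS * MAX_BER

# Mean time between bit errors at the maximum rate, at the BER bound.
seconds_per_error = 1 / (DATA_RATE_BPS * MAX_BER)

print(f"fill time ~ {fill_time_s:.2f} s")
print(f"expected errors per full pass < {errors_per_fill:.1f}")
print(f"mean time between errors > {seconds_per_error:.1f} s")
```

Under these assumptions a full pass takes just under ten seconds, and the specified error bound corresponds to fewer than one bit error per ten full passes on average.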
NASA Astrophysics Data System (ADS)
Kumari, Komal; Donzis, Diego
2017-11-01
Highly resolved computational simulations on massively parallel machines are critical in understanding the physics of a vast number of complex phenomena in nature governed by partial differential equations. Simulations at extreme levels of parallelism present many challenges, with communication between processing elements (PEs) being a major bottleneck. In order to fully exploit the computational power of exascale machines one needs to devise numerical schemes that relax global synchronizations across PEs. These asynchronous computations, however, have a degrading effect on the accuracy of standard numerical schemes. We have developed asynchrony-tolerant (AT) schemes that maintain order of accuracy despite relaxed communications. We show, analytically and numerically, that these schemes retain their numerical properties with multi-step higher order temporal Runge-Kutta schemes. We also show that for a range of optimized parameters, the computation time and error for AT schemes are less than those of their synchronous counterparts. The stability of the AT schemes, which depends upon the history and random nature of the delays, is also discussed. Support from NSF is gratefully acknowledged.
Material Implementation of Hyperincursive Field on Slime Mold Computer
NASA Astrophysics Data System (ADS)
Aono, Masashi; Gunji, Yukio-Pegio
2004-08-01
"Elementary Conflictable Cellular Automaton (ECCA)" was introduced by Aono and Gunji as a problematic computational syntax embracing the non-deterministic/non-algorithmic property due to its hyperincursivity and nonlocality. Although ECCA's hyperincursive evolution equation indicates the occurrence of the deadlock/infinite-loop, we do not consider that this problem declares the fundamental impossibility of implementing ECCA materially. Dubois proposed to call a computing system where uncertainty/contradiction occurs "the hyperincursive field". In this paper we introduce a material implementation of the hyperincursive field by using plasmodia of the true slime mold Physarum polycephalum. The amoeboid organism is adopted as the computing medium of the ECCA slime mold computer (ECCA-SMC) mainly because it is a parallel non-distributed system whose locally branched tips (components) can act in parallel with asynchronism and nonlocal correlation. A notable characteristic of ECCA-SMC is that a cell representing a spatio-temporal segment of computation is occupied (overlapped) redundantly by multiple spatially adjacent computing operations and by temporally successive computing events. The overlapped time representation may contribute to the progression of discussions on unconventional notions of time.
Engineered plant biomass feedstock particles
Dooley, James H [Federal Way, WA; Lanning, David N [Federal Way, WA; Broderick, Thomas F [Lake Forest Park, WA
2011-10-11
A novel class of flowable biomass feedstock particles with unusually large surface areas that can be manufactured in remarkably uniform sizes using low-energy comminution techniques. The feedstock particles are roughly parallelepiped in shape and characterized by a length dimension (L) aligned substantially with the grain direction and defining a substantially uniform distance along the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L. The particles exhibit a disrupted grain structure with prominent end and surface checks that greatly enhances their skeletal surface area as compared to their envelope surface area. The L × H dimensions define a pair of substantially parallel side surfaces characterized by substantially intact longitudinally arrayed fibers. The W × H dimensions define a pair of substantially parallel end surfaces characterized by crosscut fibers and end checking between fibers. The L × W dimensions define a pair of substantially parallel top surfaces characterized by some surface checking between longitudinally arrayed fibers. The feedstock particles are manufactured from a variety of plant biomass materials including wood, crop residues, plantation grasses, hemp, bagasse, and bamboo.
Particle Acceleration, Magnetic Field Generation in Relativistic Shocks
NASA Technical Reports Server (NTRS)
Nishikawa, Ken-Ichi; Hardee, P.; Hededal, C. B.; Richardson, G.; Sol, H.; Preece, R.; Fishman, G. J.
2005-01-01
Shock acceleration is a ubiquitous phenomenon in astrophysical plasmas. Plasma waves and their associated instabilities (e.g., the Buneman instability, two-streaming instability, and the Weibel instability) created in the shocks are responsible for particle (electron, positron, and ion) acceleration. Using a 3-D relativistic electromagnetic particle (REMP) code, we have investigated particle acceleration associated with a relativistic jet front propagating through an ambient plasma with and without initial magnetic fields. We find only small differences in the results between no ambient and weak ambient parallel magnetic fields. Simulations show that the Weibel instability created in the collisionless shock front accelerates particles perpendicular and parallel to the jet propagation direction. New simulations with an ambient perpendicular magnetic field show the strong interaction between the relativistic jet and the magnetic fields. The magnetic fields are piled up by the jet and the jet electrons are bent, which creates currents and displacement currents. At the nonlinear stage, the magnetic fields are reversed by the current and reconnection may take place. Due to these dynamics the jet and ambient electrons are strongly accelerated in both parallel and perpendicular directions.
Particle Acceleration, Magnetic Field Generation, and Emission in Relativistic Shocks
NASA Technical Reports Server (NTRS)
Nishikawa, Ken-Ichi; Hededal, C.; Hardee, P.; Richardson, G.; Preece, R.; Sol, H.; Fishman, G.
2004-01-01
Shock acceleration is a ubiquitous phenomenon in astrophysical plasmas. Plasma waves and their associated instabilities (e.g., the Buneman instability, two-streaming instability, and the Weibel instability) created in the shocks are responsible for particle (electron, positron, and ion) acceleration. Using a 3-D relativistic electromagnetic particle (REMP) code, we have investigated particle acceleration associated with a relativistic jet front propagating through an ambient plasma with and without initial magnetic fields. We find only small differences in the results between no ambient and weak ambient parallel magnetic fields. Simulations show that the Weibel instability created in the collisionless shock front accelerates particles perpendicular and parallel to the jet propagation direction. New simulations with an ambient perpendicular magnetic field show the strong interaction between the relativistic jet and the magnetic fields. The magnetic fields are piled up by the jet and the jet electrons are bent, which creates currents and displacement currents. At the nonlinear stage, the magnetic fields are reversed by the current and reconnection may take place. Due to these dynamics the jet and ambient electrons are strongly accelerated in both parallel and perpendicular directions.
Huang, Yu; Guo, Feng; Li, Yongling; Liu, Yufeng
2015-01-01
Parameter estimation for fractional-order chaotic systems is an important issue in fractional-order chaotic control and synchronization and could be essentially formulated as a multidimensional optimization problem. A novel algorithm called quantum parallel particle swarm optimization (QPPSO) is proposed to solve the parameter estimation for fractional-order chaotic systems. The parallel characteristic of quantum computing is used in QPPSO. This characteristic increases the calculation of each generation exponentially. The behavior of particles in quantum space is restrained by the quantum evolution equation, which consists of the current rotation angle, individual optimal quantum rotation angle, and global optimal quantum rotation angle. Numerical simulation based on several typical fractional-order systems and comparisons with some typical existing algorithms show the effectiveness and efficiency of the proposed algorithm. PMID:25603158
Acceleration of charged particles by crossed cyclotron waves, Resonant Moments Method
NASA Astrophysics Data System (ADS)
Ponomarjov, M.; Carati, D.
A mechanism for enhanced acceleration of charged particles in crossing radio-frequency waves or microwaves propagating at different angles with respect to an external magnetic field is investigated. This mechanism consists in introducing low amplitude secondary waves in order to improve the parallel momentum transfer from the high amplitude primary wave to charged particles. The use of two parallel counter-propagating waves has recently been considered (Gell and Nakach, 1999) and numerical tests (Louies et al, 2001) have shown that the two-wave scheme may lead to higher averaged parallel velocity. On the other hand, it has been concluded that it may be more effective to accelerate electrons when the waves propagate obliquely to the external magnetic field (Karimabadi and Angelopoulos 1989, Cohen et al 1991). The idea considered here is similar although no constraint is imposed on the refraction indices of the primary and the secondary waves. The theoretical analysis of the acceleration mechanism is based on the Resonance Moments Method (RMM) in which moments of the velocity distribution are computed by using averages over the resonant layers (RL)i only instead of a complete phase-space average. The quantities obtained using this approach, referred to as Resonant Moments (RM), suggest the existence of optimal angles of propagation for the primary and secondary waves as long as the maximization of the parallel flux of charged particles is considered. The fraction of charged particles that are close to the resonance conditions, which correspond to the RL, then becomes as important as the time these particles remain resonant. The secondary wave tends to maintain a pseudo-equilibrium velocity distribution by continuously re-filling the RL. Our suggestions are confirmed by direct numerical simulations for a population of 10^5 relativistic electrons. The secondary wave yields a clear increase (up to one order of magnitude) of the average parallel velocity of the particles. This is a quite promising result since the amplitude of the secondary wave is ten times lower than that of the primary wave. Qualitative results give one of the enhanced acceleration mechanisms of charged particles (including relativistic electrons in planetary magnetospheres) by crossed cyclotron waves in an ambient magnetic field.
ERIC Educational Resources Information Center
Givhan, Shawn T.
2013-01-01
This dissertation study chronicles the creation of a computer-based, asynchronously delivered diversity training course for a state agency. The course format enabled efficient delivery of a mandatory curriculum to the Massachusetts Department of State Police workforce. However, the asynchronous format posed a challenge to achieving the learning…
Anonymity and Motivation in Asynchronous Discussions and L2 Vocabulary Learning
ERIC Educational Resources Information Center
Polat, Nihat; Mancilla, Rae; Mahalingappa, Laura
2013-01-01
This study investigates L2 attainment in asynchronous online environments, specifically possible relationships among anonymity, L2 motivation, participation in discussions, quality of L2 production, and success in L2 vocabulary learning. It examines, in asynchronous discussions, (a) if participation and (b) motivation contribute to L2 vocabulary…
Exploring the Effect of Scripted Roles on Cognitive Presence in Asynchronous Online Discussions
ERIC Educational Resources Information Center
Olesova, Larisa; Slavin, Margaret; Lim, Jieun
2016-01-01
The purpose of this study was to identify the effect of scripted roles on students' level of cognitive presence in asynchronous online threaded discussions. A quantitative content analysis was used to investigate: (1) what level of cognitive presence is achieved by students' assigned roles in asynchronous online discussions; (2) differences…
ERIC Educational Resources Information Center
Çardak, Çigdem Suzan
2016-01-01
This article focusses on graduate level students' interactions during asynchronous CMC activities of an online course about the teaching profession in Turkey. The instructor of the course designed and facilitated a semester-long asynchronous CMC on forum discussions, and investigated the interaction of learners in multiple perspectives: learners'…
ERIC Educational Resources Information Center
Kitade, Keiko
2006-01-01
Based on recent studies, computer-mediated communication (CMC) has been considered a tool to aid in language learning on account of its distinctive interactional features. However, most studies have referred to "synchronous" CMC and neglected to investigate how "asynchronous" CMC contributes to language learning. Asynchronous CMC possesses…
ERIC Educational Resources Information Center
McGuire, Beverley Foulks
2016-01-01
This paper considers how instructors of asynchronous online courses in the Humanities might integrate intangibles associated with face-to-face instruction into their online environments. It presents a case study of asynchronous online instruction in a philosophy and religion department at a midsize public university in the southeastern United…
ERIC Educational Resources Information Center
Kian-Sam, Hong; Lee, Julia Ai Cheng
2008-01-01
Blended learning, using e-learning tools to supplement existing on campus learning, often incorporates asynchronous computer conferencing as a means of augmenting knowledge construction among students. This case study reports findings about levels of knowledge construction amongst adult postgraduate students in six asynchronous computer…
Asynchronous Learning Sources in a High-Tech Organization
ERIC Educational Resources Information Center
Bouhnik, Dan; Giat, Yahel; Sanderovitch, Yafit
2009-01-01
Purpose: The purpose of this study is to characterize learning from asynchronous sources among research and development (R&D) personnel. It aims to examine four aspects of asynchronous source learning: employee preferences regarding self-learning; extent of source usage; employee satisfaction with these sources and the effect of the sources on the…
Asynchronous reference frame agreement in a quantum network
NASA Astrophysics Data System (ADS)
Islam, Tanvirul; Wehner, Stephanie
2016-03-01
An efficient implementation of many multiparty protocols for quantum networks requires that all the nodes in the network share a common reference frame. Establishing such a reference frame from scratch is especially challenging in an asynchronous network where network links might have arbitrary delays and the nodes do not share synchronised clocks. In this work, we study the problem of establishing a common reference frame in an asynchronous network of n nodes of which at most t are affected by arbitrary unknown error, and the identities of the faulty nodes are not known. We present a protocol that allows all the correctly functioning nodes to agree on a common reference frame as long as the network graph is complete and not more than t < n/4 nodes are faulty. As the protocol is asynchronous, it can be used with some assumptions to synchronise clocks over a network. Also, the protocol has the appealing property that it allows any existing two-node asynchronous protocol for reference frame agreement to be lifted to a robust protocol for an asynchronous quantum network.
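The fault bound stated in this abstract, t < n/4, can be made concrete with a small sketch. Only the inequality comes from the abstract; the function names below are hypothetical and are not part of the protocol described in the paper.

```python
def max_tolerated_faults(n: int) -> int:
    """Largest integer t satisfying t < n/4 (the bound quoted in the abstract)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # For integers, t < n/4 is equivalent to 4*t < n, i.e. t <= (n - 1) // 4.
    return (n - 1) // 4


def tolerates(n: int, t: int) -> bool:
    """True if t faulty nodes out of n satisfies the stated bound t < n/4."""
    return t >= 0 and 4 * t < n
```

Under this bound a network of 8 nodes admits at most one faulty node, while 9 nodes admit two.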
Chang, Todd P; Pham, Phung K; Sobolewski, Brad; Doughty, Cara B; Jamal, Nazreen; Kwan, Karen Y; Little, Kim; Brenkert, Timothy E; Mathison, David J
2014-08-01
Asynchronous e-learning allows for targeted teaching, particularly advantageous when bedside and didactic education is insufficient. An asynchronous e-learning curriculum has not been studied across multiple centers in the context of a clinical rotation. We hypothesize that an asynchronous e-learning curriculum during the pediatric emergency medicine (EM) rotation improves medical knowledge among residents and students across multiple participating centers. Trainees on pediatric EM rotations at four large pediatric centers from 2012 to 2013 were randomized in a Solomon four-group design. The experimental arms received an asynchronous e-learning curriculum consisting of nine Web-based, interactive, peer-reviewed Flash/HTML5 modules. Postrotation testing and in-training examination (ITE) scores quantified improvements in knowledge. A 2 × 2 analysis of covariance (ANCOVA) tested interaction and main effects, and Pearson's correlation tested associations between module usage, scores, and ITE scores. A total of 256 of 458 participants completed all study elements; 104 had access to asynchronous e-learning modules, and 152 were controls who used the current education standards. No pretest sensitization was found (p = 0.75). Use of asynchronous e-learning modules was associated with an improvement in posttest scores (p < 0.001), from a mean score of 18.45 (95% confidence interval [CI] = 17.92 to 18.98) to 21.30 (95% CI = 20.69 to 21.91), a large effect (partial η² = 0.19). Posttest scores correlated with ITE scores (r² = 0.14, p < 0.001) among pediatric residents. Asynchronous e-learning is an effective educational tool to improve knowledge in a clinical rotation. Web-based asynchronous e-learning is a promising modality to standardize education among multiple institutions with common curricula, particularly in clinical rotations where scheduling difficulties, seasonality, and variable experiences limit in-hospital learning.
© 2014 by the Society for Academic Emergency Medicine.
Measurement of Anisotropic Particle Interactions with Nonuniform ac Electric Fields.
Rupp, Bradley; Torres-Díaz, Isaac; Hua, Xiaoqing; Bevan, Michael A
2018-02-20
Optical microscopy measurements are reported for single anisotropic polymer particles interacting with nonuniform ac electric fields. The present study is limited to conditions where gravity confines particles with their long axis parallel to the substrate such that particles can be treated using quasi-2D analysis. Field parameters are investigated that result in particles residing at either electric field maxima or minima and with long axes oriented either parallel or perpendicular to the electric field direction. By nonintrusively observing thermally sampled positions and orientations at different field frequencies and amplitudes, a Boltzmann inversion of the time-averaged probability of states yields kT-scale energy landscapes (including dipole-field, particle-substrate, and gravitational potentials). The measured energy landscapes show agreement with theoretical potentials using particle conductivity as the sole adjustable material property. Understanding anisotropic particle-field energy landscapes vs field parameters enables quantitative control of local forces and torques on single anisotropic particles to manipulate their position and orientation within nonuniform fields.
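The Boltzmann inversion mentioned in this abstract can be sketched as follows: if positions are sampled in thermal equilibrium with probability p(x), the energy landscape follows from U(x)/kT = -ln p(x) + constant. The code is a minimal one-dimensional illustration under assumed inputs; the function name, the fixed-width binning, and the choice to shift the minimum to zero are assumptions and do not reproduce the authors' analysis.

```python
import math
from collections import Counter


def boltzmann_inversion(samples, bin_width):
    """Estimate U/kT per occupied bin from equilibrium samples.

    Returns {bin_center: U/kT}, shifted so that the lowest energy is 0.
    Bins that were never visited are simply absent (their energy is unknown).
    """
    if bin_width <= 0 or not samples:
        raise ValueError("need a positive bin width and at least one sample")
    # Histogram the samples into fixed-width bins.
    counts = Counter(math.floor(x / bin_width) for x in samples)
    total = len(samples)
    # U/kT = -ln p, with p estimated as the fraction of samples in the bin.
    energies = {
        (k + 0.5) * bin_width: -math.log(c / total) for k, c in counts.items()
    }
    u_min = min(energies.values())
    return {x: u - u_min for x, u in energies.items()}
```

A state visited four times less often than the most probable one comes out ln 4 (about 1.39 kT) higher in energy.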
Modeling and Analysis of Asynchronous Systems Using SAL and Hybrid SAL
NASA Technical Reports Server (NTRS)
Tiwari, Ashish; Dutertre, Bruno
2013-01-01
We present formal models and results of formal analysis of two different asynchronous systems. We first examine a mid-value select module that merges the signals coming from three different sensors that are each asynchronously sampling the same input signal. We then consider the phase locking protocol proposed by Daly, Hopkins, and McKenna. This protocol is designed to keep a set of non-faulty (asynchronous) clocks phase locked even in the presence of Byzantine-faulty clocks on the network. All models and verifications have been developed using the SAL model checking tools and the Hybrid SAL abstractor.
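A mid-value select over three sensor readings is commonly realized as a median of three, which masks a single outlying or faulty value. The sketch below assumes that reading of the term; the abstract does not give the module's definition, and the function name is hypothetical.

```python
def mid_value_select(a: float, b: float, c: float) -> float:
    """Return the middle (median) of three sensor readings."""
    # Sorting three values and taking the middle one discards both extremes,
    # so one arbitrarily wrong reading cannot become the output.
    return sorted((a, b, c))[1]
```

For example, with readings 1.0, 1.1, and a faulty 99.0, the selected value is 1.1.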
Detection of Failure in Asynchronous Motor Using Soft Computing Method
NASA Astrophysics Data System (ADS)
Vinoth Kumar, K.; Sony, Kevin; Achenkunju John, Alan; Kuriakose, Anto; John, Ano P.
2018-04-01
This paper investigates stator short-winding failures of asynchronous motors and their effects on the motor current spectra. A fuzzy logic approach, i.e., a model-based technique, may help to detect asynchronous motor failures. Fuzzy logic resembles human reasoning in that it draws linguistic inferences from vague data. A dynamic model of the asynchronous motor is developed, and a fuzzy logic classifier is used to investigate stator inter-turn failures as well as open-phase failures. A hardware implementation was carried out with LabVIEW for the online monitoring of faults.
Asynchronous discrete control of continuous processes
NASA Astrophysics Data System (ADS)
Kaliski, M. E.; Johnson, T. L.
1984-07-01
The research during this second contract year continued to deal with the development of sound theoretical models for asynchronous systems. Two criteria served to shape the research pursued: the first, that the developed models extend and generalize previously developed research for synchronous discrete control; the second, that the models explicitly address the question of how to incorporate system transition times into themselves. The following sections of this report concisely delineate this year's work. Our original proposal for this research identified four general tasks of investigation: (1.1) Analysis of Qualitative Properties of Asynchronous Hybrid Systems; (1.2) Acceptance and Control for Asynchronous Hybrid Systems.
Solar wind interaction with Venus and Mars in a parallel hybrid code
NASA Astrophysics Data System (ADS)
Jarvinen, Riku; Sandroos, Arto
2013-04-01
We discuss the development and applications of a new parallel hybrid simulation, where ions are treated as particles and electrons as a charge-neutralizing fluid, for the interaction between the solar wind and Venus and Mars. The new simulation code under construction is based on the algorithm of the sequential global planetary hybrid model developed at the Finnish Meteorological Institute (FMI) and on the Corsair parallel simulation platform also developed at the FMI. The FMI's sequential hybrid model has been used for studies of plasma interactions of several unmagnetized and weakly magnetized celestial bodies for more than a decade. In particular, the model has been used to interpret in situ particle and magnetic field observations from plasma environments of Mars, Venus and Titan. Further, Corsair is an open source MPI (Message Passing Interface) particle and mesh simulation platform, aimed mainly at simulations of diffusive shock acceleration in the solar corona and interplanetary space, but which is now also being extended for global planetary hybrid simulations. In this presentation we discuss challenges and strategies of parallelizing a legacy simulation code as well as possible applications and prospects of a scalable parallel hybrid model for the solar wind interactions of Venus and Mars.
Engineered plant biomass particles coated with bioactive agents
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dooley, James H; Lanning, David N
Plant biomass particles coated with a bioactive agent such as a fertilizer or pesticide, characterized by a length dimension (L) aligned substantially parallel to a grain direction and defining a substantially uniform distance along the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L. In particular, the L × H dimensions define a pair of substantially parallel side surfaces characterized by substantially intact longitudinally arrayed fibers, the W × H dimensions define a pair of substantially parallel end surfaces characterized by crosscut fibers and end checking between fibers, and the L × W dimensions define a pair of substantially parallel top and bottom surfaces.
Electric currents and voltage drops along auroral field lines
NASA Technical Reports Server (NTRS)
Stern, D. P.
1983-01-01
An assessment is presented of the current state of knowledge concerning Birkeland currents and the parallel electric field, with discussions focusing on the Birkeland primary region 1 sheets, the region 2 sheets which parallel them and appear to close in the partial ring current, the cusp currents (which may be correlated with the interplanetary B(y) component), and the Harang filament. The energy required by the parallel electric field and the associated particle acceleration processes appears to be derived from the Birkeland currents, for which evidence is adduced from particles, inverted V spectra, rising ion beams and expanded loss cones. Conics may on the other hand signify acceleration by electrostatic ion cyclotron waves associated with beams accelerated by the parallel electric field.
Hierarchical fractional-step approximations and parallel kinetic Monte Carlo algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arampatzis, Giorgos, E-mail: garab@math.uoc.gr; Katsoulakis, Markos A., E-mail: markos@math.umass.edu; Plechac, Petr, E-mail: plechac@math.udel.edu
2012-10-01
We present a mathematical framework for constructing and analyzing parallel algorithms for lattice kinetic Monte Carlo (KMC) simulations. The resulting algorithms have the capacity to simulate a wide range of spatio-temporal scales in spatially distributed, non-equilibrium physiochemical processes with complex chemistry and transport micro-mechanisms. Rather than focusing on constructing exactly the stochastic trajectories, our approach relies on approximating the evolution of observables, such as density, coverage, correlations and so on. More specifically, we develop a spatial domain decomposition of the Markov operator (generator) that describes the evolution of all observables according to the kinetic Monte Carlo algorithm. This domain decomposition corresponds to a decomposition of the Markov generator into a hierarchy of operators and can be tailored to specific hierarchical parallel architectures such as multi-core processors or clusters of Graphical Processing Units (GPUs). Based on this operator decomposition, we formulate parallel Fractional step kinetic Monte Carlo algorithms by employing the Trotter Theorem and its randomized variants; these schemes (a) are partially asynchronous on each fractional step time-window, and (b) are characterized by their communication schedule between processors. The proposed mathematical framework allows us to rigorously justify the numerical and statistical consistency of the proposed algorithms, showing the convergence of our approximating schemes to the original serial KMC. The approach also provides a systematic evaluation of different processor communicating schedules. We carry out a detailed benchmarking of the parallel KMC schemes using available exact solutions, for example, in Ising-type systems and we demonstrate the capabilities of the method to simulate complex spatially distributed reactions at very large scales on GPUs. Finally, we discuss work load balancing between processors and propose a re-balancing scheme based on probabilistic mass transport methods.
Particle-in-cell simulations with charge-conserving current deposition on graphic processing units
NASA Astrophysics Data System (ADS)
Ren, Chuang; Kong, Xianglong; Huang, Michael; Decyk, Viktor; Mori, Warren
2011-10-01
Recently, using CUDA, we have developed an electromagnetic Particle-in-Cell (PIC) code with charge-conserving current deposition for Nvidia graphic processing units (GPUs) (Kong et al., Journal of Computational Physics 230, 1676 (2011)). On a Tesla M2050 (Fermi) card, the GPU PIC code can achieve a one-particle-step process time of 1.2 - 3.2 ns in 2D and 2.3 - 7.2 ns in 3D, depending on plasma temperatures. In this talk we will discuss novel algorithms for GPU-PIC including a charge-conserving current deposition scheme with little branching and parallel particle sorting. These algorithms have made efficient use of the GPU shared memory. We will also discuss how to replace the computation kernels of existing parallel CPU codes while keeping their parallel structures. This work was supported by U.S. Department of Energy under Grant Nos. DE-FG02-06ER54879 and DE-FC02-04ER54789 and by NSF under Grant Nos. PHY-0903797 and CCF-0747324.
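The per-particle-step times quoted in this abstract translate directly into throughput, since a time of τ nanoseconds per particle-step corresponds to 10^9/τ particle-steps per second. The helper below is a hypothetical illustration of that arithmetic only; it is not part of the PIC code described.

```python
def particle_steps_per_second(ns_per_particle_step: float) -> float:
    """Convert a one-particle-step time in nanoseconds to particle-steps per second."""
    if ns_per_particle_step <= 0:
        raise ValueError("time per particle-step must be positive")
    # 1 second = 1e9 nanoseconds.
    return 1e9 / ns_per_particle_step
```

With the quoted figures, 1.2 ns corresponds to roughly 8.3 × 10^8 particle-steps per second (best case in 2D) and 7.2 ns to roughly 1.4 × 10^8 (worst case in 3D).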
Multinode acoustic focusing for parallel flow cytometry
Piyasena, Menake E.; Suthanthiraraj, Pearlson P. Austin; Applegate, Robert W.; Goumas, Andrew M.; Woods, Travis A.; López, Gabriel P.; Graves, Steven W.
2012-01-01
Flow cytometry can simultaneously measure and analyze multiple properties of single cells or particles with high sensitivity and precision. Yet, conventional flow cytometers have fundamental limitations with regard to analyzing particles larger than about 70 microns, analyzing at flow rates greater than a few hundred microliters per minute, and providing analysis rates greater than 50,000 per second. To overcome these limits, we have developed multi-node acoustic focusing flow cells that can position particles (as small as a red blood cell and as large as 107 microns in diameter) into as many as 37 parallel flow streams. We demonstrate the potential of such flow cells for the development of high throughput, parallel flow cytometers by precision focusing of flow cytometry alignment microspheres and red blood cells, and by the analysis of a CD4+ cellular immunophenotyping assay. This approach will have significant impact towards the creation of high throughput flow cytometers for rare cell detection applications (e.g. circulating tumor cells), applications requiring large particle analysis, and high volume flow cytometry. PMID:22239072
ERIC Educational Resources Information Center
Teng, Tian-Lih; Taveras, Marypat
2004-01-01
This article outlines the evolution of a unique distance education program that began as a hybrid--combining face-to-face instruction with asynchronous online teaching--and evolved to become an innovative combination of synchronous education using live streaming video, audio, and chat over the Internet, blended with asynchronous online discussions…
Utilizing Spectrum Efficiently (USE)
2011-02-28
4.8 Space-Time Coded Asynchronous DS-CDMA with Decentralized MAI Suppression: Performance and...numerical results. 4.8 Space-Time Coded Asynchronous DS-CDMA with Decentralized MAI Suppression: Performance and Spectral Efficiency In [60] multiple...supported at a given signal-to-interference ratio in asynchronous direct-sequence code-division multiple-access (DS-CDMA) systems was examined. It was
ERIC Educational Resources Information Center
Hung, Min-Ling; Chou, Chien
2014-01-01
The purpose of this study was to identify dimensions of students' communication satisfaction in an asynchronous discussion forum. An asynchronous discussion may be defined as text-based human-to-human communication via computer networks that provides a platform for the participants to interact with one another to exchange ideas, insights, and…
ERIC Educational Resources Information Center
Jarvela, Sanna; Hakkinen, Paivi
2002-01-01
Examines the quality of asynchronous interaction in Web-based conferencing among preservice teachers. The study combines asynchronous conferencing with peer and mentor collaboration to electronically apprentice student learning. Results point out different levels of Web-based discussion: higher-level, progressive, and lower-level discussion. A…
Asynchronous vs didactic education: it’s too early to throw in the towel on tradition
2013-01-01
Background Asynchronous, computer based instruction is cost effective, allows self-directed pacing and review, and addresses preferences of millennial learners. Current research suggests there is no significant difference in learning compared to traditional classroom instruction. Data are limited for novice learners in emergency medicine. The objective of this study was to compare asynchronous, computer-based instruction with traditional didactics for senior medical students during a week-long intensive course in acute care. We hypothesized both modalities would be equivalent. Methods This was a prospective observational quasi-experimental study of 4th year medical students who were novice learners with minimal prior exposure to curricular elements. We assessed baseline knowledge with an objective pre-test. The curriculum was delivered in either traditional lecture format (shock, acute abdomen, dyspnea, field trauma) or via asynchronous, computer-based modules (chest pain, EKG interpretation, pain management, trauma). An interactive review covering all topics was followed by a post-test. Knowledge retention was measured after 10 weeks. Pre and post-test items were written by a panel of medical educators and validated with a reference group of learners. Mean scores were analyzed using dependent t-test and attitudes were assessed by a 5-point Likert scale. Results 44 of 48 students completed the protocol. Students initially acquired more knowledge from didactic education as demonstrated by mean gain scores (didactic: 28.39% ± 18.06; asynchronous 9.93% ± 23.22). Mean difference between didactic and asynchronous = 18.45% with 95% CI [10.40 to 26.50]; p = 0.0001. Retention testing demonstrated similar knowledge attrition: mean gain scores −14.94% (didactic); -17.61% (asynchronous), which was not significantly different: 2.68% ± 20.85, 95% CI [−3.66 to 9.02], p = 0.399. 
The attitudinal survey revealed that 60.4% of students believed the asynchronous modules were educational and 95.8% enjoyed the flexibility of the method. 39.6% of students preferred asynchronous education for required didactics; 37.5% were neutral; 23% preferred traditional lectures. Conclusions Asynchronous, computer-based instruction was not equivalent to traditional didactics for novice learners of acute care topics. Interactive, standard didactic education was valuable. Retention rates were similar between instructional methods. Students had mixed attitudes toward asynchronous learning but enjoyed the flexibility. We urge caution in trading in traditional didactic lectures in favor of asynchronous education for novice learners in acute care. PMID:23927420
[A novel proposal explaining sleep disturbance of children in Japan--asynchronization].
Kohyama, Jun
2008-07-01
It has been reported that more than half of the children in Japan suffer from daytime sleepiness. In contrast, about one quarter of junior high-school students in Japan complain of insomnia. According to the International Classification of Sleep Disorders (Second edition), these children could be diagnosed as having behaviorally-induced insufficient sleep syndrome due to inadequate sleeping habits. Getting an adequate amount of sleep should solve such problems; however, such a therapeutic approach often fails. Although social factors are involved in these sleep disturbances, I feel that a novel notion - asynchronization - leads to an understanding of the pathophysiology of disturbances in these children. Further, it could contribute to resolving their problems. The essence of asynchronization is a disturbance of various aspects (e.g., cycle, amplitude, phase, and interrelationship) of the biological rhythms that normally exhibit circadian oscillation. The main cause of asynchronization is hypothesized to be the combination of light exposure during the night and the lack of light exposure in the morning. Asynchronization results in the disturbance of various systems. Thus, symptoms of asynchronization include disturbances of the autonomic nervous system (sleepiness, insomnia, disturbance of hormonal excretion, gastrointestinal problems, etc.) and higher brain function (disorientation, loss of sociality, loss of will or motivation, impaired alertness and performance, etc.). Neurological (attention deficit, aggression, impulsiveness, hyperactivity, etc.), psychiatric (depressive disorders, personality disorders, anxiety disorders, etc.) and somatic (tiredness, fatigue, etc.) disturbances could also be symptoms of asynchronization. At the initial phase of asynchronization, disturbances are functional and can be resolved relatively easily, such as by the establishment of a regular sleep-wakefulness cycle; however, without adequate intervention the disturbances could gradually worsen and become hard to resolve.
NASA Astrophysics Data System (ADS)
Song, Y.; Lysak, R. L.
2015-12-01
Parallel E-fields play a crucial role in the acceleration of charged particles, creating discrete aurorae. However, once the parallel electric fields are produced, they will disappear right away, unless the electric fields can be continuously generated and sustained for a fairly long time. Thus, the crucial question in auroral physics is how to generate such powerful and self-sustained parallel electric fields, which can effectively accelerate charged particles to high energy over a fairly long time. We propose that nonlinear interaction of incident and reflected Alfven wave packets in the inhomogeneous auroral acceleration region can produce quasi-stationary non-propagating electromagnetic plasma structures, such as Alfvenic double layers (DLs) and Charge Holes. Such Alfvenic quasi-static structures often constitute powerful high energy particle accelerators. The Alfvenic DL consists of localized self-sustained powerful electrostatic electric fields nested in a low density cavity and surrounded by enhanced magnetic and mechanical stresses. The enhanced magnetic and velocity fields carrying the free energy serve as a local dynamo, which continuously creates the electrostatic parallel electric field for a fairly long time. The generated parallel electric fields will deepen the seed low density cavity, which then further quickly boosts the stronger parallel electric fields creating both Alfvenic and quasi-static discrete aurorae. The parallel electrostatic electric field can also cause ion outflow, perpendicular ion acceleration and heating, and may excite Auroral Kilometric Radiation.
Evaluating the performance of the particle finite element method in parallel architectures
NASA Astrophysics Data System (ADS)
Gimenez, Juan M.; Nigro, Norberto M.; Idelsohn, Sergio R.
2014-05-01
This paper presents a high performance implementation for the particle-mesh based method called particle finite element method two (PFEM-2). It consists of a material derivative based formulation of the equations with a hybrid spatial discretization which uses an Eulerian mesh and Lagrangian particles. The main aim of PFEM-2 is to solve transport equations as fast as possible while keeping some level of accuracy. The method was found to be competitive with classical Eulerian alternatives for these targets, even in their range of optimal application. To evaluate the goodness of the method with large simulations, it is imperative to use parallel environments. Parallel strategies for the Finite Element Method have been widely studied and many libraries can be used to solve the Eulerian stages of PFEM-2. However, Lagrangian stages, such as streamline integration, must be developed considering the parallel strategy selected. The main drawback of PFEM-2 is the large amount of memory needed, which limits its application to large problems with only one computer. Therefore, a distributed-memory implementation is urgently needed. Unlike in a shared-memory approach, with domain decomposition the memory is automatically isolated, thus avoiding race conditions; however, new issues appear due to data distribution over the processes. Thus, a domain decomposition strategy for both particle and mesh is adopted, which minimizes the communication between processes. Finally, performance analyses running over multicore and multinode architectures are presented. The Courant-Friedrichs-Lewy number used influences the efficiency of the parallelization and, in some cases, a weighted partitioning can be used to improve the speed-up. However, the total CPU time for the cases presented is lower than that obtained when using classical Eulerian strategies.
Acceleration of low-energy ions at parallel shocks with a focused transport model
Zuo, Pingbing; Zhang, Ming; Rassoul, Hamid K.
2013-04-10
Here, we present a test particle simulation on the injection and acceleration of low-energy suprathermal particles by parallel shocks with a focused transport model. The focused transport equation contains all necessary physics of shock acceleration, but avoids the limitation of diffusive shock acceleration (DSA) that requires a small pitch angle anisotropy. This simulation verifies that the particles with speeds of a fraction of to a few times the shock speed can indeed be directly injected and accelerated into the DSA regime by parallel shocks. At higher energies starting from a few times the shock speed, the energy spectrum of accelerated particles is a power law with the same spectral index as the solution of standard DSA theory, although the particles are highly anisotropic in the upstream region. The intensity, however, is different from that predicted by DSA theory, indicating a different level of injection efficiency. It is found that the shock strength, the injection speed, and the intensity of an electric cross-shock potential (CSP) jump can affect the injection efficiency of the low-energy particles. A stronger shock has a higher injection efficiency. In addition, if the speed of injected particles is above a few times the shock speed, the produced power-law spectrum is consistent with the prediction of standard DSA theory in both its intensity and spectrum index with an injection efficiency of 1. CSP can increase the injection efficiency through direct particle reflection back upstream, but it has little effect on the energetic particle acceleration once the speed of injected particles is beyond a few times the shock speed. This test particle simulation proves that the focused transport theory is an extension of DSA theory with the capability of predicting the efficiency of particle injection.
FIRE HOSE INSTABILITY DRIVEN BY ALPHA PARTICLE TEMPERATURE ANISOTROPY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matteini, L.; Schwartz, S. J.; Hellinger, P.
We investigate properties of a solar wind-like plasma, including a secondary alpha particle population exhibiting a parallel temperature anisotropy with respect to the background magnetic field, using linear and quasi-linear predictions and by means of one-dimensional hybrid simulations. We show that anisotropic alpha particles can drive a parallel fire hose instability analogous to that generated by protons, but that, remarkably, can also be triggered when the parallel plasma beta of alpha particles is below unity. The wave activity generated by the alpha anisotropy affects the evolution of the more abundant protons, leading to their anisotropic heating. When both ion species have sufficient parallel anisotropies, both of them can drive the instability, and we observe the generation of two distinct peaks in the spectra of the fluctuations, with longer wavelengths associated to alphas and shorter ones to protons. If a non-zero relative drift is present, the unstable modes propagate preferentially in the direction of the drift associated with the unstable species. The generated waves scatter particles and reduce their temperature anisotropy to a marginally stable state, and, moreover, they significantly reduce the relative drift between the two ion populations. The coexistence of modes excited by both species leads to saturation of the plasma in distinct regions of the beta/anisotropy parameter space for protons and alpha particles, in good agreement with in situ solar wind observations. Our results confirm that fire hose instabilities are likely at work in the solar wind and limit the anisotropy of different ion species in the plasma.
Stand-Alone and Hybrid Positioning Using Asynchronous Pseudolites
Gioia, Ciro; Borio, Daniele
2015-01-01
Global navigation satellite system (GNSS) receivers are usually unable to achieve satisfactory performance in difficult environments, such as open-pit mines, urban canyons and indoors. Pseudolites have the potential to extend GNSS usage and significantly improve receiver performance in such environments by providing additional navigation signals. This also applies to asynchronous pseudolite systems, where different pseudolites operate in an independent way. Asynchronous pseudolite systems require, however, dedicated strategies in order to properly integrate GNSS and pseudolite measurements. In this paper, several asynchronous pseudolite/GNSS integration strategies are considered: loosely- and tightly-coupled approaches are developed and combined with pseudolite proximity and receiver signal strength (RSS)-based positioning. The performance of the approaches proposed has been tested in different scenarios, including static and kinematic conditions. The tests performed demonstrate that the methods developed are effective techniques for integrating heterogeneous measurements from different sources, such as asynchronous pseudolites and GNSS. PMID:25609041
Distributed Consensus of Stochastic Delayed Multi-agent Systems Under Asynchronous Switching.
Wu, Xiaotai; Tang, Yang; Cao, Jinde; Zhang, Wenbing
2016-08-01
In this paper, the distributed exponential consensus of stochastic delayed multi-agent systems with nonlinear dynamics is investigated under asynchronous switching. The asynchronous switching considered here accounts for the time needed to identify the active modes of multi-agent systems. Only after confirmation of a mode switch is received can the matched controller be applied, which means that the switching time of the matched controller in each node usually lags behind that of the system switching. In order to handle the coexistence of switched signals and stochastic disturbances, a comparison principle of stochastic switched delayed systems is first proved. By means of this extended comparison principle, several easy-to-verify conditions for the existence of an asynchronously switched distributed controller are derived such that stochastic delayed multi-agent systems with asynchronous switching and nonlinear dynamics can achieve global exponential consensus. Two examples are given to illustrate the effectiveness of the proposed method.
Hydrodynamic advantages of swimming by salp chains.
Sutherland, Kelly R; Weihs, Daniel
2017-08-01
Salps are marine invertebrates comprising multiple jet-propelled swimming units during a colonial life-cycle stage. Using theory, we show that asynchronous swimming with multiple pulsed jets yields substantial hydrodynamic benefit due to the production of steady swimming velocities, which limit drag. Laboratory comparisons of swimming kinematics of aggregate salps (Salpa fusiformis and Weelia cylindrica) using high-speed video supported that asynchronous swimming by aggregates results in a smoother velocity profile and showed that this smoother velocity profile is the result of uncoordinated, asynchronous swimming by individual zooids. In situ flow visualizations of W. cylindrica swimming wakes revealed that another consequence of asynchronous swimming is that fluid interactions between jet wakes are minimized. Although the advantages of multi-jet propulsion have been mentioned elsewhere, this is the first time that the theory has been quantified and the role of asynchronous swimming verified using experimental data from the laboratory and the field. © 2017 The Author(s).
Transport in the plateau regime in a tokamak pedestal
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seol, J.; Shaing, K. C.
In a tokamak H-mode, a strong E×B flow shear is generated during the L-H transition. Turbulence in a pedestal is suppressed significantly by this E×B flow shear. In this case, neoclassical transport may become important. The neoclassical fluxes are calculated in the plateau regime with the parallel plasma flow using their kinetic definitions. In an axisymmetric tokamak, the neoclassical particle fluxes can be decomposed into the banana-plateau flux and the Pfirsch-Schlueter flux. The banana-plateau particle flux is driven by the parallel viscous force and the Pfirsch-Schlueter flux by the poloidal variation of the friction force. The combined quantity of the radial electric field and the parallel flow is determined by the flux surface averaged parallel momentum balance equation rather than requiring the ambipolarity of the total particle fluxes. In this process, the Pfirsch-Schlueter flux does not appear in the flux surface averaged parallel momentum equation. Only the banana-plateau flux is used to determine the parallel flow in the form of the flux surface averaged parallel viscosity. The heat flux, obtained using the solution of the parallel momentum balance equation, decreases exponentially in the presence of sonic M_p without any enhancement over that in the standard neoclassical theory. Here, M_p is a combination of the poloidal E×B flow and the parallel mass flow. The neoclassical bootstrap current in the plateau regime is presented. It indicates that the neoclassical bootstrap current also is related only to the banana-plateau fluxes. Finally, transport fluxes are calculated when M_p is large enough to make the parallel electron viscosity comparable with the parallel ion viscosity. It is found that the bootstrap current has a finite value regardless of the magnitude of M_p.
Precision wood particle feedstocks
Dooley, James H; Lanning, David N
2013-07-30
Wood particles having fibers aligned in a grain, wherein: the wood particles are characterized by a length dimension (L) aligned substantially parallel to the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L; the L×H dimensions define two side surfaces characterized by substantially intact longitudinally arrayed fibers; the W×H dimensions define two cross-grain end surfaces characterized individually as aligned either normal to the grain or oblique to the grain; the L×W dimensions define two substantially parallel top and bottom surfaces; and, a majority of the W×H surfaces in the mixture of wood particles have end checking.
Elliptically polarizing adjustable phase insertion device
Carr, R.
1995-01-17
An insertion device for extracting polarized electromagnetic energy from a beam of particles is disclosed. The insertion device includes four linear arrays of magnets which are aligned with the particle beam. The magnetic field strength to which the particles are subjected is adjusted by altering the relative alignment of the arrays in a direction parallel to that of the particle beam. Both the energy and polarization of the extracted energy may be varied by moving the relevant arrays parallel to the beam direction. The present invention requires a substantially simpler and more economical superstructure than insertion devices in which the magnetic field strength is altered by changing the gap between arrays of magnets. 3 figures.
Basic concepts and architectural details of the Delphi trigger system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bocci, V.; Booth, P.S.L.; Bozzo, M.
1995-08-01
Delphi (DEtector with Lepton, Photon and Hadron Identification) is one of the four experiments of the LEP (Large Electron Positron) collider at CERN. The detector is laid out to provide a nearly 4π coverage for charged particle tracking, electromagnetic and hadronic calorimetry, and extended particle identification. The trigger system consists of four levels. The first two are synchronous with the BCO (Beam Cross Over) and rely on hardwired control units, while the last two are performed asynchronously with respect to the BCO and are driven by the Delphi host computers. The aim of this paper is to give a comprehensive global view of the trigger system architecture, presenting in detail the first two levels, their various hardware components and the latest modifications introduced in order to improve their performance and make the whole software user interface more user friendly.
Parallel momentum input by tangential neutral beam injections in stellarator and heliotron plasmas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nishimura, S., E-mail: nishimura.shin@lhd.nifs.ac.jp; Nakamura, Y.; Nishioka, K.
The configuration dependence of parallel momentum inputs to target plasma particle species by tangentially injected neutral beams is investigated in non-axisymmetric stellarator/heliotron model magnetic fields by assuming the existence of magnetic flux-surfaces. In parallel friction integrals of the full Rosenbluth-MacDonald-Judd collision operator in thermal particles' kinetic equations, numerically obtained eigenfunctions are used for excluding trapped fast ions that cannot contribute to the friction integrals. It is found that the momentum inputs to thermal ions strongly depend on magnetic field strength modulations on the flux-surfaces, while the input to electrons is insensitive to the modulation. In future plasma flow studies requiring flow calculations of all particle species in more general non-symmetric toroidal configurations, the eigenfunction method investigated here will be useful.
Mode structure symmetry breaking of energetic particle driven beta-induced Alfvén eigenmode
NASA Astrophysics Data System (ADS)
Lu, Z. X.; Wang, X.; Lauber, Ph.; Zonca, F.
2018-01-01
The mode structure symmetry breaking of energetic particle driven Beta-induced Alfvén Eigenmode (BAE) is studied based on global theory and simulation. The weak coupling formula gives a reasonable estimate of the local eigenvalue compared with global hybrid simulation using XHMGC. The non-perturbative effect of energetic particles on global mode structure symmetry breaking in radial and parallel (along B) directions is demonstrated. With the contribution from energetic particles, two dimensional (radial and poloidal) BAE mode structures with symmetric/asymmetric tails are produced using an analytical model. It is demonstrated that the symmetry breaking in radial and parallel directions is intimately connected. The effects of mode structure symmetry breaking on nonlinear physics, energetic particle transport, and the possible insight for experimental studies are discussed.
Data processing in Software-type Wave-Particle Interaction Analyzer onboard the Arase satellite
NASA Astrophysics Data System (ADS)
Hikishima, Mitsuru; Kojima, Hirotsugu; Katoh, Yuto; Kasahara, Yoshiya; Kasahara, Satoshi; Mitani, Takefumi; Higashio, Nana; Matsuoka, Ayako; Miyoshi, Yoshizumi; Asamura, Kazushi; Takashima, Takeshi; Yokota, Shoichiro; Kitahara, Masahiro; Matsuda, Shoya
2018-05-01
The software-type wave-particle interaction analyzer (S-WPIA) is an instrument package onboard the Arase satellite, which studies the magnetosphere. The S-WPIA represents a new method for directly observing wave-particle interactions onboard a spacecraft in a space plasma environment. The main objective of the S-WPIA is to quantitatively detect wave-particle interactions associated with whistler-mode chorus emissions and electrons over a wide energy range (from several keV to several MeV). The quantity of energy exchanges between waves and particles can be represented as the inner product of the wave electric-field vector and the particle velocity vector. The S-WPIA requires accurate measurement of the phase difference between wave and particle gyration. The leading edge of the S-WPIA system allows us to collect comprehensive information, including the detection time, energy, and incoming direction of individual particles and instantaneous-wave electric and magnetic fields, at a high sampling rate. All the collected particle and waveform data are stored in the onboard large-volume data storage. The S-WPIA executes calculations asynchronously using the collected electric and magnetic wave data, data acquired from multiple particle instruments, and ambient magnetic-field data. The S-WPIA has the role of handling large amounts of raw data that are dedicated to calculations of the S-WPIA. Then, the results are transferred to the ground station. This paper describes the design of the S-WPIA and its calculations in detail, as implemented onboard Arase.
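The inner-product quantity mentioned in the abstract above can be illustrated with a minimal sketch. It uses the standard expression for the rate of work done by an electric field on a point charge, q(E·v); the function names, the sample layout, and the plain summation over detected particles are assumptions made for illustration and are not taken from the S-WPIA onboard software.

```python
# Illustrative sketch only: names and data layout are hypothetical,
# not the S-WPIA flight implementation.

ELECTRON_CHARGE = -1.602176634e-19  # coulombs (exact SI value)


def energy_transfer_rate(charge, e_field, velocity):
    """Return q * (E . v) in watts for 3-component SI vectors.

    A positive value means the particle gains energy from the wave.
    """
    dot = sum(e * v for e, v in zip(e_field, velocity))
    return charge * dot


def summed_transfer_rate(charge, samples):
    """Sum q * (E . v) over (e_field, velocity) pairs, one per detected particle."""
    return sum(energy_transfer_rate(charge, e, v) for e, v in samples)
```

A velocity perpendicular to the wave electric field contributes nothing to the sum, which is why the abstract stresses accurate measurement of the relative phase between the wave field and the particle gyration.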
NASA Astrophysics Data System (ADS)
Yunxiao, CAO; Zhiqiang, WANG; Jinjun, WANG; Guofeng, LI
2018-05-01
Electrostatic separation has been extensively used in mineral processing, and has the potential to separate gangue minerals from raw talcum ore. In electrostatic separation, the particle charging status is one of the important influencing factors. To describe the talcum particle charging status in a parallel plate electrostatic separator accurately, this paper proposes a modern image processing method. Based on the actual trajectories obtained from sequence images of particle movement and the analysis of physical forces applied on a charged particle, a numerical model is built, which can calculate the charge-to-mass ratios that represent the charging status of the particles and simulate the particle trajectories. The simulated trajectories agree well with the experimental results obtained by image processing. In addition, chemical composition analysis is employed to reveal the relationship between ferrum gangue mineral content and charge-to-mass ratios. Research results show that the proposed method is effective for describing the particle charging status in electrostatic separation.
ERIC Educational Resources Information Center
Alqadoumi, Omar Mohamed
2012-01-01
Previous studies in the field of e-tutoring dealt either with asynchronous tutoring or synchronous conferencing as modes for providing e-tutoring services to English learners. This qualitative research study reports the experiences of Arab ESL tutees with both asynchronous tutoring and synchronous conferencing. It also reports the experiences of…
ERIC Educational Resources Information Center
Angeli, Charoula; Schwartz, Neil H.
2016-01-01
Two hundred and eighty undergraduates from universities in two countries were asked to read didactic material, and then think and write about potential solutions to an ill-defined problem. The writing was conducted within a synchronous or asynchronous computer-mediated communication (CMC) environment. Asynchronous CMC took the form of email…
Localized radio frequency communication using asynchronous transfer mode protocol
Witzke, Edward L [Edgewood, NM; Robertson, Perry J [Albuquerque, NM; Pierson, Lyndon G [Albuquerque, NM
2007-08-14
A localized wireless communication system for communication between a plurality of circuit boards, and between electronic components on the circuit boards. Transceivers are located on each circuit board and electronic component. The transceivers communicate with one another over spread spectrum radio frequencies. An asynchronous transfer mode protocol controls communication flow with asynchronous transfer mode switches located on the circuit boards.
Huang, Yuecheng; Cheng, Wuyi; Luo, Sida; Luo, Yun; Ma, Chengchen; He, Tailin
2016-01-01
The features of the asynchronous correlation between accident indices and the factors that influence accidents can provide an effective reference for warnings of coal mining accidents. However, what are the features of this correlation? To answer this question, data from the China coal price index and the number of deaths from coal mining accidents were selected as the sample data. The fluctuation modes of the asynchronous correlation between the two data sets were defined according to the asynchronous correlation coefficients, symbolization, and sliding windows. We then built several directed and weighted network models, within which the fluctuation modes and the transformations between modes were represented by nodes and edges. Then, the features of the asynchronous correlation between these two variables could be studied from the perspective of network topology. We found that the correlation between the price index and the accidental deaths was asynchronous and fluctuating. Certain aspects, such as the key fluctuation modes, the subgroup characteristics, the transmission medium, the periodicity and the transmission path length in the network, were analyzed by using complex network theory, analytical methods and spectral analysis methods. These results provide a scientific reference for generating warnings for coal mining accidents based on economic indices.
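The pipeline outlined in the abstract above (lagged correlation in sliding windows, symbolization, then a directed weighted network of mode transitions) can be sketched roughly as follows. Everything here is a hypothetical illustration: the Pearson coefficient, the single fixed lag, the 0.5 threshold, the three-letter alphabet, and the treatment of each single symbol as a mode are assumptions, and the authors' actual definitions may differ.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def lagged_window_correlations(x, y, lag, window):
    """Correlate x[t:t+window] with y[t+lag:t+lag+window] for each start t.

    Assumes len(x) == len(y) and lag >= 0 (y responds after x).
    """
    last_start = len(x) - lag - window
    return [
        pearson(x[t:t + window], y[t + lag:t + lag + window])
        for t in range(last_start + 1)
    ]


def symbolize(r, threshold=0.5):
    """Map a coefficient to 'P' (positive), 'N' (negative) or 'Z' (weak). Threshold is assumed."""
    if r >= threshold:
        return "P"
    if r <= -threshold:
        return "N"
    return "Z"


def transition_edges(symbols):
    """Count consecutive symbol pairs; each pair is a directed edge, its count the weight."""
    return Counter(zip(symbols, symbols[1:]))
```

In this sketch the nodes are single symbols; grouping several consecutive symbols into one mode before counting transitions would be a natural variation.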
Engineered plant biomass feedstock particles
Dooley, James H [Federal Way, WA; Lanning, David N [Federal Way, WA; Broderick, Thomas F [Lake Forest Park, WA
2011-10-18
A novel class of flowable biomass feedstock particles with unusually large surface areas that can be manufactured in remarkably uniform sizes using low-energy comminution techniques. The feedstock particles are roughly parallelepiped in shape and characterized by a length dimension (L) aligned substantially with the grain direction and defining a substantially uniform distance along the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L. The particles exhibit a disrupted grain structure with prominent end and surface checks that greatly enhances their skeletal surface area as compared to their envelope surface area. The L×H dimensions define a pair of substantially parallel side surfaces characterized by substantially intact longitudinally arrayed fibers. The W×H dimensions define a pair of substantially parallel end surfaces characterized by crosscut fibers and end checking between fibers. The L×W dimensions define a pair of substantially parallel top surfaces characterized by some surface checking between longitudinally arrayed fibers. At least 80% of the particles pass through a 1/4 inch screen having a 6.3 mm nominal sieve opening but are retained by a No. 10 screen having a 2 mm nominal sieve opening. The feedstock particles are manufactured from a variety of plant biomass materials including wood, crop residues, plantation grasses, hemp, bagasse, and bamboo.
NASA Astrophysics Data System (ADS)
Yan, Beichuan; Regueiro, Richard A.
2018-02-01
A three-dimensional (3D) DEM code for simulating complex-shaped granular particles is parallelized using the Message Passing Interface (MPI). The concepts of link-block, ghost/border layer, and migration layer are put forward for the design of the parallel algorithm, and a theoretical function of 3D DEM scalability and memory usage is derived. Many performance-critical implementation details are managed optimally to achieve high performance and scalability, such as: minimizing communication overhead, maintaining dynamic load balance, handling particle migrations across block borders, transmitting C++ dynamic objects of particles between MPI processes efficiently, and eliminating redundant contact information between adjacent MPI processes. The code executes on multiple US Department of Defense (DoD) supercomputers and is tested on up to 2048 compute nodes for simulating 10 million three-axis ellipsoidal particles. Performance analyses of the code including speedup, efficiency, scalability, and granularity across five orders of magnitude of simulation scale (number of particles) are provided, and they demonstrate high speedup and excellent scalability. It is also discovered that communication time is a decreasing function of the number of compute nodes in strong scaling measurements. The code's capability of simulating a large number of complex-shaped particles on modern supercomputers will be of value in both laboratory studies on micromechanical properties of granular materials and many realistic engineering applications involving granular materials.
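The speedup and efficiency figures referred to in the abstract above are conventionally computed from wall-clock times in a strong-scaling test. The sketch below uses the textbook definitions; the function names and the timings in the usage loop are hypothetical placeholders, not measurements from the paper.

```python
def speedup(t_serial, t_parallel):
    """Classical speedup: single-node wall time divided by parallel wall time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_nodes):
    """Parallel efficiency: speedup divided by the number of nodes (1.0 is ideal)."""
    return speedup(t_serial, t_parallel) / n_nodes


if __name__ == "__main__":
    # Made-up wall times in seconds for a fixed problem size (strong scaling).
    t1 = 1000.0
    timings = {2: 520.0, 4: 270.0, 8: 150.0}
    for nodes, t in timings.items():
        print(nodes, round(speedup(t1, t), 2), round(efficiency(t1, t, nodes), 2))
```

When a problem is too large to run on one node, the same ratios are often taken relative to the smallest node count that fits, which is a common variation on these definitions.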
NASA Astrophysics Data System (ADS)
Maneva, Yana; Poedts, Stefaan
2017-04-01
The electromagnetic fluctuations in the solar wind represent a zoo of plasma waves with different properties, whose wavelengths range from the largest fluid scales to the smallest dissipation scales. By nature the power spectrum of the magnetic fluctuations is anisotropic with different spectral slopes in parallel and perpendicular directions with respect to the background magnetic field. Furthermore, the magnetic field power spectra steepen as one moves from the inertial to the dissipation range and we observe multiple spectral breaks with different slopes in parallel and perpendicular direction at the ion scales and beyond. The turbulent dissipation of magnetic field fluctuations at the sub-ion scales is believed to go into local ion heating and acceleration, so that the spectral breaks are typically associated with particle energization. The gained energy can be in the form of anisotropic heating, formation of non-thermal features in the particle velocity distribution functions, and redistribution of the differential acceleration between the different ion populations. To study the relation between the evolution of the anisotropic turbulent spectra and the particle heating at the ion and sub-ion scales we perform a series of 2.5D hybrid simulations in a collisionless drifting proton-alpha plasma. We neglect the fast electron dynamics and treat the electrons as an isothermal fluid, whereas the protons and a minor population of alpha particles are evolved in a fully kinetic manner. We start with a given wave spectrum and study the evolution of the magnetic field spectral slopes as a function of the parallel and perpendicular wavenumbers. Simultaneously, we track the particle response and the energy exchange between the parallel and perpendicular scales. We observe anisotropic behavior of the turbulent power spectra with steeper slopes along the dominant energy-containing direction. This means that for parallel and quasi-parallel waves we have a steeper spectral slope in the parallel direction, whereas for highly oblique waves the dissipation occurs predominantly in the perpendicular direction and the spectral slopes are steeper across the background magnetic field. The value of the spectral slopes depends on the angle of propagation, the spectral range, as well as the plasma properties. In general the dissipation is stronger at small scales and the corresponding spectral slopes there are steeper. For parallel and quasi-parallel propagation the prevailing energy cascade remains along the magnetic field, whereas for initially isotropic oblique turbulence the cascade develops mainly in the perpendicular direction.
Support for Online Calibration in the ALICE HLT Framework
NASA Astrophysics Data System (ADS)
Krzewicki, Mikolaj; Rohr, David; Zampolli, Chiara; Wiechula, Jens; Gorbunov, Sergey; Chauvin, Alex; Vorobyev, Ivan; Weber, Steffen; Schweda, Kai; Shahoyan, Ruben; Lindenstruth, Volker;
2017-10-01
The ALICE detector employs subdetectors sensitive to environmental conditions such as pressure and temperature, e.g. the time projection chamber (TPC). A precise reconstruction of particle trajectories requires precise calibration of these detectors. Performing the calibration in real time in the HLT improves the online reconstruction and potentially renders certain offline calibration steps obsolete, speeding up offline physics analysis. For LHC Run 3, starting in 2020 when data reduction will rely on reconstructed data, online calibration becomes a necessity. In order to run the calibration online, the HLT now supports the processing of tasks that typically run offline. These tasks run massively in parallel on all HLT compute nodes and their output is gathered and merged periodically. The calibration results are both stored offline for later use and fed back into the HLT chain via a feedback loop in order to apply calibration information to the online track reconstruction. Online calibration and the feedback loop are subject to certain time constraints in order to provide up-to-date calibration information, and they must not interfere with ALICE data taking. Our approach of running these tasks in asynchronous processes enables us to separate them from normal data taking in a failure-resilient way. We performed a first test of online TPC drift time calibration under real conditions during the heavy-ion run in December 2015. We present an analysis and conclusions of this first test, new improvements and developments based on this, as well as our current scheme to commission this for production use.
NASA Astrophysics Data System (ADS)
Dinesh, K. K.; Jayaraj, S.
2008-10-01
This paper deals with the temperature-driven mass deposition rate of particles, known as thermophoretic wall flux, when a hot flue gas flows by natural convection through a cooled isothermal vertical parallel-plate channel. The study finds application in particle filters used to trap soot particles from post-combustion gases issuing from small furnaces with low technical implications. The governing equations are solved using a finite difference marching technique with channel inlet values as initial values. Channel heights required to regain hydrostatic pressure at the exit are estimated for various entry velocities. The effect of the temperature ratio between wall and gas on thermophoretic wall flux is analysed, and the wall flux is found to increase as the temperature ratio decreases. Results are compared with published works wherever possible and can be used to predict the particle deposition rate as well as the conditions favourable for maximum particle deposition.
PARAVT: Parallel Voronoi tessellation code
NASA Astrophysics Data System (ADS)
González, R. E.
2016-10-01
In this study, we present a new open source code for massive parallel computation of Voronoi tessellations (VT hereafter) in large data sets. The code is focused for astrophysical purposes where VT densities and neighbors are widely used. There are several serial Voronoi tessellation codes, however no open source and parallel implementations are available to handle the large number of particles/galaxies in current N-body simulations and sky surveys. Parallelization is implemented under MPI and VT using Qhull library. Domain decomposition takes into account consistent boundary computation between tasks, and includes periodic conditions. In addition, the code computes neighbors list, Voronoi density, Voronoi cell volume, density gradient for each particle, and densities on a regular grid. Code implementation and user guide are publicly available at https://github.com/regonzar/paravt.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Xujun; Li, Jiyuan; Jiang, Xikai
An efficient parallel Stokes’ solver is developed towards the complete inclusion of hydrodynamic interactions of Brownian particles in any geometry. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green’s function formalism. We present a scalable parallel computational approach, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the General geometry Ewald-like method. Our approach employs a highly-efficient iterative finite element Stokes’ solver for the accurate treatment of long-range hydrodynamic interactions within arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes’ solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem result in an O(N) parallel algorithm. We also illustrate the new algorithm in the context of the dynamics of confined polymer solutions in equilibrium and non-equilibrium conditions. Our method is extended to treat suspended finite size particles of arbitrary shape in any geometry using an Immersed Boundary approach.
Zhao, Xujun; Li, Jiyuan; Jiang, Xikai; ...
2017-06-29
An efficient parallel Stokes’ solver is developed towards the complete inclusion of hydrodynamic interactions of Brownian particles in any geometry. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green’s function formalism. We present a scalable parallel computational approach, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the General geometry Ewald-like method. Our approach employs a highly-efficient iterative finite element Stokes’ solver for the accurate treatment of long-range hydrodynamic interactions within arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes’ solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem result in an O(N) parallel algorithm. We also illustrate the new algorithm in the context of the dynamics of confined polymer solutions in equilibrium and non-equilibrium conditions. Our method is extended to treat suspended finite size particles of arbitrary shape in any geometry using an Immersed Boundary approach.
NASA Astrophysics Data System (ADS)
Zhou, Pengwei; Zhong, Yunbo; Wang, Huai; Long, Qiong; Li, Fu; Sun, Zongqian; Dong, Licheng; Fan, Lijun
2013-10-01
The influence of a strong external magnetic field parallel to the current on the electrocodeposition of nano-silicon particles into an iron matrix has been studied in this paper. Test results show that the magnetic field has a great influence on the distribution of silicon, as well as on the surface morphology and the thickness of the composite coatings. When no magnetic field was applied, a high current density was needed to obtain a high concentration of silicon particles, whereas this could easily be obtained at a low current density with a 2 T parallel magnetic field. However, owing to the unevenness of the current density (J) distribution on the surface of the electrode at 8 T, thicker and rougher composite deposits appear in the edge regions (L or R region), and thinner and smoother ones appear in the middle region (M). Meanwhile, the distribution curve of silicon content looks like a “pan” along the center line of the coatings. A possible mechanism, combined with numerical simulation results, is proposed to explain the experimental results.
Myria: Scalable Analytics as a Service
NASA Astrophysics Data System (ADS)
Howe, B.; Halperin, D.; Whitaker, A.
2014-12-01
At the UW eScience Institute, we're working to empower non-experts, especially in the sciences, to write and use data-parallel algorithms. To this end, we are building Myria, a web-based platform for scalable analytics and data-parallel programming. Myria's internal model of computation is the relational algebra extended with iteration, such that every program is inherently data-parallel, just as every query in a database is inherently data-parallel. But unlike databases, iteration is a first-class concept, allowing us to express machine learning tasks, graph traversal tasks, and more. Programs can be expressed in a number of languages and can be executed on a number of execution environments, but we emphasize a particular language called MyriaL that supports both imperative and declarative styles and a particular execution engine called MyriaX that uses an in-memory column-oriented representation and asynchronous iteration. We deliver Myria over the web as a service, providing an editor, performance analysis tools, and catalog browsing features in a single environment. We find that this web-based "delivery vector" is critical in reaching non-experts: they are insulated from irrelevant technical work associated with installation, configuration, and resource management. The MyriaX backend, one of several execution runtimes we support, is a main-memory, column-oriented, RDBMS-on-the-worker system that supports cyclic data flows as a first-class citizen and has been shown to outperform competitive systems on 100-machine cluster sizes. I will describe the Myria system, give a demo, and present some new results in large-scale oceanographic microbiology.
Ros, Wynand JG; Schrijvers, Guus
2014-01-01
Background In support of professional practice, asynchronous communication between the patient and the provider is implemented separately or in combination with Internet-based self-management interventions. This interaction occurs primarily through electronic messaging or discussion boards. There is little evidence as to whether it is a useful tool for chronically ill patients to support their self-management and increase the effectiveness of interventions. Objective The aim of our study was to review the use and usability of patient-provider asynchronous communication for chronically ill patients and the effects of such communication on health behavior, health outcomes, and patient satisfaction. Methods A literature search was performed using PubMed and Embase. The quality of the articles was appraised according to the National Institute for Health and Clinical Excellence (NICE) criteria. The use and usability of the asynchronous communication was analyzed by examining the frequency of use and the number of users of the interventions with asynchronous communication, as well as of separate electronic messaging. The effectiveness of asynchronous communication was analyzed by examining effects on health behavior, health outcomes, and patient satisfaction. Results Patients’ knowledge concerning their chronic condition increased and they seemed to appreciate being able to communicate asynchronously with their providers. They not only had specific questions but also wanted to communicate about feeling ill. A decrease in visits to the physician was shown in two studies (P=.07, P=.07). Increases in self-management/self-efficacy for patients with back pain, dyspnea, and heart failure were found. Positive health outcomes were shown in 12 studies, where the clinical outcomes for diabetic patients (HbA1c level) and for asthmatic patients (forced expiratory volume [FEV]) improved. Physical symptoms improved in five studies. 
Five studies generated a variety of positive psychosocial outcomes. Conclusions The effect of asynchronous communication is not shown unequivocally in these studies. Patients seem to be interested in using email. Patients are willing to participate and are taking the initiative to discuss health issues with their providers. Additional testing of the effects of asynchronous communication on self-management in chronically ill patients is needed. PMID:24434570
de Jong, Catharina Carolina; Ros, Wynand Jg; Schrijvers, Guus
2014-01-16
In support of professional practice, asynchronous communication between the patient and the provider is implemented separately or in combination with Internet-based self-management interventions. This interaction occurs primarily through electronic messaging or discussion boards. There is little evidence as to whether it is a useful tool for chronically ill patients to support their self-management and increase the effectiveness of interventions. The aim of our study was to review the use and usability of patient-provider asynchronous communication for chronically ill patients and the effects of such communication on health behavior, health outcomes, and patient satisfaction. A literature search was performed using PubMed and Embase. The quality of the articles was appraised according to the National Institute for Health and Clinical Excellence (NICE) criteria. The use and usability of the asynchronous communication was analyzed by examining the frequency of use and the number of users of the interventions with asynchronous communication, as well as of separate electronic messaging. The effectiveness of asynchronous communication was analyzed by examining effects on health behavior, health outcomes, and patient satisfaction. Patients' knowledge concerning their chronic condition increased and they seemed to appreciate being able to communicate asynchronously with their providers. They not only had specific questions but also wanted to communicate about feeling ill. A decrease in visits to the physician was shown in two studies (P=.07, P=.07). Increases in self-management/self-efficacy for patients with back pain, dyspnea, and heart failure were found. Positive health outcomes were shown in 12 studies, where the clinical outcomes for diabetic patients (HbA1c level) and for asthmatic patients (forced expiratory volume [FEV]) improved. Physical symptoms improved in five studies. Five studies generated a variety of positive psychosocial outcomes. 
The effect of asynchronous communication is not shown unequivocally in these studies. Patients seem to be interested in using email. Patients are willing to participate and are taking the initiative to discuss health issues with their providers. Additional testing of the effects of asynchronous communication on self-management in chronically ill patients is needed.
Progress on the Multiphysics Capabilities of the Parallel Electromagnetic ACE3P Simulation Suite
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kononenko, Oleksiy
2015-03-26
ACE3P is a 3D parallel simulation suite that is being developed at SLAC National Accelerator Laboratory. Effectively utilizing supercomputer resources, ACE3P has become a key tool for the coupled electromagnetic, thermal and mechanical research and design of particle accelerators. Based on the existing finite-element infrastructure, a massively parallel eigensolver is developed for modal analysis of mechanical structures. It complements a set of the multiphysics tools in ACE3P and, in particular, can be used for the comprehensive study of microphonics in accelerating cavities ensuring the operational reliability of a particle accelerator.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lusk, Ewing; Butler, Ralph; Pieper, Steven C.
Here, we take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how an extremely simple task model has evolved into a system, asynchronous dynamic load balancing (ADLB), and a scalable implementation capable of supporting sophisticated applications on today’s (and tomorrow’s) largest supercomputers; and we illustrate the use of ADLB with a Green’s function Monte Carlo application, a modern, mature nuclear physics code in production use. Our lesson is that by surrendering a certain amount of generality and thus applicability, a minimal programming model (in terms of its basic concepts and the size of its application programmer interface) can achieve extreme scalability without introducing complexity.
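The self-scheduled task model described in this abstract can be illustrated with a minimal sketch. It is not the ADLB API: the function `run_self_scheduled`, the `handle` callback, and the range-splitting example are hypothetical names chosen for illustration, and Python threads stand in for the distributed processes a real system would use. Workers simply take the next task from a shared pool, and handling a task may put new tasks back.

```python
import queue
import threading


def run_self_scheduled(initial_tasks, handle, num_workers=4):
    """Run tasks from a shared pool; handle(task) returns (result, new_tasks).

    Illustrative only: this is a single-process stand-in for a distributed
    work pool, not the interface of any real library.
    """
    pool = queue.Queue()
    for task in initial_tasks:
        pool.put(task)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            task = pool.get()
            if task is None:          # sentinel: no more work
                pool.task_done()
                return
            result, new_tasks = handle(task)
            with lock:
                results.append(result)
            for new_task in new_tasks:  # enqueue children before marking done
                pool.put(new_task)
            pool.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for th in threads:
        th.start()
    pool.join()                        # waits for spawned tasks as well
    for _ in threads:
        pool.put(None)
    for th in threads:
        th.join()
    return results


def split_or_sum(task):
    """Hypothetical irregular workload: split a range until it is small."""
    lo, hi = task
    if hi - lo <= 2:
        return sum(range(lo, hi)), []
    mid = (lo + hi) // 2
    return 0, [(lo, mid), (mid, hi)]


total = sum(run_self_scheduled([(0, 100)], split_or_sum))
```

Because no worker is assigned work in advance, a slow task delays only the worker that picked it up, which is the load-balancing property the abstract attributes to this model.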
A low delay transmission method of multi-channel video based on FPGA
NASA Astrophysics Data System (ADS)
Fu, Weijian; Wei, Baozhi; Li, Xiaobin; Wang, Quan; Hu, Xiaofei
2018-03-01
To guarantee smooth multi-channel video transmission in video monitoring scenarios, we designed an FPGA-based video format conversion method, together with DMA scheduling for the video data, that reduces the overall video transmission delay. To save time in the conversion process, the parallel capability of the FPGA is used for video format conversion. To improve the direct memory access (DMA) write transmission rate of the PCIe bus, a DMA scheduling method based on an asynchronous command buffer is proposed. The experimental results show that the proposed FPGA-based low-delay transmission method increases the DMA write transmission rate by 34% compared with the existing method, and that the overall video delay is reduced to 23.6 ms.
Optical data communication: fundamentals and future directions
NASA Astrophysics Data System (ADS)
DeCusatis, Casimer M.
1998-12-01
An overview of optical data communications is provided, beginning with a brief history and discussion of the unique requirements that distinguish this subfield from related areas such as telecommunications. Each of the major datacom standards is then discussed, including the physical layer specification, distances and data rates, fiber and connector types, data frame structures, and network considerations. These standards can be categorized by their prevailing applications, either storage [Enterprise System Connection, Fiber Channel Connection, and Fiber Channel], coupling (Fiber Channel), or networking [Fiber Distributed Data Interface, Gigabit Ethernet, and asynchronous transfer mode/synchronous optical network]. We also present some emerging technologies and their applications, including parallel optical interconnects, plastic optical fiber, wavelength multiplexing, and free-space optical links. We conclude with some cost/performance trade-offs and predictions of future bandwidth trends.
A robust low-rate coding scheme for packet video
NASA Technical Reports Server (NTRS)
Chen, Y. C.; Sayood, Khalid; Nelson, D. J.; Arikan, E. (Editor)
1991-01-01
Due to the rapidly evolving field of image processing and networking, video information promises to be an important part of telecommunication systems. Although up to now video transmission has been transported mainly over circuit-switched networks, it is likely that packet-switched networks will dominate the communication world in the near future. Asynchronous transfer mode (ATM) techniques in broadband-ISDN can provide a flexible, independent and high performance environment for video communication. For this paper, the network simulator was used only as a channel in this simulation. Mixture blocking coding with progressive transmission (MBCPT) has been investigated for use over packet networks and has been found to provide high compression rate with good visual performance, robustness to packet loss, tractable integration with network mechanics and simplicity in parallel implementation.
Asynchronous vibration problem of centrifugal compressor
NASA Technical Reports Server (NTRS)
Fujikawa, T.; Ishiguro, N.; Ito, M.
1980-01-01
An unstable asynchronous vibration problem in a high pressure centrifugal compressor and the remedial actions against it are described. Asynchronous vibration of the compressor took place when the discharge pressure (Pd) was increased, after the rotor was already at full speed. The typical spectral data of the shaft vibration indicate that as the pressure Pd increases, pre-unstable vibration appears and becomes larger, and large unstable asynchronous vibration occurs suddenly (Pd = 5.49MPa). A computer program was used which calculated the logarithmic decrement and the damped natural frequency of the rotor bearing systems. The analysis of the log-decrement is concluded to be effective in preventing unstable vibration in both the design stage and remedial actions.
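The two quantities this abstract names, the logarithmic decrement and the damped natural frequency, can be illustrated on a single-degree-of-freedom oscillator m x'' + c x' + k x = 0. This is only a sketch under that simplifying assumption: the function name and the parameter values are hypothetical, and a real rotor-bearing analysis would take the eigenvalues from a much larger model. For an eigenvalue sigma + i*omega_d, the logarithmic decrement is -2*pi*sigma/omega_d, so a negative value indicates growing (unstable) vibration.

```python
import cmath
import math


def log_decrement(m, c, k):
    """Return (logarithmic decrement, damped natural frequency in rad/s).

    Uses the eigenvalue sigma + i*omega_d of m*x'' + c*x' + k*x = 0.
    A positive decrement means decaying vibration, a negative one growth.
    """
    disc = cmath.sqrt(complex(c * c - 4.0 * m * k))
    lam = (-c + disc) / (2.0 * m)
    sigma, omega_d = lam.real, abs(lam.imag)
    if omega_d == 0.0:
        raise ValueError("no oscillatory mode (critically damped or overdamped)")
    return -2.0 * math.pi * sigma / omega_d, omega_d


# Illustrative values only: lightly damped and negatively damped cases.
stable_delta, stable_omega = log_decrement(m=1.0, c=0.2, k=1.0)
unstable_delta, _ = log_decrement(m=1.0, c=-0.2, k=1.0)
```

In this sketch the sign of the decrement separates the stable case from the unstable one, which is how a log-decrement analysis can flag a design before the vibration appears in operation.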
Minimizing Concentration Effects in Water-Based, Laminar-Flow Condensation Particle Counters
Lewis, Gregory S.; Hering, Susanne V.
2013-01-01
Concentration effects in water condensation systems, such as used in the water-based condensation particle counter, are explored through numeric modeling and direct measurements. Modeling shows that the condensation heat release and vapor depletion associated with particle activation and growth lowers the peak supersaturation. At higher number concentrations, the diameter of the droplets formed is smaller, and the threshold particle size for activation is higher. This occurs in both cylindrical and parallel plate geometries. For water-based systems we find that condensational heat release is more important than is vapor depletion. We also find that concentration effects can be minimized through use of smaller tube diameters, or more closely spaced parallel plates. Experimental measurements of droplet diameter confirm modeling results. PMID:24436507
A parallel direct-forcing fictitious domain method for simulating microswimmers
NASA Astrophysics Data System (ADS)
Gao, Tong; Lin, Zhaowu
2017-11-01
We present a 3D parallel direct-forcing fictitious domain method for simulating swimming micro-organisms at small Reynolds numbers. We treat the motile micro-swimmers as spherical rigid particles using the "Squirmer" model. The particle dynamics are solved on moving Lagrangian meshes that overlay a fixed Eulerian mesh used for solving the fluid motion, and the momentum exchange between the two phases is resolved by distributing pseudo body-forces over the particle interior regions, which constrain the background fictitious fluids to follow the particle movement. While the solid and fluid subproblems are solved separately, no inner iterations are required to enforce numerical convergence. We demonstrate the accuracy and robustness of the method by comparing our results with the existing analytical and numerical studies for various cases of single particle dynamics and particle-particle interactions. We also perform a series of numerical explorations to obtain statistical and rheological measurements to characterize the dynamics and structures of Squirmer suspensions. NSF DMS 1619960.
Optimization of parameters of special asynchronous electric drives
NASA Astrophysics Data System (ADS)
Karandey, V. Yu; Popov, B. K.; Popova, O. B.; Afanasyev, V. L.
2018-03-01
The article considers the problem of optimizing the parameters of special asynchronous electric drives. Solving this problem makes it possible to design and build special asynchronous electric drives for various industries. The resulting drives will have optimum mass, dimensional and power parameters. This allows the specified control characteristics of technological processes to be achieved with an optimum level of electric energy consumption, process completion time or other set parameters. The solution obtained not only solves a particular optimization problem, but also makes it possible to construct dependences between the optimized parameters of special asynchronous electric drives, for example, on changes in power, current in the stator or rotor winding, induction in the gap or in the steel of the magnetic conductors, and other parameters. From the constructed dependences, it is possible to choose the necessary optimum values of the parameters of special asynchronous electric drives and their components without carrying out repeated calculations.
Asynchronous Communication of TLNS3DMB Boundary Exchange
NASA Technical Reports Server (NTRS)
Hammond, Dana P.
1997-01-01
This paper describes the recognition of implicit serialization due to coarse-grain, synchronous communication and demonstrates the conversion to asynchronous communication for the exchange of boundary condition information in the Thin-Layer Navier Stokes 3-Dimensional Multi Block (TLNS3DMB) code. The implementation details of using asynchronous communication are provided, including buffer allocation, message identification, and barrier control. The IBM SP2 was used for the tests presented.
Barrera-Valencia, Camilo; Benito-Devia, Alexis Vladimir; Vélez-Álvarez, Consuelo; Figueroa-Barrera, Mario; Franco-Idárraga, Sandra Milena
Telepsychiatry is defined as the use of information and communication technology (ICT) in providing remote psychiatric services. Telepsychiatry is applied using two types of communication: synchronous (real time) and asynchronous (store and forward). To determine the cost-effectiveness of a synchronous and an asynchronous telepsychiatric model in prison inmate patients with symptoms of depression. A cost-effectiveness study was performed on a population consisting of 157 patients from the Establecimiento Penitenciario y Carcelario de Mediana Seguridad de Manizales, Colombia. The sample was determined by applying Zung self-administered surveys for depression (1965) and the Hamilton Depression Rating Scale (HDRS), the latter being the tool used for the comparison. Initial Hamilton score, arrival time, duration of system downtime, and clinical effectiveness variables had normal distributions (P>.05). There were significant differences (P<.001) between care costs for the different models, showing that the mean cost of the asynchronous model is less than that of the synchronous model, and making the asynchronous model more cost-effective. The asynchronous model is the most cost-effective model of telepsychiatry care for patients with depression admitted to a detention centre, according to the results of clinical effectiveness, cost measurement, and patient satisfaction. Copyright © 2016 Asociación Colombiana de Psiquiatría. Publicado por Elsevier España. All rights reserved.
Huang, Yuecheng; Cheng, Wuyi; Luo, Sida; Luo, Yun; Ma, Chengchen; He, Tailin
2016-01-01
The features of the asynchronous correlation between accident indices and the factors that influence accidents can provide an effective reference for warnings of coal mining accidents. However, what are the features of this correlation? To answer this question, data from the China coal price index and the number of deaths from coal mining accidents were selected as the sample data. The fluctuation modes of the asynchronous correlation between the two data sets were defined according to the asynchronous correlation coefficients, symbolization, and sliding windows. We then built several directed and weighted network models, within which the fluctuation modes and the transformations between modes were represented by nodes and edges. Then, the features of the asynchronous correlation between these two variables could be studied from a perspective of network topology. We found that the correlation between the price index and the accidental deaths was asynchronous and fluctuating. Certain aspects, such as the key fluctuation modes, the subgroups characteristics, the transmission medium, the periodicity and transmission path length in the network, were analyzed by using complex network theory, analytical methods and spectral analysis method. These results provide a scientific reference for generating warnings for coal mining accidents based on economic indices. PMID:27902748
Wray, Alisa; Bennett, Kathryn; Boysen-Osborn, Megan; Wiechmann, Warren; Toohey, Shannon
2017-01-01
The aim of this study was to measure the effect of an iPad-based asynchronous curriculum on emergency medicine resident performance on the in-training exam (ITE). We hypothesized that the implementation of an asynchronous curriculum (replacing 1 hour of weekly didactic time) would result in non-inferior ITE scores compared to the historical scores of residents who had participated in the traditional 5-hour weekly didactic curriculum. The study was a retrospective, non-inferiority study conducted at the University of California, Irvine Emergency Medicine Residency Program. We compared ITE scores from 2012 and 2013, when there were 5 weekly hours of didactic content, with scores from 2014 and 2015, when 1 hour of conference was replaced with asynchronous content. Examination results were compared using a non-inferiority data analysis with a 10% margin of difference. Using a non-inferiority test with a 95% confidence interval, there was no difference between the 2 groups (before and after implementation of asynchronous learning), as the confidence interval for the change of the ITE was -3.5 to 2.3 points, whereas the 10% non-inferiority margin was 7.8 points. Replacing 1 hour of didactic conference with asynchronous learning showed no negative impact on resident ITE scores.
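The non-inferiority reasoning in this abstract reduces to one comparison, sketched below with the reported numbers. The function name is hypothetical, and the sign convention is an assumption: the change is taken as new score minus historical score, with higher scores being better, so the new curriculum is non-inferior when the lower confidence bound stays above the negative of the margin.

```python
def is_non_inferior(ci_lower, margin):
    """Return True when the lower CI bound of (new - old) stays above -margin.

    Assumes higher scores are better and margin is given as a positive number.
    """
    return ci_lower > -margin


# Values reported in the abstract: 95% CI of the ITE change and the 10% margin.
ci_lower, ci_upper = -3.5, 2.3
margin = 7.8
verdict = is_non_inferior(ci_lower, margin)
```

With these values the worst plausible loss (3.5 points) is well inside the 7.8-point margin, which matches the abstract's conclusion.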
NASA Astrophysics Data System (ADS)
Sasikala, R.; Govindarajan, A.; Gayathri, R.
2018-04-01
This paper focuses on the flow of dust particles between two parallel plates through a porous medium in the presence of a magnetic field, with constant suction at the upper plate and constant injection at the lower plate. The partial differential equations governing the flow are solved by similarity transformation. The velocities of the fluid and the dust particles decrease as the Hartmann number increases.
Transport of cosmic-ray protons in intermittent heliospheric turbulence: Model and simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alouani-Bibi, Fathallah; Le Roux, Jakobus A., E-mail: fb0006@uah.edu
The transport of charged energetic particles in the presence of strong intermittent heliospheric turbulence is computationally analyzed based on known properties of the interplanetary magnetic field and solar wind plasma at 1 astronomical unit. The turbulence is assumed to be static, composite, and quasi-three-dimensional with a varying energy distribution between a one-dimensional Alfvénic (slab) and a structured two-dimensional component. The spatial fluctuations of the turbulent magnetic field are modeled either as homogeneous with a Gaussian probability distribution function (PDF), or as intermittent on large and small scales with a q-Gaussian PDF. Simulations showed that energetic particle diffusion coefficients both parallel and perpendicular to the background magnetic field are significantly affected by intermittency in the turbulence. This effect is especially strong for parallel transport where for large-scale intermittency results show an extended phase of subdiffusive parallel transport during which cross-field transport diffusion dominates. The effects of intermittency are found to depend on particle rigidity and the fraction of slab energy in the turbulence, yielding a perpendicular to parallel mean free path ratio close to 1 for large-scale intermittency. Investigation of higher order transport moments (kurtosis) indicates that non-Gaussian statistical properties of the intermittent turbulent magnetic field are present in the parallel transport, especially for low rigidity particles at all times.
Asynchronous networks: modularization of dynamics theorem
NASA Astrophysics Data System (ADS)
Bick, Christian; Field, Michael
2017-02-01
Building on the first part of this paper, we develop the theory of functional asynchronous networks. We show that a large class of functional asynchronous networks can be (uniquely) represented as feedforward networks connecting events or dynamical modules. For these networks we can give a complete description of the network function in terms of the function of the events comprising the network: the modularization of dynamics theorem. We give examples to illustrate the main results.
2014-08-01
consensus algorithm called randomized gossip is more suitable [7, 8]. In asynchronous randomized gossip algorithms, pairs of neighboring nodes exchange...messages and perform updates in an asynchronous and unattended manner, and they also 1 The class of broadcast gossip algorithms [9, 10, 11, 12] are...dynamics [2] and asynchronous pairwise randomized gossip [7, 8], broadcast gossip algorithms do not require that nodes know the identities of their
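The asynchronous pairwise randomized gossip mentioned in this excerpt can be sketched in a few lines. This is a minimal illustration, not the algorithm analysed in the report: the function name, the four-node ring, and the initial values are assumptions made for the example. At each tick one randomly chosen pair of neighboring nodes wakes up and both nodes replace their values with the pair's average, so the network sum is preserved and all values drift toward the global mean.

```python
import random


def randomized_gossip(values, edges, steps, seed=0):
    """Pairwise randomized gossip averaging on an undirected graph.

    values: initial node values; edges: list of (i, j) neighbor pairs.
    Each step picks one edge uniformly at random and averages its endpoints.
    """
    rng = random.Random(seed)
    x = list(values)
    for _ in range(steps):
        i, j = rng.choice(edges)
        avg = (x[i] + x[j]) / 2.0
        x[i] = x[j] = avg
    return x


# Hypothetical example: a ring of four nodes whose true average is 3.0.
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
result = randomized_gossip([1.0, 2.0, 3.0, 6.0], ring_edges, steps=200)
```

No node needs a global clock or knowledge of the whole network; each update involves only the two endpoints of one link.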
Simulating fail-stop in asynchronous distributed systems
NASA Technical Reports Server (NTRS)
Sabel, Laura; Marzullo, Keith
1994-01-01
The fail-stop failure model appears frequently in the distributed systems literature. However, in an asynchronous distributed system, the fail-stop model cannot be implemented. In particular, it is impossible to reliably detect crash failures in an asynchronous system. In this paper, we show that it is possible to specify and implement a failure model that is indistinguishable from the fail-stop model from the point of view of any process within an asynchronous system. We give necessary conditions for a failure model to be indistinguishable from the fail-stop model, and derive lower bounds on the amount of process replication needed to implement such a failure model. We present a simple one-round protocol for implementing one such failure model, which we call simulated fail-stop.
Brenner, Falko S.; Ortner, Tuulia M.; Fay, Doris
2016-01-01
The present study aimed to integrate findings from technology acceptance research with research on applicant reactions to new technology for the emerging selection procedure of asynchronous video interviewing. One hundred six volunteers experienced asynchronous video interviewing and filled out several questionnaires including one on the applicants’ personalities. In line with previous technology acceptance research, the data revealed that perceived usefulness and perceived ease of use predicted attitudes toward asynchronous video interviewing. Furthermore, openness was found to moderate the relation between perceived usefulness and attitudes toward this particular selection technology. No significant effects emerged for computer self-efficacy, job interview self-efficacy, extraversion, neuroticism, and conscientiousness. Theoretical and practical implications are discussed. PMID:27378969
Development Of A Parallel Performance Model For The THOR Neutral Particle Transport Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yessayan, Raffi; Azmy, Yousry; Schunert, Sebastian
The THOR neutral particle transport code enables simulation of complex geometries for various problems from reactor simulations to nuclear non-proliferation. It is undergoing a thorough V&V requiring computational efficiency. This has motivated various improvements including angular parallelization, outer iteration acceleration, and development of peripheral tools. For guiding future improvements to the code’s efficiency, better characterization of its parallel performance is useful. A parallel performance model (PPM) can be used to evaluate the benefits of modifications and to identify performance bottlenecks. Using INL’s Falcon HPC, the PPM development incorporates an evaluation of network communication behavior over heterogeneous links and a functional characterization of the per-cell/angle/group runtime of each major code component. After evaluating several possible sources of variability, this resulted in a communication model and a parallel portion model. The former’s accuracy is bounded by the variability of communication on Falcon while the latter has an error on the order of 1%.
Yokohama, Noriya; Tsuchimoto, Tadashi; Oishi, Masamichi; Itou, Katsuya
2007-01-20
It has been noted that the downtime of medical informatics systems is often long. Many systems encounter downtimes of hours or even days, which can have a critical effect on daily operations. Such systems remain especially weak in the areas of database and medical imaging data. The scheme design shows the three-layer architecture of the system: application, database, and storage layers. The application layer uses the DICOM protocol (Digital Imaging and Communications in Medicine) and HTTP (Hypertext Transfer Protocol) with AJAX (Asynchronous JavaScript and XML). The database is designed to be decentralized in parallel using cluster technology. Consequently, restoration of the database can be done not only with ease but also with improved retrieval speed. In the storage layer, a network RAID (Redundant Array of Independent Disks) system makes it possible to construct exabyte-scale parallel file systems that exploit storage spread across the network. Development and evaluation of the test-bed has been successful in medical information data backup and recovery in a network environment. This paper presents a schematic design of a new medical informatics system built for recovery, together with a dynamic Web application for medical imaging distribution using AJAX.
Modified Petri net model sensitivity to workload manipulations
NASA Technical Reports Server (NTRS)
White, S. A.; Mackinnon, D. P.; Lyman, J.
1986-01-01
Modified Petri Nets (MPNs) are investigated as a workload modeling tool. The results of an exploratory study of the sensitivity of MPNs to work load manipulations in a dual task are described. Petri nets have been used to represent systems with asynchronous, concurrent and parallel activities (Peterson, 1981). These characteristics led some researchers to suggest the use of Petri nets in workload modeling where concurrent and parallel activities are common. Petri nets are represented by places and transitions. In the workload application, places represent operator activities and transitions represent events. MPNs have been used to formally represent task events and activities of a human operator in a man-machine system. Some descriptive applications demonstrate the usefulness of MPNs in the formal representation of systems. It is the general hypothesis herein that in addition to descriptive applications, MPNs may be useful for workload estimation and prediction. The results are reported of the first of a series of experiments designed to develop and test a MPN system of workload estimation and prediction. This first experiment is a screening test of MPN model general sensitivity to changes in workload. Positive results from this experiment will justify the more complicated analyses and techniques necessary for developing a workload prediction system.
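The place/transition vocabulary used in this abstract can be made concrete with a minimal sketch of an ordinary Petri net firing rule. The place and event names ("monitoring", "responding", "alarm") are hypothetical illustrations of operator activities and events, and the sketch does not include the modifications that define an MPN.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in transition["inputs"].items())

def fire(marking, transition):
    """Return the new marking after consuming input tokens and producing output tokens."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new = dict(marking)
    for p, n in transition["inputs"].items():
        new[p] = new.get(p, 0) - n
    for p, n in transition["outputs"].items():
        new[p] = new.get(p, 0) + n
    return new

# Hypothetical dual-task fragment: places are operator activities, transitions are events.
alarm = {"inputs": {"monitoring": 1}, "outputs": {"responding": 1}}
before = {"monitoring": 1, "tracking": 1}
after = fire(before, alarm)
```

Because "tracking" is untouched by the "alarm" event, the two activities proceed concurrently, which is the property that motivates Petri nets for workload modeling.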
Line-of-sight deposition method
Patten, J.W.; McClanahan, E.D.; Bayne, M.A.
1980-04-16
A line-of-sight method of depositing a film having substantially 100% of theoretical density on a substrate. A pressure vessel contains a target source having a surface thereof capable of emitting particles therefrom and a substrate with the source surface and the substrate surface positioned such that the source surface is substantially parallel to the direction of the particles impinging upon the substrate surface, the distance between the most remote portion of the substrate surface receiving the particles and the source surface emitting the particles in a direction parallel to the substrate surface being relatively small. The pressure in the vessel is maintained less than about 5 microns to prevent scattering and permit line-of-sight deposition. By this method the angles of incidence of the particles impinging upon the substrate surface are in the range of from about 45° to 90° even when the target surface area is greatly expanded to increase the deposition rate.
Kanarska, Yuliya; Walton, Otis
2015-11-30
Fluid-granular flows are common phenomena in nature and industry. Here, an efficient computational technique based on the distributed Lagrange multiplier method is utilized to simulate complex fluid-granular flows. Each particle is explicitly resolved on an Eulerian grid as a separate domain, using solid volume fractions. The fluid equations are solved through the entire computational domain; however, Lagrange multiplier constraints are applied inside the particle domain such that the fluid within any volume associated with a solid particle moves as an incompressible rigid body. The particle–particle interactions are implemented using explicit force-displacement interactions for frictional inelastic particles similar to the DEM method with some modifications using the volume of an overlapping region as an input to the contact forces. Here, a parallel implementation of the method is based on the SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) library.
A 2D electrostatic PIC code for the Mark III Hypercube
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferraro, R.D.; Liewer, P.C.; Decyk, V.K.
We have implemented a 2D electrostatic plasma particle in cell (PIC) simulation code on the Caltech/JPL Mark IIIfp Hypercube. The code simulates plasma effects by evolving in time the trajectories of thousands to millions of charged particles subject to their self-consistent fields. Each particle's position and velocity is advanced in time using a leap frog method for integrating Newton's equations of motion in electric and magnetic fields. The electric field due to these moving charged particles is calculated on a spatial grid at each time by solving Poisson's equation in Fourier space. These two tasks represent the largest part of the computation. To obtain efficient operation on a distributed memory parallel computer, we are using the General Concurrent PIC (GCPIC) algorithm previously developed for a 1D parallel PIC code.
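The leap frog particle push mentioned in this abstract can be sketched as follows. This is a minimal one-dimensional illustration with a prescribed acceleration function and no magnetic field; the function names and the harmonic test field are assumptions made for the example and are not the Mark IIIfp code.

```python
import math

def leapfrog_push(x, v_half, accel, dt, steps):
    """Leap frog: velocities live at half steps, positions at whole steps."""
    for _ in range(steps):
        v_half = v_half + accel(x) * dt   # kick: v(t - dt/2) -> v(t + dt/2)
        x = x + v_half * dt               # drift: x(t) -> x(t + dt)
    return x, v_half

def harmonic(x):
    """Hypothetical test field: acceleration -x, i.e. an oscillator with unit frequency."""
    return -x

dt = 2.0 * math.pi / 1000.0
x0, v0 = 1.0, 0.0
v_start = v0 - 0.5 * dt * harmonic(x0)    # shift the initial velocity back by half a step
x_end, v_end = leapfrog_push(x0, v_start, harmonic, dt, steps=1000)
```

After one full oscillation period the particle returns very close to its starting position, which reflects the good long-time behaviour that makes the scheme popular in PIC codes.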
Line-of-sight deposition method
Patten, James W.; McClanahan, Edwin D.; Bayne, Michael A.
1981-01-01
A line-of-sight method of depositing a film having substantially 100% of theoretical density on a substrate. A pressure vessel contains a target source having a surface thereof capable of emitting particles therefrom and a substrate with the source surface and the substrate surface positioned such that the source surface is substantially parallel to the direction of the particles impinging upon the substrate surface, the distance between the most remote portion of the substrate surface receiving the particles and the source surface emitting the particles in a direction parallel to the substrate surface being relatively small. The pressure in the vessel is maintained less than about 5 microns to prevent scattering and permit line-of-sight deposition. By this method the angles of incidence of the particles impinging upon the substrate surface are in the range of from about 45° to 90° even when the target surface area is greatly expanded to increase the deposition rate.
Particle Based Simulations of Complex Systems with MP2C : Hydrodynamics and Electrostatics
NASA Astrophysics Data System (ADS)
Sutmann, Godehard; Westphal, Lidia; Bolten, Matthias
2010-09-01
Particle based simulation methods are well established paths to explore system behavior on microscopic to mesoscopic time and length scales. With the development of new computer architectures it becomes more and more important to concentrate on local algorithms which do not need global data transfer or reorganisation of large arrays of data across processors. This requirement strongly addresses long-range interactions in particle systems, i.e. mainly hydrodynamic and electrostatic contributions. In this article, emphasis is given to the implementation and parallelization of the Multi-Particle Collision Dynamics method for hydrodynamic contributions and a splitting scheme based on Multigrid for electrostatic contributions. Implementations are done for massively parallel architectures and are demonstrated for the IBM Blue Gene/P architecture Jugene in Jülich.
Precision wood particle feedstocks with retained moisture contents of greater than 30% dry basis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dooley, James H; Lanning, David N
Wood particles having fibers aligned in a grain, wherein: the wood particles are characterized by a length dimension (L) aligned substantially parallel to the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L; the L × H dimensions define two side surfaces characterized by substantially intact longitudinally arrayed fibers; the W × H dimensions define two cross-grain end surfaces characterized individually as aligned either normal to the grain or oblique to the grain; the L × W dimensions define two substantially parallel top and bottom surfaces; and, a majority of the W × H surfaces in the mixture of wood particles have end checking.
Limits on the Efficiency of Event-Based Algorithms for Monte Carlo Neutron Transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romano, Paul K.; Siegel, Andrew R.
The traditional form of parallelism in Monte Carlo particle transport simulations, wherein each individual particle history is considered a unit of work, does not lend itself well to data-level parallelism. Event-based algorithms, which were originally used for simulations on vector processors, may offer a path toward better utilizing data-level parallelism in modern computer architectures. In this study, a simple model is developed for estimating the efficiency of the event-based particle transport algorithm under two sets of assumptions. Data collected from simulations of four reactor problems using OpenMC was then used in conjunction with the models to calculate the speedup due to vectorization as a function of the size of the particle bank and the vector width. When each event type is assumed to have constant execution time, the achievable speedup is directly related to the particle bank size. We observed that the bank size generally needs to be at least 20 times greater than vector size to achieve vector efficiency greater than 90%. Lastly, when the execution times for events are allowed to vary, the vector speedup is also limited by differences in execution time for events being carried out in a single event-iteration.
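The link between bank size and vector efficiency can be illustrated with a toy calculation. This is not the model developed in the study: it assumes constant event time, a single event type, and a hypothetical fixed fraction of particles surviving each event-iteration, and it simply counts how many vector lanes do useful work.

```python
import math

def vector_efficiency(active, width):
    """Fraction of lanes doing useful work when `active` particles fill vectors of `width`."""
    lanes = math.ceil(active / width) * width
    return active / lanes

def bank_efficiency(bank_size, width, survival=0.9):
    """Lane utilisation over a whole bank as it drains (survival fraction is an assumption)."""
    active, useful, lanes = bank_size, 0, 0
    while active > 0:
        useful += active
        lanes += math.ceil(active / width) * width   # the last vector may be partly empty
        active = int(active * survival)              # some histories terminate each iteration
    return useful / lanes

small_bank = bank_efficiency(8, 8)        # bank equal to the vector width
large_bank = bank_efficiency(1600, 8)     # bank 200 times the vector width
```

In this toy setting the partly filled vectors near the end of the run dominate when the bank is small and become negligible when the bank is much larger than the vector width, which is the qualitative trend the abstract reports.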
Limits on the Efficiency of Event-Based Algorithms for Monte Carlo Neutron Transport
Romano, Paul K.; Siegel, Andrew R.
2017-07-01
The traditional form of parallelism in Monte Carlo particle transport simulations, wherein each individual particle history is considered a unit of work, does not lend itself well to data-level parallelism. Event-based algorithms, which were originally used for simulations on vector processors, may offer a path toward better utilizing data-level parallelism in modern computer architectures. In this study, a simple model is developed for estimating the efficiency of the event-based particle transport algorithm under two sets of assumptions. Data collected from simulations of four reactor problems using OpenMC was then used in conjunction with the models to calculate the speedup due to vectorization as a function of the size of the particle bank and the vector width. When each event type is assumed to have constant execution time, the achievable speedup is directly related to the particle bank size. We observed that the bank size generally needs to be at least 20 times greater than vector size to achieve vector efficiency greater than 90%. Lastly, when the execution times for events are allowed to vary, the vector speedup is also limited by differences in execution time for events being carried out in a single event-iteration.
Simulating coupled dynamics of a rigid-flexible multibody system and compressible fluid
NASA Astrophysics Data System (ADS)
Hu, Wei; Tian, Qiang; Hu, HaiYan
2018-04-01
As a follow-up to the authors' previous studies, a new parallel computation approach is proposed to simulate the coupled dynamics of a rigid-flexible multibody system and compressible fluid. In this approach, the smoothed particle hydrodynamics (SPH) method is used to model the compressible fluid, and the natural coordinate formulation (NCF) and absolute nodal coordinate formulation (ANCF) are used to model the rigid and flexible bodies, respectively. In order to model the compressible fluid properly and efficiently via the SPH method, three measures are taken. The first is to use the Riemann solver to cope with the fluid compressibility, the second is to define virtual particles of SPH to model the dynamic interaction between the fluid and the multibody system, and the third is to impose the boundary conditions of periodical inflow and outflow to reduce the number of SPH particles involved in the computation process. Afterwards, a parallel computation strategy is proposed based on the graphics processing unit (GPU) to detect the neighboring SPH particles and to solve the dynamic equations of SPH particles in order to improve the computation efficiency. Meanwhile, the generalized-alpha algorithm is used to solve the dynamic equations of the multibody system. Finally, four case studies are given to validate the proposed parallel computation approach.
Network evolution induced by asynchronous stimuli through spike-timing-dependent plasticity.
Yuan, Wu-Jie; Zhou, Jian-Fang; Zhou, Changsong
2013-01-01
In sensory neural systems, external asynchronous stimuli play an important role in perceptual learning, associative memory and map development. However, the organization of structure and dynamics of neural networks induced by external asynchronous stimuli is not well understood. Spike-timing-dependent plasticity (STDP) is a typical synaptic plasticity that has been extensively found in the sensory systems and that has received much theoretical attention. This synaptic plasticity is highly sensitive to correlations between pre- and postsynaptic firings. Thus, STDP is expected to play an important role in response to external asynchronous stimuli, which can induce segregative pre- and postsynaptic firings. In this paper, we study the impact of external asynchronous stimuli on the organization of structure and dynamics of neural networks through STDP. We construct a two-dimensional spatial neural network model with local connectivity and sparseness, and use external currents to stimulate alternately on different spatial layers. The adopted external currents imposed alternately on spatial layers can be here regarded as external asynchronous stimuli. Through extensive numerical simulations, we focus on the effects of stimulus number and inter-stimulus timing on synaptic connecting weights and the property of propagation dynamics in the resulting network structure. Interestingly, the resulting feedforward structure induced by stimulus-dependent asynchronous firings and its propagation dynamics both reflect the underlying property of STDP. The results imply a possible important role of STDP in generating feedforward structure and collective propagation activity required for experience-dependent map plasticity in developing in vivo sensory pathways and cortices. The relevance of the results to cue-triggered recall of learned temporal sequences, an important cognitive function, is briefly discussed as well.
Furthermore, this finding suggests a potential application for examining STDP by measuring neural population activity in a cultured neural network.
FAST: A fully asynchronous and status-tracking pattern for geoprocessing services orchestration
NASA Astrophysics Data System (ADS)
Wu, Huayi; You, Lan; Gui, Zhipeng; Gao, Shuang; Li, Zhenqiang; Yu, Jingmin
2014-09-01
Geoprocessing service orchestration (GSO) provides a unified and flexible way to implement cross-application, long-lived, and multi-step geoprocessing service workflows by coordinating geoprocessing services collaboratively. Usually, geoprocessing services and geoprocessing service workflows are data and/or computing intensive. This intensity may make the execution of a workflow time-consuming. Since it initiates an execution request without blocking other interactions on the client side, an asynchronous mechanism is especially appropriate for GSO workflows. Many critical problems remain to be solved in existing asynchronous patterns for GSO including difficulties in improving performance, status tracking, and clarifying the workflow structure. These problems are a challenge when orchestrating performance efficiency, making statuses instantly available, and constructing clearly structured GSO workflows. A Fully Asynchronous and Status-Tracking (FAST) pattern that adopts asynchronous interactions throughout the whole communication tier of a workflow is proposed for GSO. The proposed FAST pattern includes a mechanism that actively pushes the latest status to clients instantly and economically. An independent proxy was designed to isolate the status tracking logic from the geoprocessing business logic, which assists the formation of a clear GSO workflow structure. A workflow was implemented in the FAST pattern to simulate the flooding process in the Poyang Lake region. Experimental results show that the proposed FAST pattern can efficiently tackle data/computing intensive geoprocessing tasks. The performance of all collaborative partners was improved due to the asynchronous mechanism throughout the communication tier. A status-tracking mechanism helps users retrieve the latest running status of a GSO workflow in an efficient and instant way.
The clear structure of the GSO workflow lowers the barriers for geospatial domain experts and model designers to compose asynchronous GSO workflows. Most importantly, it provides better support for locating and diagnosing potential exceptions.
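The idea of non-blocking requests combined with pushed status updates can be sketched with Python's asyncio. This is only a loose illustration of the general pattern, not the FAST implementation: the service names, the in-process queue standing in for the status proxy, and the step counts are all hypothetical.

```python
import asyncio

async def geoprocess(name, steps, status_queue):
    """Stand-in for a long-running service that pushes its status after each step."""
    for k in range(1, steps + 1):
        await asyncio.sleep(0)                       # placeholder for real work
        await status_queue.put((name, f"{k}/{steps}"))
    await status_queue.put((name, "done"))

async def run_workflow():
    status_queue = asyncio.Queue()                   # plays the role of the status proxy
    tasks = [asyncio.create_task(geoprocess(n, 2, status_queue))
             for n in ("buffer", "overlay")]         # hypothetical service names
    received, finished = [], 0
    while finished < len(tasks):                     # the client reacts to pushed statuses
        name, status = await status_queue.get()
        received.append((name, status))
        if status == "done":
            finished += 1
    await asyncio.gather(*tasks)
    return received

log = asyncio.run(run_workflow())
```

The client never polls a service directly; it only consumes whatever status messages arrive, which keeps the tracking logic separate from the processing logic.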
Wolff, Sebastian; Bucher, Christian
2013-01-01
This article presents asynchronous collision integrators and a simple asynchronous method treating nodal restraints. Asynchronous discretizations allow individual time step sizes for each spatial region, improving the efficiency of explicit time stepping for finite element meshes with heterogeneous element sizes. The article first introduces asynchronous variational integration being expressed by drift and kick operators. Linear nodal restraint conditions are solved by a simple projection of the forces that is shown to be equivalent to RATTLE. Unilateral contact is solved by an asynchronous variant of decomposition contact response. Therein, velocities are modified avoiding penetrations. Although decomposition contact response solves a large system of linear equations (which is critical for the numerical efficiency of explicit time stepping schemes) and needs special treatment regarding overconstraint and linear dependency of the contact constraints (for example from double-sided node-to-surface contact or self-contact), the asynchronous strategy handles these situations efficiently and robustly. Only a single constraint involving a very small number of degrees of freedom is considered at once, leading to a very efficient solution. The treatment of friction is exemplified for the Coulomb model. The contact of nodes that are subject to restraints needs special care. Together with the aforementioned projection for restraints, a novel efficient solution scheme can be presented. The collision integrator does not influence the critical time step. Hence, the time step can be chosen independently from the underlying time-stepping scheme. The time step may be fixed or time-adaptive. New demands on global collision detection are discussed, exemplified by position codes and node-to-segment integration. Numerical examples illustrate convergence and efficiency of the new contact algorithm. Copyright © 2013 The Authors.
International Journal for Numerical Methods in Engineering published by John Wiley & Sons, Ltd. PMID:23970806
Bowers, E K; Thompson, C F; Sakaluk, S K
2016-03-01
Sex allocation theory assumes individual plasticity in maternal strategies, but few studies have investigated within-individual changes across environments. In house wrens, differences between nests in the degree of hatching synchrony of eggs represent a behavioural polyphenism in females, and its expression varies with seasonal changes in the environment. Between-nest differences in hatching asynchrony also create different environments for offspring, and sons are more strongly affected than daughters by sibling competition when hatching occurs asynchronously over several days. Here, we examined variation in hatching asynchrony and sex allocation, and its consequences for offspring fitness. The number and condition of fledglings declined seasonally, and the frequency of asynchronous hatching increased. In broods hatched asynchronously, sons, which are over-represented in the earlier-laid eggs, were in better condition than daughters, which are over-represented in the later-laid eggs. Nonetheless, asynchronous broods were more productive later within seasons. The proportion of sons in asynchronous broods increased seasonally, whereas there was a seasonal increase in the production of daughters by mothers hatching their eggs synchronously, which was characterized by within-female changes in offspring sex and not by sex-biased mortality. As adults, sons from asynchronous broods were in better condition and produced more broods of their own than males from synchronous broods, and both males and females from asynchronous broods had higher lifetime reproductive success than those from synchronous broods. In conclusion, hatching patterns are under maternal control, representing distinct strategies for allocating offspring within broods, and are associated with offspring sex ratios and differences in offspring reproductive success. © 2015 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2015 European Society For Evolutionary Biology.
Acceleration of Particles Near Earth's Bow Shock
NASA Astrophysics Data System (ADS)
Sandroos, A.
2012-12-01
Collisionless shock waves, for example, near planetary bodies or driven by coronal mass ejections, are a key source of energetic particles in the heliosphere. When the solar wind hits Earth's bow shock, some of the incident particles get reflected back towards the Sun and are accelerated in the process. Reflected ions are responsible for the creation of a turbulent foreshock in quasi-parallel regions of Earth's bow shock. We present first results of foreshock macroscopic structure and of particle distributions upstream of Earth's bow shock, obtained with a new 2.5-dimensional self-consistent diffusive shock acceleration model. In the model, particles' pitch angle scattering rates are calculated from Alfvén wave power spectra using quasilinear theory. Wave power spectra in turn are modified by particles' energy changes due to the scatterings. The new model has been implemented on the massively parallel simulation platform Corsair. We have used an earlier version of the model to study ion acceleration in a shock-shock interaction event (Hietala, Sandroos, and Vainio, 2012).
Aloise, Fabio; Schettini, Francesca; Aricò, Pietro; Salinari, Serenella; Guger, Christoph; Rinsma, Johanna; Aiello, Marco; Mattia, Donatella; Cincotti, Febo
2011-10-01
Motor disability and/or ageing can prevent individuals from fully enjoying home facilities, thus worsening their quality of life. Advances in the field of accessible user interfaces for domotic appliances can represent a valuable way to improve the independence of these persons. An asynchronous P300-based Brain-Computer Interface (BCI) system was recently validated with the participation of healthy young volunteers for environmental control. In this study, the asynchronous P300-based BCI for the interaction with a virtual home environment was tested with the participation of potential end-users (clients of a Frisian home care organization) with limited autonomy due to ageing and/or motor disabilities. System testing revealed that the minimum number of stimulation sequences needed to achieve correct classification had a higher intra-subject variability in potential end-users with respect to what was previously observed in young controls. Here we show that the asynchronous modality performed significantly better as compared to the synchronous mode in continuously adapting its speed to the users' state. Furthermore, the asynchronous system modality confirmed its reliability in avoiding misclassifications and false positives, as previously shown in young healthy subjects. The asynchronous modality may contribute to filling the usability gap between BCI systems and traditional input devices, representing an important step towards their use in the activities of daily living.
NASA Astrophysics Data System (ADS)
Kang, Soo-Min; Kim, Chang-Hun; Han, Sang-Kook
2016-02-01
In passive optical networks (PONs), orthogonal frequency division multiplexing (OFDM) has been studied actively due to its advantages such as high spectral efficiency (SE), dynamic resource allocation in the time or frequency domain, and dispersion robustness. However, orthogonal frequency division multiple access (OFDMA)-PON requires tight synchronization among multiple access signals. Otherwise, frequency orthogonality cannot be maintained. In addition, its sidelobes cause inter-channel interference (ICI) to adjacent channels. To prevent ICI caused by high sidelobes, a guard band (GB) is usually used, which degrades SE. Thus, OFDMA-PON is not suitable for asynchronous uplink transmission in optical access networks. In this paper, we propose an intensity modulation/direct detection (IM/DD) based universal filtered multi-carrier (UFMC) PON for asynchronous multiple access. UFMC applies subband filtering to subsets of subcarriers. Since this reduces the sidelobes of each subband, it can achieve better performance than OFDM. For the experimental demonstration, different sample delays were applied to the subbands to emulate asynchronous transmission conditions. As a result, the time synchronization robustness of UFMC was verified in an asynchronous multiple access system.
Petascale turbulence simulation using a highly parallel fast multipole method on GPUs
NASA Astrophysics Data System (ADS)
Yokota, Rio; Barba, L. A.; Narumi, Tetsu; Yasuoka, Kenji
2013-03-01
This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of 4096^3 computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as the numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the FMM-based vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 s for the vortex method and 154 s for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex-method calculations to date.
On the consequences of bi-Maxwellian plasma distributions for parallel electric fields
NASA Technical Reports Server (NTRS)
Olsen, Richard C.
1992-01-01
The objective is to use the measurements of the equatorial particle distributions to obtain the parallel electric field structure and the evolution of the plasma distribution function along the field line. Appropriate use of kinetic theory allows us to use the measured (and inferred) particle distributions to obtain the electric field, and hence the variation in plasma density along the magnetic field line. The approach here is to utilize the adiabatic invariants and to assume that the plasma distributions are in equilibrium.
Yokohama, Noriya
2013-07-01
This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
NASA Astrophysics Data System (ADS)
Abramov, G. V.; Gavrilov, A. N.
2018-03-01
The article deals with the numerical solution of a mathematical model of particle motion and interaction in multicomponent plasma, using the electric arc synthesis of carbon nanostructures as an example. The large number of particles and of their interactions requires significant machine resources and computation time. Applying the large particles method makes it possible to reduce the amount of computation and the hardware requirements without affecting the accuracy of the numerical calculations. GPGPU parallel computing with Nvidia CUDA technology allows all general-purpose computation to be organized on the graphics processor of a graphics card. A comparative analysis of different approaches to parallelizing the computations for speed is presented, leading to the choice of an algorithm in which shared memory is used while the accuracy of the solution is maintained. A numerical study has been carried out of the influence of the particle density in the macro particle on the motion parameters and on the total number of particle collisions in the plasma for different modes of synthesis. The rational range of the coherence coefficient of particles in the macro particle is computed.
Martin, Jessica L; Cao, Sheng; Maldonado, Jose O; Zhang, Wei; Mansky, Louis M
2016-09-15
The Gag protein is the main retroviral structural protein, and its expression alone is usually sufficient for production of virus-like particles (VLPs). In this study, we sought to investigate, in parallel comparative analyses, Gag cellular distribution, VLP size, and basic morphological features using Gag expression constructs (Gag or Gag-YFP, where YFP is yellow fluorescent protein) created from all representative retroviral genera: Alpharetrovirus, Betaretrovirus, Deltaretrovirus, Epsilonretrovirus, Gammaretrovirus, Lentivirus, and Spumavirus. We analyzed Gag cellular distribution by confocal microscopy, VLP budding by thin-section transmission electron microscopy (TEM), and general morphological features of the VLPs by cryogenic transmission electron microscopy (cryo-TEM). Punctate Gag was observed near the plasma membrane for all Gag constructs tested except for the representative Beta- and Epsilonretrovirus Gag proteins. This is the first report of Epsilonretrovirus Gag localizing to the nucleus of HeLa cells. While VLPs were not produced by the representative Beta- and Epsilonretrovirus Gag proteins, the other Gag proteins produced VLPs as confirmed by TEM, and morphological differences were observed by cryo-TEM. In particular, we observed Deltaretrovirus-like particles with flat regions of electron density that did not follow viral membrane curvature, Lentivirus-like particles with a narrow range and consistent electron density, suggesting a tightly packed Gag lattice, and Spumavirus-like particles with large envelope protein spikes and no visible electron density associated with a Gag lattice. Taken together, these parallel comparative analyses demonstrate for the first time the distinct morphological features that exist among retrovirus-like particles. Investigation of these differences will provide greater insights into the retroviral assembly pathway.
Comparative analysis among retroviruses has been critically important in enhancing our understanding of retroviral replication and pathogenesis, including that of important human pathogens such as human T-cell leukemia virus type 1 (HTLV-1) and HIV-1. In this study, parallel comparative analyses have been used to study Gag expression and virus-like particle morphology among representative retroviruses in the known retroviral genera. Distinct differences were observed, which enhances current knowledge of the retroviral assembly pathway. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
NASA Astrophysics Data System (ADS)
Buaria, D.; Yeung, P. K.
2017-12-01
A new parallel algorithm utilizing a partitioned global address space (PGAS) programming model to achieve high scalability is reported for particle tracking in direct numerical simulations of turbulent fluid flow. The work is motivated by the desire to obtain Lagrangian information necessary for the study of turbulent dispersion at the largest problem sizes feasible on current and next-generation multi-petaflop supercomputers. A large population of fluid particles is distributed among parallel processes dynamically, based on instantaneous particle positions such that all of the interpolation information needed for each particle is available either locally on its host process or neighboring processes holding adjacent sub-domains of the velocity field. With cubic splines as the preferred interpolation method, the new algorithm is designed to minimize the need for communication, by transferring between adjacent processes only those spline coefficients determined to be necessary for specific particles. This transfer is implemented very efficiently as a one-sided communication, using Co-Array Fortran (CAF) features which facilitate small data movements between different local partitions of a large global array. The cost of monitoring transfer of particle properties between adjacent processes for particles migrating across sub-domain boundaries is found to be small. Detailed benchmarks are obtained on the Cray petascale supercomputer Blue Waters at the University of Illinois, Urbana-Champaign. For operations on the particles in an 8192^3 simulation (0.55 trillion grid points) on 262,144 Cray XE6 cores, the new algorithm is found to be orders of magnitude faster relative to a prior algorithm in which each particle is tracked by the same parallel process at all times. This large speedup reduces the additional cost of tracking of order 300 million particles to just over 50% of the cost of computing the Eulerian velocity field at this scale.
Improving support of PGAS models on major compilers suggests that this algorithm will be of wider applicability on most upcoming supercomputers.
Transmittance tuning by particle chain polarization in electrowetting-driven droplets
Fan, Shih-Kang; Chiu, Cheng-Pu; Huang, Po-Wen
2010-01-01
A tiny droplet containing nano/microparticles commonly handled in digital microfluidic lab-on-a-chip is regarded as a micro-optical component with tunable transmittance at programmable positions for the application of micro-opto-fluidic-systems. Cross-scale electric manipulations of droplets on a millimeter scale as well as suspended particles on a micrometer scale are demonstrated by electrowetting-on-dielectric (EWOD) and particle chain polarization, respectively. By applying electric fields at proper frequency ranges, EWOD and polarization can be selectively achieved in designed and fabricated parallel plate devices. At low frequencies, the applied signal generates EWOD to pump suspension droplets. The evenly dispersed particles reflect and/or absorb the incident light to exhibit a reflective or dark droplet. When sufficiently high frequencies are applied to the nonsegmented parallel electrodes, a uniform electric field is established across the liquid to polarize the dispersed neutral particles. The induced dipole moments attract the particles to each other to form particle chains and increase the transmittance of the suspension, demonstrating a transmissive or bright droplet. In addition, the reflectance of the droplet is measured at various frequencies with different amplitudes. PMID:21267088
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sullivan, M.; Anderson, D.P.
1988-01-01
Marionette is a system for distributed parallel programming in an environment of networked heterogeneous computer systems. It is based on a master/slave model. The master process can invoke worker operations (asynchronous remote procedure calls to single slaves) and context operations (updates to the state of all slaves). The master and slaves also interact through shared data structures that can be modified only by the master. The master and slave processes are programmed in a sequential language. The Marionette runtime system manages slave process creation, propagates shared data structures to slaves as needed, queues and dispatches worker and context operations, and manages recovery from slave processor failures. The Marionette system also includes tools for automated compilation of program binaries for multiple architectures, and for distributing binaries to remote file systems. A UNIX-based implementation of Marionette is described.
Latency Hiding in Dynamic Partitioning and Load Balancing of Grid Computing Applications
NASA Technical Reports Server (NTRS)
Das, Sajal K.; Harvey, Daniel J.; Biswas, Rupak
2001-01-01
The Information Power Grid (IPG) concept developed by NASA aims to provide a metacomputing platform for large-scale distributed computations, by hiding the intricacies of a highly heterogeneous environment while maintaining adequate security. In this paper, we propose a latency-tolerant partitioning scheme that dynamically balances processor workloads on the IPG, and minimizes data movement and runtime communication. By simulating an unsteady adaptive mesh application on a wide area network, we study the performance of our load balancer under the Globus environment. The number of IPG nodes, the number of processors per node, and the interconnect speeds are parameterized to derive conditions under which the IPG would be suitable for parallel distributed processing of such applications. Experimental results demonstrate that effective solutions are achieved when the IPG nodes are connected by a high-speed asynchronous interconnection network.
Space-time modeling using environmental constraints in a mobile robot system
NASA Technical Reports Server (NTRS)
Slack, Marc G.
1990-01-01
Grid-based models of a robot's local environment have been used by many researchers building mobile robot control systems. The attraction of grid-based models is their clear parallel between the internal model and the external world. However, the discrete nature of such representations does not match well with the continuous nature of actions and usually serves to limit the abilities of the robot. This work describes a spatial modeling system that extracts information from a grid-based representation to form a symbolic representation of the robot's local environment. The approach makes a separation between the representation provided by the sensing system and the representation used by the action system. Separation allows asynchronous operation between sensing and action in a mobile robot, as well as the generation of a more continuous representation upon which to base actions.
Interpolation algorithm for asynchronous ADC-data
NASA Astrophysics Data System (ADS)
Bramburger, Stefan; Zinke, Benny; Killat, Dirk
2017-09-01
This paper presents a modified interpolation algorithm for signals with variable data rate from asynchronous ADCs. The Adaptive weights Conjugate gradient Toeplitz matrix (ACT) algorithm is extended to operate with a continuous data stream. Additional preprocessing of data with constant and linear sections, and a weighted overlap of signals that are transformed step by step into the spectral domain, improve the reconstruction of the asynchronous ADC signal. The interpolation method can be used if asynchronous ADC data is fed into synchronous digital signal processing.
Application of intelligent soft start in asynchronous motor
NASA Astrophysics Data System (ADS)
Du, Xue; Ye, Ying; Wang, Yuelong; Peng, Lei; Zhang, Suying
2018-05-01
A three-phase asynchronous motor can be started either at full voltage or with a step-down start. Direct starting produces a large current surge that causes excessive local heating in the power grid, and the large starting torque also stresses the motor equipment and shortens the service life of the motor. To address the large current and torque caused by start-up, an intelligent soft starter is proposed. Applying the intelligent soft start to an asynchronous motor highlights its advantages in motor control.
Kinetic Simulations of Particle Acceleration at Shocks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caprioli, Damiano; Guo, Fan
2015-07-16
Collisionless shocks are mediated by collective electromagnetic interactions and are sources of non-thermal particles and emission. The full particle-in-cell approach and a hybrid approach are sketched, and simulations of collisionless shocks are shown using a multicolor presentation. Results for SN 1006, a case involving ion acceleration and B field amplification where the shock is parallel, are shown. Electron acceleration takes place in planetary bow shocks and galaxy clusters. It is concluded that acceleration at shocks can be efficient: >15%; CRs amplify B field via streaming instability; ion DSA is efficient at parallel, strong shocks; ions are injected via reflection and shock drift acceleration; and electron DSA is efficient at oblique shocks.
The energetic ion signature of an O-type neutral line in the geomagnetic tail
NASA Technical Reports Server (NTRS)
Martin, R. F., Jr.; Johnson, D. F.; Speiser, T. W.
1991-01-01
An energetic ion signature is presented which has the potential for remote sensing of an O-type neutral line embedded in a current sheet. A source plasma with a tailward flowing Kappa distribution yields a strongly non-Kappa distribution after interacting with the neutral line: sharp jumps, or ridges, occur in the velocity space distribution function f(v_perp, v_par) associated with both increases and decreases in f. The jumps occur when orbits are reversed in the x-direction: a reversal causing initially earthward particles (low probability in the source distribution) to be observed results in a decrease in f, while a reversal causing initially tailward particles to be observed produces an increase in f. The reversals, and hence the jumps, occur at approximately constant values of perpendicular velocity in both the positive v_par and negative v_par half planes. The results were obtained using single particle simulations in a fixed magnetic field model.
A Parallel Fast Sweeping Method for the Eikonal Equation
NASA Astrophysics Data System (ADS)
Baker, B.
2017-12-01
Recently, there has been an exciting emergence of probabilistic methods for travel time tomography. Unlike gradient-based optimization strategies, probabilistic tomographic methods are resistant to becoming trapped in a local minimum and provide a much better quantification of parameter resolution than, say, appealing to ray density or performing checkerboard reconstruction tests. The benefits associated with random sampling methods, however, are only realized by successive computation of predicted travel times in, potentially, strongly heterogeneous media. To this end, this abstract is concerned with expediting the solution of the Eikonal equation. While many Eikonal solvers use a fast marching method, the proposed solver will use the iterative fast sweeping method because the eight fixed sweep orderings in each iteration are natural targets for parallelization. To reduce the number of iterations and grid points required, the high-accuracy finite difference stencil of Nobel et al., 2014 is implemented. A directed acyclic graph (DAG) is created with a priori knowledge of the sweep ordering and finite difference stencil. By performing a topological sort of the DAG, sets of independent nodes are identified as candidates for concurrent updating. Additionally, the proposed solver will also address scalability during earthquake relocation, a necessary step in local and regional earthquake tomography and a barrier to extending probabilistic methods from active source to passive source applications, by introducing an asynchronous parallel forward solve phase for all receivers in the network. Synthetic examples using the SEG over-thrust model will be presented.
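The idea of sorting a sweep DAG into sets of independent nodes can be illustrated with a small sketch. This is not the solver described in the abstract: it assumes a 2D grid, a single sweep in the +x, +y direction, and a first-order stencil in which each node depends only on its two upwind neighbors. The function names are hypothetical.

```python
from collections import defaultdict

def sweep_dependencies(nx, ny):
    """Build edges u -> v, meaning node v needs the updated value of u.

    Assumption: one +x, +y sweep with a first-order upwind stencil.
    """
    edges = defaultdict(list)
    indegree = {(i, j): 0 for i in range(nx) for j in range(ny)}
    for i in range(nx):
        for j in range(ny):
            for di, dj in ((1, 0), (0, 1)):
                ni, nj = i + di, j + dj
                if ni < nx and nj < ny:
                    edges[(i, j)].append((ni, nj))
                    indegree[(ni, nj)] += 1
    return edges, indegree

def topological_levels(edges, indegree):
    """Kahn's algorithm, emitting one list of mutually independent nodes per level."""
    indegree = dict(indegree)  # work on a copy
    frontier = sorted(n for n, d in indegree.items() if d == 0)
    levels = []
    while frontier:
        levels.append(frontier)
        next_frontier = []
        for u in frontier:
            for v in edges[u]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    next_frontier.append(v)
        frontier = sorted(next_frontier)
    return levels

edges, indegree = sweep_dependencies(3, 3)
levels = topological_levels(edges, indegree)
for k, nodes in enumerate(levels):
    print(k, nodes)
```

Under these assumptions each level is an anti-diagonal of the grid, so all nodes within one level could be updated concurrently. A higher-order stencil or a different sweep direction would change the edges but not the level-building procedure.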
Asynchronous oscillations of rigid rods drive viscous fluid to swirl
NASA Astrophysics Data System (ADS)
Hayashi, Rintaro; Takagi, Daisuke
2017-12-01
We present a minimal system for generating flow at low Reynolds number by oscillating a pair of rigid rods in silicone oil. Experiments show that oscillating them in phase produces no net flow, but a phase difference alone can generate rich flow fields. Tracer particles follow complex trajectory patterns consisting of small orbital movements every cycle and then drifting or swirling in larger regions after many cycles. Observations are consistent with simulations performed using the method of regularized Stokeslets, which reveal complex three-dimensional flow structures emerging from simple oscillatory actuation. Our findings reveal the basic underlying flow structure around oscillatory protrusions such as hairs and legs as commonly featured on living and nonliving bodies.
Parallelization of MRCI based on hole-particle symmetry.
Suo, Bing; Zhai, Gaohong; Wang, Yubin; Wen, Zhenyi; Hu, Xiangqian; Li, Lemin
2005-01-15
The parallel implementation of a multireference configuration interaction program based on the hole-particle symmetry is described. The platform used to implement the parallelization is an Intel-Architectural cluster consisting of 12 nodes, each of which is equipped with two 2.4-G XEON processors, 3-GB memory, and a 36-GB disk, and which are connected by a Gigabit Ethernet Switch. The dependence of speedup on molecular symmetries and task granularities is discussed. Test calculations show that the scaling with the number of nodes is about 1.9 (for C1 and Cs), 1.65 (for C2v), and 1.55 (for D2h) when the number of nodes is doubled. The largest calculation performed on this cluster involves 5.6 × 10^8 CSFs.
Kalb, Daniel M; Fencl, Frank A; Woods, Travis A; Swanson, August; Maestas, Gian C; Juárez, Jaime J; Edwards, Bruce S; Shreve, Andrew P; Graves, Steven W
2017-09-19
Flow cytometry provides highly sensitive multiparameter analysis of cells and particles but has been largely limited to the use of a single focused sample stream. This limits the analytical rate to ∼50K particles/s and the volumetric rate to ∼250 μL/min. Despite the analytical prowess of flow cytometry, there are applications where these rates are insufficient, such as rare cell analysis in high cellular backgrounds (e.g., circulating tumor cells and fetal cells in maternal blood), detection of cells/particles in large dilute samples (e.g., water quality, urine analysis), or high-throughput screening applications. Here we report a highly parallel acoustic flow cytometer that uses an acoustic standing wave to focus particles into 16 parallel analysis points across a 2.3 mm wide optical flow cell. A line-focused laser and wide-field collection optics are used to excite and collect the fluorescence emission of these parallel streams onto a high-speed camera for analysis. With this instrument format and fluorescent microsphere standards, we obtain analysis rates of 100K/s and flow rates of 10 mL/min, while maintaining optical performance comparable to that of a commercial flow cytometer. The results with our initial prototype instrument demonstrate that the integration of key parallelizable components, including the line-focused laser, particle focusing using multinode acoustic standing waves, and a spatially arrayed detector, can increase analytical and volumetric throughputs by orders of magnitude in a compact, simple, and cost-effective platform. Such instruments will be of great value to applications in need of high-throughput yet sensitive flow cytometry analysis.
Determination of power and moment on shaft of special asynchronous electric drives
NASA Astrophysics Data System (ADS)
Karandey, V. Yu; Popov, B. K.; Popova, O. B.; Afanasyev, V. L.
2018-03-01
The article considers the determination of power and moment on the shaft of special asynchronous electric drives. The use of special asynchronous electric drives in mechanical engineering and other industries is relevant. The considered types of electric drives have improved mass and size characteristics in comparison with single-engine systems. These types of electric drives also have design advantages, and their improved characteristics make it possible to realize the technological process. However, the creation and design of new electric drives demands the adjustment of existing methods, or the development of new methods and approaches, for calculating parameters. Determination of power and moment on the shaft of special asynchronous electric drives is the main objective during the design of electric drives. This task has been solved based on a method of electromechanical transformation of energy.
Distributed asynchronous microprocessor architectures in fault tolerant integrated flight systems
NASA Technical Reports Server (NTRS)
Dunn, W. R.
1983-01-01
The paper discusses the implementation of fault tolerant digital flight control and navigation systems for rotorcraft application. It is shown that in implementing fault tolerance at the systems level using advanced LSI/VLSI technology, aircraft physical layout and flight systems requirements tend to define a system architecture of distributed, asynchronous microprocessors in which fault tolerance can be achieved locally through hardware redundancy and/or globally through application of analytical redundancy. The effects of asynchronism on the execution of dynamic flight software are discussed. It is shown that if the asynchronous microprocessors have knowledge of time, these errors can be significantly reduced through appropriate modifications of the flight software. Finally, the paper extends previous work to show that through the combined use of time referencing and stable flight algorithms, individual microprocessors can be configured to autonomously tolerate intermittent faults.
The Design of Finite State Machine for Asynchronous Replication Protocol
NASA Astrophysics Data System (ADS)
Wang, Yanlong; Li, Zhanhuai; Lin, Wei; Hei, Minglei; Hao, Jianhua
Data replication is a key way to design a disaster tolerance system and to achieve reliability and availability. It is difficult for a replication protocol to deal with a diverse and complex environment. This means that data is less well replicated than it ought to be. To reduce data loss and to optimize replication protocols, we (1) present a finite state machine, (2) run it to manage an asynchronous replication protocol and (3) report a simple evaluation of the asynchronous replication protocol based on our state machine. It is shown that our state machine can keep the asynchronous replication protocol running in the proper state to the largest extent possible under the various events that may occur. It can also help to build replication-based disaster tolerance systems that ensure business continuity.
The Use of Efficient Broadcast Protocols in Asynchronous Distributed Systems. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Schmuck, Frank Bernhard
1988-01-01
Reliable broadcast protocols are important tools in distributed and fault-tolerant programming. They are useful for sharing information and for maintaining replicated data in a distributed system. However, a wide range of such protocols has been proposed. These protocols differ in their fault tolerance and delivery ordering characteristics. There is a tradeoff between the cost of a broadcast protocol and how much ordering it provides. It is, therefore, desirable to employ protocols that support only a low degree of ordering whenever possible. This dissertation presents techniques for deciding how strongly ordered a protocol is necessary to solve a given application problem. It is shown that there are two distinct classes of application problems: problems that can be solved with efficient, asynchronous protocols, and problems that require global ordering. The concept of a linearization function that maps partially ordered sets of events to totally ordered histories is introduced. How to construct an asynchronous implementation that solves a given problem if a linearization function for it can be found is shown. It is proved that in general the question of whether a problem has an asynchronous solution is undecidable. Hence there exists no general algorithm that would automatically construct a suitable linearization function for a given problem. Therefore, an important subclass of problems that have certain commutativity properties are considered. Techniques for constructing asynchronous implementations for this class are presented. These techniques are useful for constructing efficient asynchronous implementations for a broad range of practical problems.
Penas, David R; González, Patricia; Egea, Jose A; Doallo, Ramón; Banga, Julio R
2017-01-21
The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problem but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse and fine-grained parallelism, and (iii) self-tuning strategies. The performance and robustness of saCeSS is illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, baker's yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reduction of computation times with respect to several previous state of the art methods (from days to minutes, in several cases) even when only a small number of processors is used. The new parallel cooperative method presented here allows the solution of medium and large scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models.
High-Performance, Multi-Node File Copies and Checksums for Clustered File Systems
NASA Technical Reports Server (NTRS)
Kolano, Paul Z.; Ciotti, Robert B.
2012-01-01
Modern parallel file systems achieve high performance using a variety of techniques, such as striping files across multiple disks to increase aggregate I/O bandwidth and spreading disks across multiple servers to increase aggregate interconnect bandwidth. To achieve peak performance from such systems, it is typically necessary to utilize multiple concurrent readers/writers from multiple systems to overcome various single-system limitations, such as number of processors and network bandwidth. The standard cp and md5sum tools of GNU coreutils found on every modern Unix/Linux system, however, utilize a single execution thread on a single CPU core of a single system, and hence cannot take full advantage of the increased performance of clustered file systems. Mcp and msum are drop-in replacements for the standard cp and md5sum programs that utilize multiple types of parallelism and other optimizations to achieve maximum copy and checksum performance on clustered file systems. Multi-threading is used to ensure that nodes are kept as busy as possible. Read/write parallelism allows individual operations of a single copy to be overlapped using asynchronous I/O. Multinode cooperation allows different nodes to take part in the same copy/checksum. Split-file processing allows multiple threads to operate concurrently on the same file. Finally, hash trees allow inherently serial checksums to be performed in parallel. Mcp and msum provide significant performance improvements over standard cp and md5sum using multiple types of parallelism and other optimizations. The total speed-ups from all improvements are significant. Mcp improves cp performance over 27x, msum improves md5sum performance almost 19x, and the combination of mcp and msum improves verified copies via cp and md5sum by almost 22x. These improvements come in the form of drop-in replacements for cp and md5sum, so are easily used and are available for download as open source software at http://mutil.sourceforge.net.
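The remark that hash trees let an inherently serial checksum run in parallel can be made concrete with a minimal sketch. It is not the msum implementation: the chunk size, the pairing rule, and the function names are assumptions chosen for illustration. Each chunk is hashed independently (and so can be hashed concurrently), then the digests are combined pairwise up to a single root.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4  # deliberately tiny chunk size, for illustration only

def leaf_hashes(data, chunk=CHUNK, workers=4):
    """Hash each fixed-size chunk independently; chunks do not depend on each other."""
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: hashlib.md5(c).digest(), chunks))

def tree_root(hashes):
    """Combine digests pairwise until one root digest remains."""
    level = list(hashes)
    while len(level) > 1:
        level = [hashlib.md5(b"".join(level[i:i + 2])).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()

def tree_checksum(data, chunk=CHUNK, workers=4):
    return tree_root(leaf_hashes(data, chunk, workers))

print(tree_checksum(b"clustered file systems"))
```

The root depends only on the data and the chunk size, not on how many workers computed the leaves. Note that for inputs longer than one chunk the root generally differs from the plain MD5 of the whole input, so a tree checksum is only comparable with another tree checksum built the same way.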
Adam, Asrul; Shapiai, Mohd Ibrahim; Tumari, Mohd Zaidi Mohd; Mohamad, Mohd Saberi; Mubin, Marizan
2014-01-01
Electroencephalogram (EEG) signal peak detection is widely used in clinical applications. The peak point can be detected using several approaches, including time, frequency, time-frequency, and nonlinear domains depending on various peak features from several models. However, no study has established the importance of each peak feature in contributing to a good and generalized model. In this study, feature selection and classifier parameter estimation based on particle swarm optimization (PSO) are proposed as a framework for peak detection on EEG signals in time domain analysis. Two versions of PSO are used in the study: (1) standard PSO and (2) random asynchronous particle swarm optimization (RA-PSO). The proposed framework tries to find the best combination of all the available features that offers good peak detection and a high classification rate from the results in the conducted experiments. The evaluation results indicate that the accuracy of the peak detection can be improved up to 99.90% and 98.59% for training and testing, respectively, as compared to the framework without feature selection adaptation. Additionally, the proposed framework based on RA-PSO offers a better and more reliable classification rate than standard PSO, as it produces a low-variance model.
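To show what a random asynchronous update can look like, here is a minimal sketch. It is an assumption about the general scheme, not the configuration used in the study: at each step one particle is picked at random, moved, evaluated, and the swarm best is refreshed immediately rather than at the end of a full iteration. The objective (`sphere`), the coefficients, and all names are illustrative.

```python
import random

def sphere(x):
    """Toy objective with minimum 0 at the origin (illustrative only)."""
    return sum(v * v for v in x)

def ra_pso(f, dim=2, n_particles=10, n_updates=500,
           w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    initial_best = gbest_val
    for _ in range(n_updates):
        # Random selection: a particle may be chosen several times in a row,
        # while another may wait a long time for its next update.
        i = rng.randrange(n_particles)
        for d in range(dim):
            r1, r2 = rng.random(), rng.random()
            vel[i][d] = (w * vel[i][d]
                         + c1 * r1 * (pbest[i][d] - pos[i][d])
                         + c2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        val = f(pos[i])
        if val < pbest_val[i]:
            pbest[i], pbest_val[i] = pos[i][:], val
            if val < gbest_val:
                # Asynchronous step: the next particle already sees this best.
                gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val, initial_best

best, best_val, start_val = ra_pso(sphere)
print(best_val, "<=", start_val)
```

In a synchronous variant, the swarm best would instead be refreshed only after every particle has been moved and evaluated once.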
Rohani, Ali; Varhue, Walter; Su, Yi-Hsuan; Swami, Nathan S
2014-07-01
Electrorotation (ROT) is a powerful tool for characterizing the dielectric properties of cells and bioparticles. However, its application has been somewhat limited by the need to mitigate disruptions to particle rotation by translation under positive DEP and by frictional interactions with the substrate. While these disruptions may be overcome by implementing particle positioning schemes or field cages, these methods restrict the frequency bandwidth to the negative DEP range and permit only single particle measurements within a limited spatial extent of the device geometry away from field nonuniformities. Herein, we present an electrical tweezer methodology based on a sequence of electrical signals, composed of negative DEP using 180-degree phase-shifted fields for trapping and levitation of the particles, followed by 90-degree phase-shifted fields over a wide frequency bandwidth for highly parallelized electrorotation measurements. Through field simulations of the rotating electrical field under this wave-sequence, we illustrate the enhanced spatial extent for electrorotation measurements, with no limitations to frequency bandwidth. We apply this methodology to characterize subtle modifications in morphology and electrophysiology of Cryptosporidium parvum with varying degrees of heat treatment, in terms of shifts in the electrorotation spectra over the 0.05-40 MHz region. Given the single particle sensitivity and the ability for highly parallelized electrorotation measurements, we envision its application toward characterizing heterogeneous subpopulations of microbial and stem cells. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A flexible algorithm for calculating pair interactions on SIMD architectures
NASA Astrophysics Data System (ADS)
Páll, Szilárd; Hess, Berk
2013-12-01
Calculating interactions or correlations between pairs of particles is typically the most time-consuming task in particle simulation or correlation analysis. Straightforward implementations using a double loop over particle pairs have traditionally worked well, especially since compilers usually do a good job of unrolling the inner loop. In order to reach high performance on modern CPU and accelerator architectures, single-instruction multiple-data (SIMD) parallelization has become essential. Avoiding memory bottlenecks is also increasingly important and requires reducing the ratio of memory to arithmetic operations. Moreover, when pairs only interact within a certain cut-off distance, good SIMD utilization can only be achieved by reordering input and output data, which quickly becomes a limiting factor. Here we present an algorithm for SIMD parallelization based on grouping a fixed number of particles, e.g. 2, 4, or 8, into spatial clusters. Calculating all interactions between particles in a pair of such clusters improves data reuse compared to the traditional scheme and results in a more efficient SIMD parallelization. Adjusting the cluster size allows the algorithm to map to SIMD units of various widths. This flexibility not only enables fast and efficient implementation on current CPUs and accelerator architectures like GPUs or Intel MIC, but it also makes the algorithm future-proof. We present the algorithm with an application to molecular dynamics simulations, where we can also make use of the effective buffering the method introduces.
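The cluster-pair idea can be sketched in plain Python. This is an illustrative sketch only, not the published algorithm: the cell-based sorting, the bounding-box test, and the function names are assumptions, and the inner loop over all particle pairs of two clusters merely stands in for what a SIMD kernel would evaluate in one pass.

```python
import random

def make_clusters(points, size=4, cell=1.0):
    """Sort particle indices by grid cell, then cut into fixed-size clusters."""
    order = sorted(range(len(points)),
                   key=lambda i: tuple(int(c // cell) for c in points[i]))
    return [order[k:k + size] for k in range(0, len(order), size)]

def bbox(points, cluster):
    lo = [min(points[i][d] for i in cluster) for d in range(3)]
    hi = [max(points[i][d] for i in cluster) for d in range(3)]
    return lo, hi

def bbox_dist2(a, b):
    """Squared distance between two boxes: a lower bound for any particle pair."""
    (alo, ahi), (blo, bhi) = a, b
    d2 = 0.0
    for d in range(3):
        gap = max(alo[d] - bhi[d], blo[d] - ahi[d], 0.0)
        d2 += gap * gap
    return d2

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def count_pairs_clustered(points, cutoff, size=4):
    clusters = make_clusters(points, size)
    boxes = [bbox(points, c) for c in clusters]
    rc2 = cutoff * cutoff
    count = 0
    for a in range(len(clusters)):
        for b in range(a, len(clusters)):
            if bbox_dist2(boxes[a], boxes[b]) > rc2:
                continue  # no pair of these two clusters can be within the cutoff
            # All-vs-all block between two clusters; the cutoff acts as a mask.
            for i in clusters[a]:
                for j in clusters[b]:
                    if a == b and j <= i:
                        continue  # count each pair inside a cluster once
                    if dist2(points[i], points[j]) <= rc2:
                        count += 1
    return count

def count_pairs_brute(points, cutoff):
    rc2 = cutoff * cutoff
    n = len(points)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if dist2(points[i], points[j]) <= rc2)

rng = random.Random(1)
pts = [(rng.uniform(0, 4), rng.uniform(0, 4), rng.uniform(0, 4)) for _ in range(60)]
print(count_pairs_clustered(pts, 1.0), count_pairs_brute(pts, 1.0))
```

The pair list is built per cluster pair rather than per particle pair, so some distances beyond the cutoff are computed and then masked out. That extra arithmetic is the price paid for regular, reusable data access.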
RT DDA: A hybrid method for predicting the scattering properties by densely packed media
NASA Astrophysics Data System (ADS)
Ramezan Pour, B.; Mackowski, D.
2017-12-01
The most accurate approaches to predicting the scattering properties of particulate media are based on exact solutions of Maxwell's equations (MEs), such as the T-matrix and discrete dipole methods. Applying these techniques to optically thick targets is a challenging problem because of the large-scale computations involved, so they are usually replaced by phenomenological radiative transfer (RT) methods. On the other hand, the RT technique is of questionable validity in media with large particle packing densities. In recent works, we used numerically exact ME solvers to examine the effects of particle concentration on the polarized reflection properties of plane parallel random media. The simulations were performed for plane parallel layers of wavelength-sized spherical particles, and results were compared with RT predictions. We have shown that RT equation (RTE) results monotonically converge to the exact solution as the particle volume fraction becomes smaller, and one can observe a nearly perfect fit for packing densities of 2%-5%. This study describes a hybrid technique composed of exact and numerical scalar RT methods. The exact methodology in this work is the plane parallel discrete dipole approximation, whereas the numerical method is based on the adding and doubling method. This approach not only decreases the computational time owing to the RT method but also includes the interference and multiple scattering effects, so it may be applicable to large particle density conditions.
Qiao, Gang; Gan, Shuwei; Liu, Songzuo; Ma, Lu; Sun, Zongxin
2018-05-24
In-band full-duplex (IBFD) communication is one of the most important research directions for improving the throughput of underwater acoustic (UWA) networking. The major drawback of IBFD-UWA communication is self-interference (SI). This paper presents a digital SI cancellation algorithm for an asynchronous IBFD-UWA communication system. We focus on two issues: one is asynchronous communication, which differs from IBFD radio communication; the other is the nonlinear distortion caused by the power amplifier (PA). First, we discuss the asynchronous IBFD-UWA signal model with the nonlinear distortion of the PA. Then, we design a scheme for asynchronous IBFD-UWA communication that uses the non-overlapping region between the SI and the intended signal to estimate the nonlinear SI channel. To cancel the nonlinear distortion caused by the PA, we propose an Over-Parameterization based Recursive Least Squares (RLS) algorithm (OPRLS) to estimate the nonlinear SI channel. Furthermore, we present the OPRLS with a sparse constraint to estimate the SI channel, which reduces the required length of the non-overlapping region. Finally, we verify our concept through simulation and a pool experiment. Results demonstrate that the proposed digital SI cancellation scheme can cancel SI efficiently.
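As background for the RLS-based channel estimation mentioned above, the following is a minimal sketch of the standard (linear) recursive least squares update. It is explicitly not the OPRLS algorithm of the paper: the over-parameterization for power-amplifier nonlinearity and the sparse constraint are omitted, and the function name, forgetting factor, and initialization constant are assumptions chosen for illustration.

```python
import numpy as np

def rls_estimate(x, d, order, lam=0.99, delta=100.0):
    """Estimate FIR taps w such that d[n] ~ sum_k w[k] * x[n - k].

    x: known transmitted samples, d: received samples,
    lam: forgetting factor, delta: initial scale of the inverse correlation matrix.
    """
    w = np.zeros(order)
    P = delta * np.eye(order)
    for n in range(order - 1, len(x)):
        u = x[n - order + 1:n + 1][::-1]   # regressor, most recent sample first
        k = P @ u / (lam + u @ P @ u)      # gain vector
        e = d[n] - w @ u                   # a priori estimation error
        w = w + k * e                      # tap update
        P = (P - np.outer(k, u @ P)) / lam # inverse correlation update
    return w
```

In an SI cancellation setting, `x` would play the role of the node's own transmitted signal and `d` the received signal in the non-overlapping region; the estimated taps could then be used to reconstruct and subtract the self-interference.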
Intelligent neuroprocessors for in-situ launch vehicle propulsion systems health management
NASA Technical Reports Server (NTRS)
Gulati, S.; Tawel, R.; Thakoor, A. P.
1993-01-01
The efficacy of existing on-board propulsion health management systems (HMS) is severely impacted by computational limitations (e.g., low sampling rates); paradigmatic limitations (e.g., low-fidelity logic/parameter redlining only, false alarms due to noisy/corrupted sensor signatures, preprogrammed diagnostics only); and telemetry bandwidth limitations on space/ground interactions. Ultra-compact/light, adaptive neural networks with massively parallel, asynchronous, fast reconfigurable and fault-tolerant information processing properties have already demonstrated significant potential for inflight diagnostic analyses and resource allocation with reduced ground dependence. In particular, they can automatically exploit correlation effects across multiple sensor streams (plume analyzer, flow meters, vibration detectors, etc.) so as to detect anomaly signatures that cannot be determined from a single sensor. Furthermore, neural networks have already demonstrated the potential for impacting real-time fault recovery in vehicle subsystems by adaptively regulating combustion mixture/power subsystems and optimizing resource utilization under degraded conditions. A class of high-performance neuroprocessors, developed at JPL, that have demonstrated potential for next-generation HMS for a family of space transportation vehicles envisioned for the next few decades, including HLLV, NLS, and the space shuttle, is presented. Of fundamental interest are intelligent neuroprocessors for real-time plume analysis, optimizing combustion mixture ratio, and feedback to hydraulic and pneumatic control systems.
This class includes concurrently asynchronous, reprogrammable, nonvolatile, analog neural processors with high-speed, high-bandwidth electronic/optical I/O interfaces, with special emphasis on NASA's unique requirements in terms of performance, reliability, ultra-high density, ultra-compactness, ultra-light weight, radiation hardening, power stringency, and long lifetime.
Scenario Decomposition for 0-1 Stochastic Programs: Improvements and Asynchronous Implementation
Ryan, Kevin; Rajan, Deepak; Ahmed, Shabbir
2016-05-01
A recently proposed scenario decomposition algorithm for stochastic 0-1 programs finds an optimal solution by evaluating and removing individual solutions that are discovered by solving scenario subproblems. In this work, we develop an asynchronous, distributed implementation of the algorithm, which has computational advantages over existing synchronous implementations. Improvements to both the synchronous and asynchronous algorithms are proposed. We also test the results on well-known stochastic 0-1 programs from the SIPLIB test library and are able to solve one previously unsolved instance from the test set.
Wu, Yuanyuan; Cao, Jinde; Li, Qingbo; Alsaedi, Ahmed; Alsaadi, Fuad E
2017-01-01
This paper deals with the finite-time synchronization problem for a class of uncertain coupled switched neural networks under asynchronous switching. By constructing appropriate Lyapunov-like functionals and using the average dwell time technique, some sufficient criteria are derived to guarantee the finite-time synchronization of considered uncertain coupled switched neural networks. Meanwhile, the asynchronous switching feedback controller is designed to finite-time synchronize the concerned networks. Finally, two numerical examples are introduced to show the validity of the main results. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Casse, F.; van Marle, A. J.; Marcowith, A.
2018-01-01
We present simulations of magnetized astrophysical shocks taking into account the interplay between the thermal plasma of the shock and supra-thermal particles. Such interaction is depicted by combining a grid-based magneto-hydrodynamics description of the thermal fluid with particle-in-cell techniques devoted to the dynamics of supra-thermal particles. This approach, which incorporates the use of adaptive mesh refinement features, is potentially a key to simulate astrophysical systems on spatial scales that are beyond the reach of pure particle-in-cell simulations. We consider non-relativistic super-Alfvénic shocks with various magnetic field obliquities. We recover all the features from previous studies when the magnetic field is parallel to the normal to the shock. In contrast with previous particle-in-cell and hybrid simulations, we find that particle acceleration and magnetic field amplification also occur when the magnetic field is oblique to the normal to the shock but on larger timescales than in the parallel case. We show that in our oblique shock simulations the streaming of supra-thermal particles induces a corrugation of the shock front. Such oscillations of both the shock front and the magnetic field then locally help the particles to enter the upstream region and to initiate a non-resonant streaming instability and finally to induce diffuse particle acceleration.
Particle beam and crabbing and deflecting structure
Delayen, Jean [Yorktown, VA
2011-02-08
A new type of structure for the deflection and crabbing of particle bunches in particle accelerators comprises a number of parallel transverse electromagnetic (TEM)-resonant lines operating in opposite phase from each other. Such a structure is significantly more compact than conventional crabbing cavities operating in the transverse magnetic (TM) mode, thus allowing low frequency designs.
Particle-in-Cell laser-plasma simulation on Xeon Phi coprocessors
NASA Astrophysics Data System (ADS)
Surmin, I. A.; Bastrakov, S. I.; Efimenko, E. S.; Gonoskov, A. A.; Korzhimanov, A. V.; Meyerov, I. B.
2016-05-01
This paper concerns the development of a high-performance implementation of the Particle-in-Cell method for plasma simulation on Intel Xeon Phi coprocessors. We discuss the suitability of the method for Xeon Phi architecture and present our experience in the porting and optimization of the existing parallel Particle-in-Cell code PICADOR. Direct porting without code modification gives performance on Xeon Phi close to that of an 8-core CPU on a benchmark problem with 50 particles per cell. We demonstrate step-by-step optimization techniques, such as improving data locality, enhancing parallelization efficiency and vectorization leading to an overall 4.2 × speedup on CPU and 7.5 × on Xeon Phi compared to the baseline version. The optimized version achieves 16.9 ns per particle update on an Intel Xeon E5-2660 CPU and 9.3 ns per particle update on an Intel Xeon Phi 5110P. For a real problem of laser ion acceleration in targets with surface grating, where a large number of macroparticles per cell is required, the speedup of Xeon Phi compared to CPU is 1.6 ×.
Amaya, Ronny; Cancel, Limary M.; Tarbell, John M.
2016-01-01
Hemodynamic forces play an important role in the non-uniform distribution of atherosclerotic lesions. Endothelial cells are exposed simultaneously to fluid wall shear stress (WSS) and solid circumferential stress (CS). Due to variations in impedance (global factors) and geometric complexities (local factors) in the arterial circulation a time lag arises between these two forces that can be characterized by the temporal phase angle between CS and WSS (stress phase angle–SPA). Asynchronous flows (SPA close to -180°) that are most prominent in coronary arteries have been associated with localization of atherosclerosis. Reversing oscillatory flows characterized by an oscillatory shear index (OSI) that is greater than zero are also associated with atherosclerosis localization. In this study we examined the relationship between asynchronous flows and reversing flows in altering the expression of 37 genes relevant to atherosclerosis development. In the case of reversing oscillatory flow, we observed that the asynchronous condition upregulated 8 genes compared to synchronous hemodynamics, most of them proatherogenic. Upregulation of the pro-inflammatory transcription factor NFκB p65 was confirmed by western blot, and nuclear translocation of NFκB p65 was confirmed by immunofluorescence staining. A comparative study between non-reversing flow and reversing flow found that in the case of synchronous hemodynamics, reversing flow altered the expression of 11 genes, while in the case of asynchronous hemodynamics, reversing flow altered the expression of 17 genes. Reversing flow significantly upregulated protein expression of NFκB p65 for both synchronous and asynchronous conditions. Nuclear translocation of NFκB p65 was confirmed for synchronous and asynchronous conditions in the presence of flow reversal.
These data suggest that asynchronous hemodynamics and reversing flow can elicit proatherogenic responses in endothelial cells compared to synchronous hemodynamics without shear stress reversal, indicating that SPA as well as reversal flow (OSI) are important parameters characterizing arterial susceptibility to disease. PMID:27846267
50 GFlops molecular dynamics on the Connection Machine 5
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lomdahl, P.S.; Tamayo, P.; Groenbech-Jensen, N.
1993-12-31
The authors present timings and performance numbers for a new short range three dimensional (3D) molecular dynamics (MD) code, SPaSM, on the Connection Machine-5 (CM-5). They demonstrate that runs with more than 10^8 particles are now possible on massively parallel MIMD computers. To the best of their knowledge this is at least an order of magnitude more particles than what has previously been reported. Typical production runs show sustained performance (including communication) in the range of 47-50 GFlops on a 1024 node CM-5 with vector units (VUs). The speed of the code scales linearly with the number of processors and with the number of particles and shows 95% parallel efficiency in the speedup.
NASA Astrophysics Data System (ADS)
Kotenev, A. V.; Kotenev, V. I.; Kochetkov, V. V.; Elkin, D. A.
2018-01-01
A mathematical model of the load bus was developed to reduce the reactive power control error and to decrease the voltage sags caused in the electric power system by starting asynchronous motors. The model is built from sub-models of the following elements: a transformer, a transmission line, synchronous and asynchronous loads, and a capacitor bank load. It represents the automatic reactive power control system, taking into account the electromagnetic processes of the starting asynchronous motors and the changes in reactive power of the electric power system elements caused by voltage fluctuation. The active power versus time and reactive power versus time characteristics were obtained using the recommended procedure for calculating the equivalent electric circuit parameters. The derived automatic reactive power control system was shown to eliminate the voltage sags in the electric power system caused by starting asynchronous motors.
PsychVACS: a system for asynchronous telepsychiatry.
Odor, Alberto; Yellowlees, Peter; Hilty, Donald; Parish, Michelle Burke; Nafiz, Najia; Iosif, Ana-Maria
2011-05-01
To describe the technical development of an asynchronous telepsychiatry application, the Psychiatric Video Archiving and Communication System. A client-server application was developed in Visual Basic.Net with a Microsoft® SQL database as the backend. It includes the capability of storing video-recorded psychiatric interviews and manages the workflow of the system with automated messaging. Psychiatric Video Archiving and Communication System has been used to conduct the first ever series of asynchronous telepsychiatry consultations worldwide. A review of the software application and the process as part of this project has led to a number of improvements that are being implemented in the next version, which is being written in Java. This is the first description of the use of video recorded data in an asynchronous telemedicine application. Primary care providers and consulting psychiatrists have found it easy to work with and a valuable resource to increase the availability of psychiatric consultation in remote rural locations.
NASA Astrophysics Data System (ADS)
Bashashati, Ali; Mason, Steve; Ward, Rabab K.; Birch, Gary E.
2006-06-01
The low-frequency asynchronous switch design (LF-ASD) has been introduced as a direct brain interface (BI) for asynchronous control applications. Asynchronous interfaces, as opposed to synchronous interfaces, have the advantage of being operational at all times and not only at specific system-defined periods. This paper modifies the LF-ASD design by incorporating into the system more knowledge about the attempted movements. Specifically, the history of feature values extracted from the EEG signal is used to detect a right index finger movement attempt. Using data collected from individuals with high-level spinal cord injuries and able-bodied subjects, it is shown that the error characteristics of the modified design are significantly better than those of the previous LF-ASD design. The true positive rate increased by up to 15 percentage points, which corresponds to a 50% improvement, when the system is operating with false positive rates in the 1-2% range.
NASA Astrophysics Data System (ADS)
Mišković, Zoran L.; Akbari, Kamran; Segui, Silvina; Gervasoni, Juana L.; Arista, Néstor R.
2018-05-01
We present a fully relativistic formulation for the energy loss rate of a charged particle moving parallel to a sheet containing two-dimensional electron gas, allowing that its in-plane polarization may be described by different longitudinal and transverse conductivities. We apply our formulation to the case of a doped graphene layer in the terahertz range of frequencies, where excitation of the Dirac plasmon polariton (DPP) in graphene plays a major role. By using the Drude model with zero damping we evaluate the energy loss rate due to excitation of the DPP, and show that the retardation effects are important when the incident particle speed and its distance from graphene both increase. Interestingly, the retarded energy loss rate obtained in this manner may be both larger and smaller than its non-retarded counterpart for different combinations of the particle speed and distance.
NASA Astrophysics Data System (ADS)
Capecelatro, Jesse
2018-03-01
It has long been suggested that a purely Lagrangian solution to global-scale atmospheric/oceanic flows can potentially outperform traditional Eulerian schemes. Meanwhile, a demonstration of a scalable and practical framework remains elusive. Motivated by recent progress in particle-based methods when applied to convection dominated flows, this work presents a fully Lagrangian method for solving the inviscid shallow water equations on a rotating sphere in a smooth particle hydrodynamics framework. To avoid singularities at the poles, the governing equations are solved in Cartesian coordinates, augmented with a Lagrange multiplier to ensure that fluid particles are constrained to the surface of the sphere. An underlying grid in spherical coordinates is used to facilitate efficient neighbor detection and parallelization. The method is applied to a suite of canonical test cases, and conservation, accuracy, and parallel performance are assessed.
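The idea of advancing particles in Cartesian coordinates while keeping them on the sphere can be sketched as follows. This is a simplified stand-in, not the paper's scheme: it uses a forward Euler step followed by a radial projection of the positions and removal of the normal velocity component, which plays the role that the Lagrange multiplier plays in the abstract. The function name, the time integrator, and the absence of any shallow-water forces are assumptions made for illustration.

```python
import numpy as np

def constrained_step(x, v, dt, radius=1.0):
    """Advance particles in Cartesian coordinates, then restore |x| = radius.

    x, v: arrays of shape (n, 3) holding positions and velocities.
    """
    # Unconstrained update in Cartesian coordinates (no pole singularity).
    x_new = x + dt * v
    # Radial projection back onto the sphere (constraint enforcement).
    x_new = x_new * (radius / np.linalg.norm(x_new, axis=1, keepdims=True))
    # Remove the velocity component normal to the sphere at the new position.
    normal = x_new / radius
    v_new = v - np.sum(v * normal, axis=1, keepdims=True) * normal
    return x_new, v_new
```

Working in Cartesian coordinates means the update has the same form everywhere on the sphere, including at the poles, which is the motivation stated in the abstract.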
Are supernova remnants quasi-parallel or quasi-perpendicular accelerators
NASA Technical Reports Server (NTRS)
Spangler, S. R.; Leckband, J. A.; Cairns, I. H.
1989-01-01
Observations of shock waves in the solar system which show a pronounced difference in the plasma wave and particle environment depending on whether the shock is propagating along or perpendicular to the interplanetary magnetic field are discussed. Theories for particle acceleration developed for quasi-parallel and quasi-perpendicular shocks, when extended to the interstellar medium suggest that the relativistic electrons in radio supernova remnants are accelerated by either the Q parallel or Q perpendicular mechanisms. A model for the galactic magnetic field and published maps of supernova remnants were used to search for a dependence of structure on the angle Phi. Results show no tendency for the remnants as a whole to favor the relationship expected for either mechanism, although individual sources resemble model remnants of one or the other acceleration process.
A parallel direct numerical simulation of dust particles in a turbulent flow
NASA Astrophysics Data System (ADS)
Nguyen, H. V.; Yokota, R.; Stenchikov, G.; Kocurek, G.
2012-04-01
Due to their effects on radiation transport, aerosols play an important role in the global climate. Mineral dust aerosol is a predominant natural aerosol in the desert and semi-desert regions of the Middle East and North Africa (MENA). The Arabian Peninsula is one of the three predominant source regions on the planet "exporting" dust to almost the entire world. Mineral dust aerosols make up about 50% of the tropospheric aerosol mass and therefore produce a significant impact on the Earth's climate and the atmospheric environment, especially in the MENA region, which is characterized by frequent dust storms and large aerosol generation. Understanding the mechanisms of dust emission, transport and deposition is therefore essential for correctly representing dust in numerical climate prediction. In this study we present results of numerical simulations of dust particles in a turbulent flow to study the interaction between dust and the atmosphere. Homogeneous and passive dust particles in the boundary layers are entrained and advected under the influence of a turbulent flow. Currently no interactions between particles are included. Turbulence is resolved through direct numerical simulation using a parallel incompressible Navier-Stokes flow solver. Model output provides information on particle trajectories, turbulent transport of dust and effects of gravity on dust motion, which will be compared with wind tunnel experiments at the University of Texas at Austin. Results of testing of parallel efficiency and scalability are provided. Future versions of the model will include air-particle momentum exchanges, varying particle sizes and the saltation effect. The results will be used for interpreting wind tunnel and field experiments and for improvement of dust generation parameterizations in meteorological models.
NASA Astrophysics Data System (ADS)
van Marle, Allard Jan; Casse, Fabien; Marcowith, Alexandre
2018-01-01
We present simulations of magnetized astrophysical shocks taking into account the interplay between the thermal plasma of the shock and suprathermal particles. Such interaction is depicted by combining a grid-based magnetohydrodynamics description of the thermal fluid with particle in cell techniques devoted to the dynamics of suprathermal particles. This approach, which incorporates the use of adaptive mesh refinement features, is potentially a key to simulate astrophysical systems on spatial scales that are beyond the reach of pure particle-in-cell simulations. We consider in this study non-relativistic shocks with various Alfvénic Mach numbers and magnetic field obliquity. We recover all the features of both magnetic field amplification and particle acceleration from previous studies when the magnetic field is parallel to the normal to the shock. In contrast with previous particle-in-cell-hybrid simulations, we find that particle acceleration and magnetic field amplification also occur when the magnetic field is oblique to the normal to the shock but on larger time-scales than in the parallel case. We show that in our simulations, the suprathermal particles are experiencing acceleration thanks to a pre-heating process of the particle similar to a shock drift acceleration leading to the corrugation of the shock front. Such oscillations of the shock front and the magnetic field locally help the particles to enter the upstream region and to initiate a non-resonant streaming instability and finally to induce diffuse particle acceleration.
NASA Astrophysics Data System (ADS)
Furuichi, Mikito; Nishiura, Daisuke
2017-10-01
We developed dynamic load-balancing algorithms for Particle Simulation Methods (PSM) involving short-range interactions, such as Smoothed Particle Hydrodynamics (SPH), Moving Particle Semi-implicit method (MPS), and Discrete Element method (DEM). These are needed to handle billions of particles modeled in large distributed-memory computer systems. Our method utilizes flexible orthogonal domain decomposition, allowing the sub-domain boundaries in the column to be different for each row. The imbalances in the execution time between parallel logical processes are treated as a nonlinear residual. Load-balancing is achieved by minimizing the residual within the framework of an iterative nonlinear solver, combined with a multigrid technique in the local smoother. Our iterative method is suitable for adjusting the sub-domain frequently by monitoring the performance of each computational process because it is computationally cheaper in terms of communication and memory costs than non-iterative methods. Numerical tests demonstrated the ability of our approach to handle workload imbalances arising from a non-uniform particle distribution, differences in particle types, or heterogeneous computer architecture which was difficult with previously proposed methods. We analyzed the parallel efficiency and scalability of our method using Earth simulator and K-computer supercomputer systems.
Mechanism of travelling-wave transport of particles
NASA Astrophysics Data System (ADS)
Kawamoto, Hiroyuki; Seki, Kyogo; Kuromiya, Naoyuki
2006-03-01
Numerical and experimental investigations have been carried out on transport of particles in an electrostatic travelling field. A three-dimensional hard-sphere model of the distinct element method was developed to simulate the dynamics of particles. Forces applied to particles in the model were the Coulomb force, the dielectrophoresis force on polarized dipole particles in a non-uniform field, the image force, gravity and the air drag. Friction and repulsion between particle-particle and particle-conveyer were included in the model to replace initial conditions after mechanical contacts. Two kinds of experiments were performed to confirm the model. One was the measurement of charge of particles that is indispensable to determine the Coulomb force. Charge distribution was measured from the locus of free-fallen particles in a parallel electrostatic field. The averaged charge of the bulk particle was confirmed by measurement with a Faraday cage. The other experiment was measurements of the differential dynamics of particles on a conveyer consisting of parallel electrodes to which a four-phase travelling electrostatic wave was applied. Calculated results agreed with measurements, and the following characteristics were clarified. (1) The Coulomb force is the predominant force to drive particles compared with the other kinds of forces, (2) the direction of particle transport did not always coincide with that of the travelling wave but changed partially. It depended on the frequency of the travelling wave, the particle diameter and the electric field, (3) although some particles overtook the travelling wave at a very low frequency, the motion of particles was almost synchronized with the wave at the low frequency and (4) the transport of some particles was delayed to the wave at medium frequency; the majority of particles were transported backwards at high frequency and particles were not transported but only vibrated at very high frequency.
Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-core Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aktulga, Hasan Metin; Coffman, Paul; Shan, Tzu-Ray
2015-12-01
Hybrid parallelism allows high performance computing applications to better leverage the increasing on-node parallelism of modern supercomputers. In this paper, we present a hybrid parallel implementation of the widely used LAMMPS/ReaxC package, where the construction of bonded and nonbonded lists and evaluation of complex ReaxFF interactions are implemented efficiently using OpenMP parallelism. Additionally, the performance of the QEq charge equilibration scheme is examined and a dual-solver is implemented. We present the performance of the resulting ReaxC-OMP package on a state-of-the-art multi-core architecture Mira, an IBM BlueGene/Q supercomputer. For system sizes ranging from 32 thousand to 16.6 million particles, speedups in the range of 1.5-4.5x are observed using the new ReaxC-OMP software. Sustained performance improvements have been observed for up to 262,144 cores (1,048,576 processes) of Mira with a weak scaling efficiency of 91.5% in larger simulations containing 16.6 million particles.
Information criteria for quantifying loss of reversibility in parallelized KMC
NASA Astrophysics Data System (ADS)
Gourgoulias, Konstantinos; Katsoulakis, Markos A.; Rey-Bellet, Luc
2017-01-01
Parallel Kinetic Monte Carlo (KMC) is a potent tool to simulate stochastic particle systems efficiently. However, despite literature on quantifying domain decomposition errors of the particle system for this class of algorithms in the short and in the long time regime, no study yet explores and quantifies the loss of time-reversibility in Parallel KMC. Inspired by concepts from non-equilibrium statistical mechanics, we propose the entropy production per unit time, or entropy production rate, given in terms of an observable and a corresponding estimator, as a metric that quantifies the loss of reversibility. Typically, this is a quantity that cannot be computed explicitly for Parallel KMC, which is why we develop a posteriori estimators that have good scaling properties with respect to the size of the system. Through these estimators, we can connect the different parameters of the scheme, such as the communication time step of the parallelization, the choice of the domain decomposition, and the computational schedule, with its performance in controlling the loss of reversibility. From this point of view, the entropy production rate can be seen both as an information criterion to compare the reversibility of different parallel schemes and as a tool to diagnose reversibility issues with a particular scheme. As a demonstration, we use Sandia Lab's SPPARKS software to compare different parallelization schemes and different domain (lattice) decompositions.
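To make the idea of an entropy production rate estimator concrete, the following is a minimal sketch for a toy continuous-time Markov chain. It uses the textbook path-wise estimator (the time average of log(rate forward / rate backward) over observed jumps), which is an assumption on our part and not necessarily the a posteriori estimator developed in the paper. The function name, the three-state ring model, and all parameters are hypothetical.

```python
import math
import random


def entropy_production_rate(rates, x0, t_end, seed=0):
    """Estimate the entropy production rate from one simulated trajectory.

    rates[x][y] is the jump rate x -> y of a hypothetical toy chain; every
    jump must have a reverse jump with a positive rate.
    """
    rng = random.Random(seed)
    x, t, acc = x0, 0.0, 0.0
    while True:
        targets = list(rates[x].items())
        total = sum(r for _, r in targets)
        t += rng.expovariate(total)      # exponential waiting time
        if t >= t_end:
            break
        u = rng.random() * total         # choose next state proportional to rate
        for y, r in targets:
            u -= r
            if u <= 0.0:
                break
        acc += math.log(rates[x][y] / rates[y][x])
        x = y
    return acc / t_end


def ring(a, b):
    """Three-state ring: rate a clockwise, rate b counter-clockwise."""
    return {0: {1: a, 2: b}, 1: {2: a, 0: b}, 2: {0: a, 1: b}}
```

For the ring with equal rates every log ratio is zero, so the estimate vanishes, as expected for a reversible chain. With a = 2 and b = 1 the exact steady-state value is (a - b) log(a / b) = log 2, and a long trajectory should give an estimate close to it.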
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.
ERIC Educational Resources Information Center
Hickson, Mark, III; And Others
1991-01-01
Develops a communication perspective on sexual harassment in asynchronous relationships. Presents a six-step process model to predict private harassing behavior among faculty members in higher education. Makes suggestions for prevention of sexual harassment. (SR)
An implementation of a tree code on a SIMD, parallel computer
NASA Technical Reports Server (NTRS)
Olson, Kevin M.; Dorband, John E.
1994-01-01
We describe a fast tree algorithm for gravitational N-body simulation on SIMD parallel computers. The tree construction uses fast, parallel sorts. The sorted lists are recursively divided along their x, y and z coordinates. This data structure is a completely balanced tree (i.e., each particle is paired with exactly one other particle) and maintains good spatial locality. An implementation of this tree-building algorithm on a 16k processor Maspar MP-1 performs well and constitutes only a small fraction (approximately 15%) of the entire cycle of finding the accelerations. Each node in the tree is treated as a monopole. The tree search and the summation of accelerations also perform well. During the tree search, node data that is needed from another processor is simply fetched. Roughly 55% of the tree search time is spent in communications between processors. We apply the code to two problems of astrophysical interest. The first is a simulation of the close passage of two gravitationally interacting disk galaxies using 65,636 particles. We also simulate the formation of structure in an expanding, model universe using 1,048,576 particles. Our code attains speeds comparable to one head of a Cray Y-MP, so single instruction, multiple data (SIMD) type computers can be used for these simulations. The cost/performance ratio for SIMD machines like the Maspar MP-1 makes them an extremely attractive alternative to either vector processors or large multiple instruction, multiple data (MIMD) type parallel computers. With further optimizations (e.g., more careful load balancing), speeds in excess of today's vector processing computers should be possible.
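The tree construction described above (sort, then split recursively along x, y and z) can be sketched serially as follows. This is only an illustration of the balanced-split idea, assuming a particle count that is a power of two; it does not reproduce the parallel sorts or the SIMD data layout of the paper, and the function names are hypothetical.

```python
def build_balanced_tree(points, axis=0):
    """Recursively halve a list of (x, y, z) points, cycling through the axes.

    Returns nested pairs; a leaf is a single point. Assumes len(points) is a
    power of two so that every split is exact and the tree is fully balanced.
    """
    if len(points) == 1:
        return points[0]
    ordered = sorted(points, key=lambda p: p[axis])
    half = len(ordered) // 2
    nxt = (axis + 1) % 3
    return (build_balanced_tree(ordered[:half], nxt),
            build_balanced_tree(ordered[half:], nxt))


def depth(node):
    """Depth of the tree; a leaf (a tuple of three floats) has depth 0."""
    if not isinstance(node[0], tuple):
        return 0
    return 1 + max(depth(node[0]), depth(node[1]))
```

With 8 particles the tree has depth 3, and at the lowest level each particle is paired with exactly one other particle, matching the property stated in the abstract.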
Interpretations of the impact of cross-field drifts on divertor flows in DIII-D with UEDGE
Jaervinen, Aaro E.; Allen, Steve L.; Groth, Mathias; ...
2017-01-27
Simulations using the multi-fluid code UEDGE indicate that, in low confinement (L-mode) plasmas in DIII-D, recycling driven flows dominate poloidal particle flows in the divertor, whereas E×B drift flows dominate the radial particle flows. In contrast, in high confinement (H-mode) conditions E×B drift flows dominate both poloidal and radial particle flows in the divertor. UEDGE indicates that the toroidal C²⁺ flow velocities in the divertor plasma are entrained within 30% to the background deuterium flow in both L- and H-mode plasmas in the plasma region where the CIII 465 nm emission is measured. Therefore, UEDGE indicates that the Carbon Doppler Coherence Imaging System (CIS), measuring the toroidal velocity of the C²⁺ ions, can provide insight into the deuterium flows in the divertor. Parallel-to-B velocity dominates the toroidal divertor flow, with the direct drift impact being less than 1%. Toroidal divertor flow is predicted to reverse when the magnetic field is reversed. This is explained by the parallel-B flow towards the nearest divertor plate corresponding to opposite toroidal directions in opposite toroidal field configurations. Due to strong poloidal E×B flows in H-mode, net poloidal particle transport can be in the opposite direction to the poloidal component of the parallel-B plasma flow.
NDL-v2.0: A new version of the numerical differentiation library for parallel architectures
NASA Astrophysics Data System (ADS)
Hadjidoukas, P. E.; Angelikopoulos, P.; Voglis, C.; Papageorgiou, D. G.; Lagaris, I. E.
2014-07-01
We present a new version of the numerical differentiation library (NDL) used for the numerical estimation of first and second order partial derivatives of a function by finite differencing. In this version we have restructured the serial implementation of the code so as to achieve optimal task-based parallelization. The pure shared-memory parallelization of the library has been based on the lightweight OpenMP tasking model allowing for the full extraction of the available parallelism and efficient scheduling of multiple concurrent library calls. On multicore clusters, parallelism is exploited by means of TORC, an MPI-based multi-threaded tasking library. The new MPI implementation of NDL provides optimal performance in terms of function calls and, furthermore, supports asynchronous execution of multiple library calls within legacy MPI programs. In addition, a Python interface has been implemented for all cases, exporting the functionality of our library to sequential Python codes. Catalog identifier: AEDG_v2_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDG_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 63036 No. of bytes in distributed program, including test data, etc.: 801872 Distribution format: tar.gz Programming language: ANSI Fortran-77, ANSI C, Python. Computer: Distributed systems (clusters), shared memory systems. Operating system: Linux, Unix. Has the code been vectorized or parallelized?: Yes. RAM: The library uses O(N) internal storage, N being the dimension of the problem. It can use up to O(N²) internal storage for Hessian calculations, if a task throttling factor has not been set by the user. Classification: 4.9, 4.14, 6.5. Catalog identifier of previous version: AEDG_v1_0 Journal reference of previous version: Comput. Phys. Comm.
180(2009)1404 Does the new version supersede the previous version?: Yes Nature of problem: The numerical estimation of derivatives at several accuracy levels is a common requirement in many computational tasks, such as optimization, solution of nonlinear systems, and sensitivity analysis. For a large number of scientific and engineering applications, the underlying functions correspond to simulation codes for which analytical estimation of derivatives is difficult or almost impossible. A parallel implementation that exploits systems with multiple CPUs is very important for large scale and computationally expensive problems. Solution method: Finite differencing is used with a carefully chosen step that minimizes the sum of the truncation and round-off errors. The parallel versions employ both OpenMP and MPI libraries. Reasons for new version: The updated version was motivated by our endeavors to extend a parallel Bayesian uncertainty quantification framework [1], by incorporating higher order derivative information as in most state-of-the-art stochastic simulation methods such as Stochastic Newton MCMC [2] and Riemannian Manifold Hamiltonian MC [3]. The function evaluations are simulations with significant time-to-solution, which also varies with the input parameters such as in [1, 4]. The runtime of the N-body-type of problem changes considerably with the introduction of a longer cut-off between the bodies. In the first version of the library, the OpenMP-parallel subroutines spawn a new team of threads and distribute the function evaluations with a PARALLEL DO directive. This limits the functionality of the library as multiple concurrent calls require nested parallelism support from the OpenMP environment. Therefore, either their function evaluations will be serialized or processor oversubscription is likely to occur due to the increased number of OpenMP threads. 
In addition, the Hessian calculations include two explicit parallel regions that compute first the diagonal and then the off-diagonal elements of the array. Due to the barrier between the two regions, the parallelism of the calculations is not fully exploited. These issues have been addressed in the new version by first restructuring the serial code and then running the function evaluations in parallel using OpenMP tasks. Although the MPI-parallel implementation of the first version is capable of fully exploiting the task parallelism of the PNDL routines, it does not utilize the caching mechanism of the serial code and, therefore, performs some redundant function evaluations in the Hessian and Jacobian calculations. This can lead to: (a) higher execution times if the number of available processors is lower than the total number of tasks, and (b) significant energy consumption due to wasted processor cycles. Overcoming these drawbacks, which become critical as the time of a single function evaluation increases, was the primary goal of this new version. Due to the code restructure, the MPI-parallel implementation (and the OpenMP-parallel in accordance) avoids redundant calls, providing optimal performance in terms of the number of function evaluations. Another limitation of the library was that the library subroutines were collective and synchronous calls. In the new version, each MPI process can issue any number of subroutines for asynchronous execution. We introduce two library calls that provide global and local task synchronizations, similarly to the BARRIER and TASKWAIT directives of OpenMP. The new MPI-implementation is based on TORC, a new tasking library for multicore clusters [5-7]. TORC improves the portability of the software, as it relies exclusively on the POSIX-Threads and MPI programming interfaces. 
It allows MPI processes to utilize multiple worker threads, offering a hybrid programming and execution environment similar to MPI+OpenMP, in a completely transparent way. Finally, to further improve the usability of our software, a Python interface has been implemented on top of both the OpenMP and MPI versions of the library. This allows sequential Python codes to exploit shared and distributed memory systems. Summary of revisions: The revised code improves the performance of both parallel (OpenMP and MPI) implementations. The functionality and the user-interface of the MPI-parallel version have been extended to support the asynchronous execution of multiple PNDL calls, issued by one or multiple MPI processes. A new underlying tasking library increases portability and allows MPI processes to have multiple worker threads. For both implementations, an interface to the Python programming language has been added. Restrictions: The library uses only double precision arithmetic. The MPI implementation assumes the homogeneity of the execution environment provided by the operating system. Specifically, the processes of a single MPI application must have identical address space and a user function resides at the same virtual address. In addition, address space layout randomization should not be used for the application. Unusual features: The software takes into account bound constraints, in the sense that only feasible points are used to evaluate the derivatives, and given the level of the desired accuracy, the proper formula is automatically employed. Running time: Running time depends on the function's complexity. The test run took 23 ms for the serial distribution, 25 ms for the OpenMP with 2 threads, 53 ms and 1.01 s for the MPI parallel distribution using 2 threads and 2 processes respectively and yield-time for idle workers equal to 10 ms. References: [1] P. Angelikopoulos, C. Paradimitriou, P. 
Koumoutsakos, Bayesian uncertainty quantification and propagation in molecular dynamics simulations: a high performance computing framework, J. Chem. Phys 137 (14). [2] H.P. Flath, L.C. Wilcox, V. Akcelik, J. Hill, B. van Bloemen Waanders, O. Ghattas, Fast algorithms for Bayesian uncertainty quantification in large-scale linear inverse problems based on low-rank partial Hessian approximations, SIAM J. Sci. Comput. 33 (1) (2011) 407-432. [3] M. Girolami, B. Calderhead, Riemann manifold Langevin and Hamiltonian Monte Carlo methods, J. R. Stat. Soc. Ser. B (Stat. Methodol.) 73 (2) (2011) 123-214. [4] P. Angelikopoulos, C. Paradimitriou, P. Koumoutsakos, Data driven, predictive molecular dynamics for nanoscale flow simulations under uncertainty, J. Phys. Chem. B 117 (47) (2013) 14808-14816. [5] P.E. Hadjidoukas, E. Lappas, V.V. Dimakopoulos, A runtime library for platform-independent task parallelism, in: PDP, IEEE, 2012, pp. 229-236. [6] C. Voglis, P.E. Hadjidoukas, D.G. Papageorgiou, I. Lagaris, A parallel hybrid optimization algorithm for fitting interatomic potentials, Appl. Soft Comput. 13 (12) (2013) 4481-4492. [7] P.E. Hadjidoukas, C. Voglis, V.V. Dimakopoulos, I. Lagaris, D.G. Papageorgiou, Supporting adaptive and irregular parallelism for non-linear numerical optimization, Appl. Math. Comput. 231 (2014) 544-559.
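As an illustration of the solution method summarized above (finite differencing with a step that balances truncation and round-off error), here is a minimal central-difference gradient. It is not the NDL or PNDL interface: the function name is hypothetical, and the step rule h ≈ ε^(1/3)·max(1, |x_i|) is the standard textbook choice for central differences, assumed here because the abstract does not give the library's actual formula. Bound constraints, which the library takes into account, are ignored.

```python
import sys


def central_gradient(f, x):
    """Central-difference gradient of f at x (a list of floats).

    Each coordinate uses a step proportional to eps**(1/3), which roughly
    balances the O(h**2) truncation error against the O(eps / h) round-off.
    """
    eps = sys.float_info.epsilon
    grad = []
    for i, xi in enumerate(x):
        h = eps ** (1.0 / 3.0) * max(1.0, abs(xi))
        x_plus = list(x)
        x_plus[i] = xi + h
        x_minus = list(x)
        x_minus[i] = xi - h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad
```

The 2N function evaluations in the loop are independent of one another, which is the property a task-parallel implementation can exploit when each evaluation is an expensive simulation.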
Component Framework for Loosely Coupled High Performance Integrated Plasma Simulations
NASA Astrophysics Data System (ADS)
Elwasif, W. R.; Bernholdt, D. E.; Shet, A. G.; Batchelor, D. B.; Foley, S.
2010-11-01
We present the design and implementation of a component-based simulation framework for the execution of coupled time-dependent plasma modeling codes. The Integrated Plasma Simulator (IPS) provides a flexible lightweight component model that streamlines the integration of standalone codes into coupled simulations. Standalone codes are adapted to the IPS component interface specification using a thin wrapping layer implemented in the Python programming language. The framework provides services for inter-component method invocation, configuration, task, and data management, asynchronous event management, simulation monitoring, and checkpoint/restart capabilities. Services are invoked, as needed, by the computational components to coordinate the execution of different aspects of coupled simulations on Massively Parallel Processing (MPP) machines. A common plasma state layer serves as the foundation for inter-component, file-based data exchange. The IPS design principles, implementation details, and execution model will be presented, along with an overview of several use cases.
Qiao, W; Stephan, D; Hasselbeck, M; Liang, Q; Dekorsy, T
2012-08-27
A compact high-resolution THz time-domain waveguide spectrometer that is operated inside a cryostat is demonstrated. A THz photo-Dember emitter and a ZnTe electro-optic detection crystal are directly attached to a parallel copper-plate waveguide. This allows the THz beam to be excited and detected entirely inside the cryostat, obviating the need for THz-transparent windows or external THz mirrors. Since no external bias for the emitter is required, no electric feed-through into the cryostat is necessary. Using asynchronous optical sampling, high resolution THz spectra are obtained in the frequency range from 0.2 to 2.0 THz. The THz emission from the photo-Dember emitter and the absorption spectrum of 1,2-dicyanobenzene film are measured as a function of temperature. An absorption peak around 750 GHz of 1,2-dicyanobenzene displays a blue shift with increasing temperature.
Barrier-breaking performance for industrial problems on the CRAY C916
DOE Office of Scientific and Technical Information (OSTI.GOV)
Graffunder, S.K.
1993-12-31
Nine applications, including third-party codes, were submitted to the Gordon Bell Prize committee showing the CRAY C916 supercomputer providing record-breaking time to solution for industrial problems in several disciplines. Performance was obtained by balancing raw hardware speed; effective use of large, real, shared memory; compiler vectorization and autotasking; hand optimization; asynchronous I/O techniques; and new algorithms. The highest GFLOPS performance for the submissions was 11.1 GFLOPS out of a peak advertised performance of 16 GFLOPS for the CRAY C916 system. One program achieved a 15.45 speedup from the compiler with just two hand-inserted directives to scope variables properly for the mathematical library. New I/O techniques hide tens of gigabytes of I/O behind parallel computations. Finally, new iterative solver algorithms have demonstrated times to solution on 1 CPU as high as 70 times faster than the best direct solvers.
Automated high-throughput flow-through real-time diagnostic system
Regan, John Frederick
2012-10-30
An automated real-time flow-through system capable of processing multiple samples in an asynchronous, simultaneous, and parallel fashion for nucleic acid extraction and purification, followed by assay assembly, genetic amplification, multiplex detection, analysis, and decontamination. The system is able to hold and access an unlimited number of fluorescent reagents that may be used to screen samples for the presence of specific sequences. The apparatus works by associating extracted and purified sample with a series of reagent plugs that have been formed in a flow channel and delivered to a flow-through real-time amplification detector that has a multiplicity of optical windows, to which the sample-reagent plugs are placed in an operative position. The diagnostic apparatus includes sample multi-position valves, a master sample multi-position valve, a master reagent multi-position valve, reagent multi-position valves, and an optical amplification/detection system.
A Latency-Tolerant Partitioner for Distributed Computing on the Information Power Grid
NASA Technical Reports Server (NTRS)
Das, Sajal K.; Harvey, Daniel J.; Biswas, Rupak; Kwak, Dochan (Technical Monitor)
2001-01-01
NASA's Information Power Grid (IPG) is an infrastructure designed to harness the power of geographically distributed computers, databases, and human expertise, in order to solve large-scale realistic computational problems. This type of meta-computing environment is necessary to present a unified virtual machine to application developers that hides the intricacies of a highly heterogeneous environment and yet maintains adequate security. In this paper, we present a novel partitioning scheme, called MinEX, that dynamically balances processor workloads while minimizing data movement and runtime communication, for applications that are executed in a parallel distributed fashion on the IPG. We also analyze the conditions that are required for the IPG to be an effective tool for such distributed computations. Our results show that MinEX is a viable load balancer provided the nodes of the IPG are connected by a high-speed asynchronous interconnection network.
Zonal methods for the parallel execution of range-limited N-body simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bowers, Kevin J.; Dror, Ron O.; Shaw, David E.
2007-01-20
Particle simulations in fields ranging from biochemistry to astrophysics require the evaluation of interactions between all pairs of particles separated by less than some fixed interaction radius. The applicability of such simulations is often limited by the time required for calculation, but the use of massive parallelism to accelerate these computations is typically limited by inter-processor communication requirements. Recently, Snir [M. Snir, A note on N-body computations with cutoffs, Theor. Comput. Syst. 37 (2004) 295-318] and Shaw [D.E. Shaw, A fast, scalable method for the parallel evaluation of distance-limited pairwise particle interactions, J. Comput. Chem. 26 (2005) 1318-1328] independently introduced two distinct methods that offer asymptotic reductions in the amount of data transferred between processors. In the present paper, we show that these schemes represent special cases of a more general class of methods, and introduce several new algorithms in this class that offer practical advantages over all previously described methods for a wide range of problem parameters. We also show that several of these algorithms approach an approximate lower bound on inter-processor data transfer.
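The underlying serial problem (find every pair of particles closer than a fixed interaction radius) can be sketched with a simple cell list. This is only the baseline computation that the zonal methods distribute across processors; it is not one of the zonal methods themselves, and the function name is hypothetical.

```python
from collections import defaultdict
from itertools import product


def pairs_within(points, radius):
    """Return the set of index pairs (i, j), i < j, closer than `radius`.

    Points are binned into cubic cells of edge `radius`, so each particle
    only has to be compared with particles in the 27 surrounding cells.
    """
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(int(c // radius) for c in p)].append(idx)

    found = set()
    r2 = radius * radius
    for cell, members in cells.items():
        for offset in product((-1, 0, 1), repeat=3):
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            for i in members:
                for j in cells.get(neighbour, ()):
                    if i < j:
                        d2 = sum((a - b) ** 2
                                 for a, b in zip(points[i], points[j]))
                        if d2 < r2:
                            found.add((i, j))
    return found
```

In a parallel setting each processor would own a region of cells and would need particle data from neighbouring regions; the volume of that imported data is the communication cost the paper's methods aim to reduce.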
2008-01-01
A second objective is to characterize variability in the volume scattering function and particle size distribution for various optical water types...volume scattering function (VSF) and the particle size distribution (PSD) • Analysis of in situ optical measurements and particle size distributions ... Approved for public release; distribution unlimited.
High-Performance Design Patterns for Modern Fortran
Haveraaen, Magne; Morris, Karla; Rouson, Damian; ...
2015-01-01
This paper presents ideas for using coordinate-free numerics in modern Fortran to achieve code flexibility in the partial differential equation (PDE) domain. We also show how Fortran, over the last few decades, has changed to become a language well-suited for state-of-the-art software development. Fortran’s new coarray distributed data structure, the language’s class mechanism, and its side-effect-free, pure procedure capability provide the scaffolding on which we implement HPC software. These features empower compilers to organize parallel computations with efficient communication. We present some programming patterns that support asynchronous evaluation of expressions comprised of parallel operations on distributed data. We implemented these patterns using coarrays and the message passing interface (MPI). We compared the codes’ complexity and performance. The MPI code is much more complex and depends on external libraries. The MPI code on Cray hardware using the Cray compiler is 1.5–2 times faster than the coarray code on the same hardware. The Intel compiler implements coarrays atop Intel’s MPI library with the result apparently being 2–2.5 times slower than manually coded MPI despite exhibiting nearly linear scaling efficiency. As compilers mature and further improvements to coarrays come in Fortran 2015, we expect this performance gap to narrow.
Yousefzadeh, Amirreza; Jablonski, Miroslaw; Iakymchuk, Taras; Linares-Barranco, Alejandro; Rosado, Alfredo; Plana, Luis A; Temple, Steve; Serrano-Gotarredona, Teresa; Furber, Steve B; Linares-Barranco, Bernabe
2017-10-01
Address event representation (AER) is a widely employed asynchronous technique for interchanging "neural spikes" between different hardware elements in neuromorphic systems. Each neuron or cell in a chip or a system is assigned an address (or ID), which is typically communicated through a high-speed digital bus, thus time-multiplexing a high number of neural connections. Conventional AER links use parallel physical wires together with a pair of handshaking signals (request and acknowledge). In this paper, we present a fully serial implementation using bidirectional SATA connectors with a pair of low-voltage differential signaling (LVDS) wires for each direction. The proposed implementation can multiplex a number of conventional parallel AER links for each physical LVDS connection. It uses flow control, clock correction, and byte alignment techniques to transmit 32-bit address events reliably over multiplexed serial connections. The setup has been tested using commercial Spartan6 FPGAs attaining a maximum event transmission speed of 75 Meps (Mega events per second) for 32-bit events at a line rate of 3.0 Gbps. Full HDL codes (vhdl/verilog) and example demonstration codes for the SpiNNaker platform will be made available.
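The relation between the quoted line rate and event rate can be checked with a short calculation. The sketch below assumes 8b/10b line coding (the code used by SATA links), which the abstract does not state; under that assumption a 3.0 Gbps line carries 2.4 Gbps of payload, and 32-bit events then give exactly the reported 75 Meps. The function name and parameters are hypothetical, and overhead from flow control or alignment words is ignored.

```python
def events_per_second(line_rate_bps, event_bits, payload_bits=8, coded_bits=10):
    """Upper bound on the event rate of a serial link with block line coding.

    payload_bits / coded_bits is the coding efficiency (8/10 for 8b/10b,
    which is an assumption here, not a figure taken from the abstract).
    """
    payload_bps = line_rate_bps * payload_bits // coded_bits
    return payload_bps // event_bits


print(events_per_second(3_000_000_000, 32))  # 75000000
```

The agreement with the reported figure is suggestive but does not by itself confirm which line code the authors used.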
Determination of freeze damage on HPV vaccines by use of flow cytometry.
Østergaard, Erik; Frandsen, Peer Lyng; Sandberg, Eva
2015-07-01
The human papillomavirus (HPV) vaccines Gardasil, Silgard and Cervarix were labeled with antibodies against HPV strain 6 or 16/FITC conjugated secondary antibodies and analyzed by flow cytometry. The vaccines showed distinct peaks of fluorescent particles, and a shift towards decreased fluorescent particles was observed after incubation of the vaccines overnight at -20 °C. Since parallel distributed vaccines could have a longer route of transportation, there is an increased risk of freeze damage for these types of vaccine. Shift in fluorescence of labeled vaccine particles was used to indicate whether parallel distributed Silgard, which is a vaccine type identical to Gardasil, was exposed to freeze damage during transportation, but no shift was observed. Additional experiments showed that the HPV vaccines could be degraded to smaller particles by citric acid/phosphate buffer treatment. The majority of particles detected in degraded Gardasil were very small, indicating that the particles are HPV virus-like particles (VLPs) labeled with antibodies, but Cervarix could only be degraded partially due to the presence of another type of adjuvant in this vaccine. The described method may be useful in characterization of adjuvanted vaccines with respect to freeze damage, and to characterize vaccines containing particles corresponding to VLPs in size. Copyright © 2015 The International Alliance for Biological Standardization. Published by Elsevier Ltd. All rights reserved.
Student Satisfaction with Asynchronous Learning
ERIC Educational Resources Information Center
Dziuban, Charles; Moskal, Patsy; Brophy, Jay; Shea, Peter
2007-01-01
The authors discuss elements that potentially impact student satisfaction with asynchronous learning: the media culture, digital, personal and mobile technologies, student learning preferences, pedagogy, complexities of measurement, and the digital generation. They describe a pilot study to identify the underlying dimensions of student…
Software and hardware complex for research and management of the separation process
NASA Astrophysics Data System (ADS)
Borisov, A. P.
2018-01-01
The article describes the development of a program for studying the operation of an asynchronous electric drive that uses vector-algorithmic switching of windings, and of a hardware-software complex that monitors parameters and controls the rotation speed of the drive in order to investigate the operation of a cyclone. To study the drive, a method was used in which the average value of the flux linkage is found; a method was also developed for vector-algorithmic calculation of the power and electromagnetic torque of an asynchronous electric drive fed from a single-phase network with vector-algorithmic commutation, together with software for calculating its parameters. The software part of the complex makes it possible to regulate the motor's rotation speed by vector-algorithmic switching of transistors or, using pulse-width modulation (PWM), to set any motor speed. Sensors at the inlet and outlet of the cyclone are also connected to the hardware-software complex. The developed cyclone with the integrated complex achieves high product separation efficiency at various inlet speeds. The cyclone's maximum efficiency is reached at an inlet air speed of 18 m/s, which requires running the asynchronous electric drive at a frequency of 45 Hz.
An exploration of teaching presence in online interprofessional education facilitation.
Evans, Sherryn Maree; Ward, Catherine; Reeves, Scott
2017-07-01
Although the prevalence of online asynchronous interprofessional education (IPE) has increased in the last decade, little is known about the processes of facilitation in this environment. The teaching presence element of the Community of Inquiry Framework offers an approach to analyze the contributions of online facilitators, however, to date it has only been used on a limited basis in health professions education literature. Using an exploratory case study design, we explored the types of contributions made by IPE facilitators to asynchronous interprofessional team discussions by applying the notion of teaching presence. Using a purposeful sampling approach, we analyzed 14 facilitators' contributions to asynchronous team discussion boards in an online IPE course. We analyzed data using directed content analysis based on the key indicators of teaching presence. The online IPE facilitators undertook the three critical pedagogical functions identified in teaching presence: facilitating discourse, direct instruction, and instructional design and organization. While our data fitted well with a number of key activities embedded in these three functions, further modification of the teaching presence concept was needed to describe our facilitators' teaching presence. This study provides an initial insight into the key elements of online asynchronous IPE facilitation. Further research is required to continue to illuminate the complexity of online asynchronous IPE facilitation.
NASA Astrophysics Data System (ADS)
Krumpe, Tanja; Walter, Carina; Rosenstiel, Wolfgang; Spüler, Martin
2016-08-01
Objective. In this study, the feasibility of detecting a P300 via an asynchronous classification mode in a reactive EEG-based brain-computer interface (BCI) was evaluated. The P300 is one of the most popular BCI control signals and therefore used in many applications, mostly for active communication purposes (e.g. P300 speller). As the majority of all systems work with a stimulus-locked mode of classification (synchronous), the field of applications is limited. A new approach needs to be applied in a setting in which a stimulus-locked classification cannot be used due to the fact that the presented stimuli cannot be controlled or predicted by the system. Approach. A continuous observation task requiring the detection of outliers was implemented to test such an approach. The study was divided into an offline and an online part. Main results. Both parts of the study revealed that an asynchronous detection of the P300 can successfully be used to detect single events with high specificity. It also revealed that no significant difference in performance was found between the synchronous and the asynchronous approach. Significance. The results encourage the use of an asynchronous classification approach in suitable applications without a potential loss in performance.
A 41 ps ASIC time-to-digital converter for physics experiments
NASA Astrophysics Data System (ADS)
Russo, Stefano; Petra, Nicola; De Caro, Davide; Barbarino, Giancarlo; Strollo, Antonio G. M.
2011-12-01
We present a novel Time-to-Digital Converter (TDC) for physics experiments. The proposed TDC is based on a synchronous counter and an asynchronous fine interpolator. The fine part of the measurement is obtained using NORA inverters that provide improved resolution. A prototype IC was fabricated in 180 nm CMOS technology. Experimental measurements show that the proposed TDC features 41 ps resolution with 0.35 LSB differential non-linearity, 0.77 LSB integral non-linearity and a negligible single shot precision. The whole dynamic range is equal to 18 μs. The proposed TDC is designed using a flash architecture that reduces dead time. Data reported in the paper show that our design is well suited for present and future particle physics experiments.
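As a rough illustration of how a coarse counter and a fine interpolator can combine into one time stamp, the sketch below assumes the measured interval is the coarse count times the clock period plus the fine code times the 41 ps LSB. The clock period, the function names, and the assumption that the fine code measures the residue within one clock period are hypothetical and are not taken from the paper.

```python
import math

LSB_PS = 41.0  # fine resolution quoted in the abstract, in picoseconds


def tdc_time_ps(coarse_count, fine_code, clock_period_ps):
    """Combine a coarse counter value and a fine interpolator code (result in ps).

    Assumption: the fine code measures the residue inside one clock period.
    """
    return coarse_count * clock_period_ps + fine_code * LSB_PS


def bits_for_range(dynamic_range_ps, lsb_ps=LSB_PS):
    """Number of output bits needed to cover a dynamic range at a given LSB."""
    return math.ceil(math.log2(dynamic_range_ps / lsb_ps))


# Hypothetical 1 ns clock: 2 coarse ticks plus 3 fine steps.
print(tdc_time_ps(2, 3, 1000.0))
# Bits needed to span the 18 us range quoted in the abstract at 41 ps per LSB.
print(bits_for_range(18e6))
```

Under these assumptions, the 18 μs range divided by the 41 ps LSB gives roughly 4.4 × 10^5 codes, which is why the second call reports a 19-bit word.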
NASA Astrophysics Data System (ADS)
Decyk, Viktor K.; Dauger, Dean E.
We have constructed a parallel cluster consisting of Apple Macintosh G4 computers running both Classic Mac OS as well as the Unix-based Mac OS X, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. Unlike other Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
NASA Technical Reports Server (NTRS)
Reiff, P. H.; Collin, H. L.; Craven, J. D.; Burch, J. L.; Winningham, J. D.
1988-01-01
The auroral electrostatic potential differences were determined from the particle distribution functions obtained nearly simultaneously above and below the auroral acceleration region by DE-1 at altitudes 9000-15,000 km and DE-2 at 400-800 km. Three independent techniques were used: (1) the peak energies of precipitating electrons observed by DE-2, (2) the widening of loss cones for upward traveling electrons observed by DE-1, and (3) the energies of upgoing ions observed by DE-1. The assumed parallel electrostatic potential difference calculated by the three methods was nearly the same. The results confirmed the hypothesis that parallel electrostatic fields of 1-10 kV potential drop at 1-2 earth radii altitude are an important source for auroral particle acceleration.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dooley, James H.; Lanning, David N.
Comminution process of wood veneer to produce wood particles, by feeding wood veneer in a direction of travel substantially normal to grain through a counter rotating pair of intermeshing arrays of cutting discs arrayed axially perpendicular to the direction of wood veneer travel, wherein the cutting discs have a uniform thickness (Td), to produce wood particles characterized by a length dimension (L) substantially equal to the Td and aligned substantially parallel to grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) aligned normal to W and L, wherein the W × H dimensions define a pair of substantially parallel end surfaces with end checking between crosscut fibers.
NASA Astrophysics Data System (ADS)
Ovaysi, S.; Piri, M.
2009-12-01
We present a three-dimensional fully dynamic parallel particle-based model for direct pore-level simulation of incompressible viscous fluid flow in disordered porous media. The model was developed from scratch and is capable of simulating flow directly in three-dimensional high-resolution microtomography images of naturally occurring or man-made porous systems. It reads the images as input where the positions of the solid walls are given. The entire medium, i.e., solid and fluid, is then discretized using particles. The model is based on the Moving Particle Semi-implicit (MPS) technique. We modify this technique in order to improve its stability. The model handles highly irregular fluid-solid boundaries effectively. It takes into account viscous pressure drop in addition to the gravity forces. It conserves mass and can automatically detect any false connectivity with fluid particles in the neighboring pores and throats. It includes a sophisticated algorithm to automatically split and merge particles to maintain hydraulic connectivity of extremely narrow conduits. Furthermore, it uses novel methods to handle particle inconsistencies and open boundaries. To handle the computational load, we present a fully parallel version of the model that runs on distributed memory computer clusters and exhibits excellent scalability. The model is used to simulate unsteady-state flow problems under different conditions starting from straight noncircular capillary tubes with different cross-sectional shapes, i.e., circular/elliptical, square/rectangular and triangular cross-sections. We compare the predicted dimensionless hydraulic conductances with the data available in the literature and observe an excellent agreement. We then test the scalability of our parallel model with two samples of an artificial sandstone, samples A and B, with different volumes and different distributions (non-uniform and uniform) of solid particles among the processors.
An excellent linear scalability is obtained for sample B that has more uniform distribution of solid particles leading to a superior load balancing. The model is then used to simulate fluid flow directly in REV size three-dimensional x-ray images of a naturally occurring sandstone. We analyze the quality and consistency of the predicted flow behavior and calculate absolute permeability, which compares well with the available network modeling and Lattice-Boltzmann permeabilities available in the literature for the same sandstone. We show that the model conserves mass very well and is stable computationally even at very narrow fluid conduits. The transient- and the steady-state fluid flow patterns are presented as well as the steady-state flow rates to compute absolute permeability. Furthermore, we discuss the vital role of our adaptive particle resolution scheme in preserving the original pore connectivity of the samples and their narrow channels through splitting and merging of fluid particles.
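The abstract says absolute permeability is computed from the steady-state flow rates. A minimal sketch of that last step, assuming single-phase Darcy flow through a sample of uniform cross-section with gravity neglected, is given below. The function name and the numerical values are hypothetical, and the paper's own post-processing may differ.

```python
def absolute_permeability(flow_rate, viscosity, length, area, pressure_drop):
    """Darcy's law rearranged for permeability: k = Q * mu * L / (A * dP).

    All arguments are in SI units (m^3/s, Pa*s, m, m^2, Pa); the result is in m^2.
    """
    if area <= 0 or pressure_drop <= 0:
        raise ValueError("area and pressure_drop must be positive")
    return flow_rate * viscosity * length / (area * pressure_drop)


# Hypothetical sample: 1 mm long, 1 mm^2 cross-section, water-like viscosity.
k = absolute_permeability(
    flow_rate=1e-9,       # m^3/s
    viscosity=1e-3,       # Pa*s
    length=1e-3,          # m
    area=1e-6,            # m^2
    pressure_drop=1e3,    # Pa
)
print(k)  # about 1e-12 m^2, i.e. on the order of one darcy
```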
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lasuik, J.; Shalchi, A., E-mail: andreasm4@yahoo.com
Recently, a new theory for the transport of energetic particles across a mean magnetic field was presented. Compared to other nonlinear theories the new approach has the advantage that it provides a full time-dependent description of the transport. Furthermore, a diffusion approximation is no longer part of that theory. The purpose of this paper is to combine this new approach with a time-dependent model for parallel transport and different turbulence configurations in order to explore the parameter regimes for which we get ballistic transport, compound subdiffusion, and normal Markovian diffusion.
Turbulence Evolution and Shock Acceleration of Solar Energetic Particles
NASA Technical Reports Server (NTRS)
Chee, Ng K.
2007-01-01
We model the effects of self-excitation/damping and shock transmission of Alfven waves on solar-energetic-particle (SEP) acceleration at a coronal-mass-ejection (CME) driven parallel shock. SEP-excited outward upstream waves speedily bootstrap acceleration. Shock transmission further raises the SEP-excited wave intensities at high wavenumbers but lowers them at low wavenumbers through wavenumber shift. Downstream, SEP excitation of inward waves and damping of outward waves tend to slow acceleration. Nevertheless, > 2000 km/s parallel shocks at approx. 3.5 solar radii can accelerate SEPs to 100 MeV in < 5 minutes.
Learning Objects and Gerontology
ERIC Educational Resources Information Center
Weinreich, Donna M.; Tompkins, Catherine J.
2006-01-01
Virtual AGE (vAGE) is an asynchronous educational environment that utilizes learning objects focused on gerontology and a learning anytime/anywhere philosophy. This paper discusses the benefits of asynchronous instruction and the process of creating learning objects. Learning objects are "small, reusable chunks of instructional media" Wiley…
Redundant Asynchronous Microprocessor System
NASA Technical Reports Server (NTRS)
Meyer, G.; Johnston, J. O.; Dunn, W. R.
1985-01-01
Fault-tolerant computer structure called RAMPS (for redundant asynchronous microprocessor system) has simplicity of static redundancy but offers intermittent-fault handling ability of complex, dynamically redundant systems. New structure useful wherever several microprocessors are employed for control - in aircraft, industrial processes, robotics, and automatic machining, for example.
PARALLEL HOP: A SCALABLE HALO FINDER FOR MASSIVE COSMOLOGICAL DATA SETS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skory, Stephen; Turk, Matthew J.; Norman, Michael L.
2010-11-15
Modern N-body cosmological simulations contain billions (10^9) of dark matter particles. These simulations require hundreds to thousands of gigabytes of memory and employ hundreds to tens of thousands of processing cores on many compute nodes. In order to study the distribution of dark matter in a cosmological simulation, the dark matter halos must be identified using a halo finder, which establishes the halo membership of every particle in the simulation. The resources required for halo finding are similar to the requirements for the simulation itself. In particular, simulations have become too extensive to use commonly employed halo finders, such that the computational requirements to identify halos must now be spread across multiple nodes and cores. Here, we present a scalable-parallel halo finding method called Parallel HOP for large-scale cosmological simulation data. Based on the halo finder HOP, it utilizes message passing interface and domain decomposition to distribute the halo finding workload across multiple compute nodes, enabling analysis of much larger data sets than is possible with the strictly serial or previous parallel implementations of HOP. We provide a reference implementation of this method as a part of the toolkit yt, an analysis toolkit for adaptive mesh refinement data that include complementary analysis modules. Additionally, we discuss a suite of benchmarks that demonstrate that this method scales well up to several hundred tasks and data sets in excess of 2000^3 particles. The Parallel HOP method and our implementation can be readily applied to any kind of N-body simulation data and is therefore widely applicable.
ATDM LANL FleCSI: Topology and Execution Framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bergen, Benjamin Karl
FleCSI is a compile-time configurable C++ framework designed to support multi-physics application development. As such, FleCSI attempts to provide a very general set of infrastructure design patterns that can be specialized and extended to suit the needs of a broad variety of solver and data requirements. This means that FleCSI is potentially useful to many different ECP projects. Current support includes multidimensional mesh topology, mesh geometry, and mesh adjacency information, n-dimensional hashed-tree data structures, graph partitioning interfaces, and dependency closures (to identify data dependencies between distributed-memory address spaces). FleCSI introduces a functional programming model with control, execution, and data abstractions that are consistent with state-of-the-art task-based runtimes such as Legion and Charm++. The model also provides support for fine-grained, data-parallel execution with backend support for runtimes such as OpenMP and C++17. The FleCSI abstraction layer provides the developer with insulation from the underlying runtimes, while allowing support for multiple runtime systems, including conventional models like asynchronous MPI. The intent is to give developers a concrete set of user-friendly programming tools that can be used now, while allowing flexibility in choosing runtime implementations and optimizations that can be applied to architectures and runtimes that arise in the future. This project is essential to the ECP Ristra Next-Generation Code project, part of ASC ATDM, because it provides a hierarchically parallel programming model that is consistent with the design of modern system architectures, but which allows for the straightforward expression of algorithmic parallelism in a portably performant manner.
Acceleration of discrete stochastic biochemical simulation using GPGPU
Sumiyoshi, Kei; Hirata, Kazuki; Hiroi, Noriko; Funahashi, Akira
2015-01-01
For systems made up of a small number of molecules, such as a biochemical network in a single cell, a simulation requires a stochastic approach, instead of a deterministic approach. The stochastic simulation algorithm (SSA) simulates the stochastic behavior of a spatially homogeneous system. Since stochastic approaches produce different results each time they are used, multiple runs are required in order to obtain statistical results; this results in a large computational cost. We have implemented a parallel method for using SSA to simulate a stochastic model; the method uses a graphics processing unit (GPU), which enables multiple realizations at the same time, and thus reduces the computational time and cost. During the simulation, for the purpose of analysis, each time course is recorded at each time step. A straightforward implementation of this method on a GPU is about 16 times faster than a sequential simulation on a CPU with hybrid parallelization; each of the multiple simulations is run simultaneously, and the computational tasks within each simulation are parallelized. We also implemented an improvement to the memory access and reduced the memory footprint, in order to optimize the computations on the GPU. We also implemented an asynchronous data transfer scheme to accelerate the time course recording function. To analyze the acceleration of our implementation on various sizes of model, we performed SSA simulations on different model sizes and compared these computation times to those for sequential simulations with a CPU. When used with the improved time course recording function, our method was shown to accelerate the SSA simulation by a factor of up to 130. PMID:25762936
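For readers unfamiliar with the SSA, the sketch below shows the direct method on a toy birth-death model (0 → X at a constant rate, X → 0 at a rate proportional to the copy number), with the time course recorded after every reaction event. The model, the rate constants, and the function names are illustrative assumptions and are not taken from the paper. The ensemble loop runs on a CPU and only stands in for the independent realizations that the abstract describes running simultaneously on a GPU.

```python
import random


def gillespie_birth_death(x0, k_birth, k_death, t_end, rng):
    """Direct-method SSA for 0 -> X (rate k_birth) and X -> 0 (rate k_death * x)."""
    t, x = 0.0, x0
    times, counts = [t], [x]          # recorded time course
    while True:
        a_birth = k_birth             # propensity of the birth reaction
        a_death = k_death * x         # propensity of the death reaction
        a_total = a_birth + a_death
        if a_total == 0.0:
            break                     # no reaction can fire any more
        t += rng.expovariate(a_total)  # exponential waiting time to next event
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


def run_ensemble(n_runs, seed=0, **model):
    """Independent realizations, each with its own random stream."""
    return [gillespie_birth_death(rng=random.Random(seed + i), **model)
            for i in range(n_runs)]


runs = run_ensemble(3, seed=1, x0=5, k_birth=2.0, k_death=0.5, t_end=10.0)
print([counts[-1] for _, counts in runs])  # final copy number of each realization
```

Because each realization differs, statistics such as the mean copy number have to be taken over many runs, which is the cost the parallel implementation is meant to reduce.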
How to Build an AppleSeed: A Parallel Macintosh Cluster for Numerically Intensive Computing
NASA Astrophysics Data System (ADS)
Decyk, V. K.; Dauger, D. E.
We have constructed a parallel cluster consisting of a mixture of Apple Macintosh G3 and G4 computers running the Mac OS, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
de Jong, N; Verstegen, D M L; Tan, F E S; O'Connor, S J
2013-05-01
This case-study compared traditional, face-to-face classroom-based teaching with asynchronous online learning and teaching methods in two sets of students undertaking a problem-based learning module in the multilevel and exploratory factor analysis of longitudinal data as part of a Masters degree in Public Health at Maastricht University. Students were allocated to one of the two study variants on the basis of their enrolment status as full-time or part-time students. Full-time students (n = 11) followed the classroom-based variant and part-time students (n = 12) followed the online asynchronous variant which included video recorded lectures and a series of asynchronous online group or individual SPSS activities with synchronous tutor feedback. A validated student motivation questionnaire was administered to both groups of students at the start of the study and a second questionnaire was administered at the end of the module. This elicited data about student satisfaction with the module content, teaching and learning methods, and tutor feedback. The module coordinator and problem-based learning tutor were also interviewed about their experience of delivering the experimental online variant and asked to evaluate its success in relation to student attainment of the module's learning outcomes. Student examination results were also compared between the two groups. Asynchronous online teaching and learning methods proved to be an acceptable alternative to classroom-based teaching for both students and staff. Educational outcomes were similar for both groups, but importantly, there was no evidence that the asynchronous online delivery of module content disadvantaged part-time students in comparison to their full-time counterparts.
NASA Astrophysics Data System (ADS)
Jolliet, S.; McMillan, B. F.; Vernay, T.; Villard, L.; Hatzky, R.; Bottino, A.; Angelino, P.
2009-07-01
In this paper, the influence of the parallel nonlinearity on zonal flows and heat transport in global particle-in-cell ion-temperature-gradient simulations is studied. Although this term is in theory orders of magnitude smaller than the others, several authors [L. Villard, P. Angelino, A. Bottino et al., Plasma Phys. Contr. Fusion 46, B51 (2004); L. Villard, S. J. Allfrey, A. Bottino et al., Nucl. Fusion 44, 172 (2004); J. C. Kniep, J. N. G. Leboeuf, and V. C. Decyck, Comput. Phys. Commun. 164, 98 (2004); J. Candy, R. E. Waltz, S. E. Parker et al., Phys. Plasmas 13, 074501 (2006)] found different results on its role. The study is performed using the global gyrokinetic particle-in-cell codes TORB (theta-pinch) [R. Hatzky, T. M. Tran, A. Könies et al., Phys. Plasmas 9, 898 (2002)] and ORB5 (tokamak geometry) [S. Jolliet, A. Bottino, P. Angelino et al., Comput. Phys. Commun. 177, 409 (2007)]. In particular, it is demonstrated that the parallel nonlinearity, while important for energy conservation, affects the zonal electric field only if the simulation is noise dominated. When a proper convergence is reached, the influence of parallel nonlinearity on the zonal electric field, if any, is shown to be small for both the cases of decaying and driven turbulence.
Innovative Methods for Providing Instruction to Distance Students Using Technology.
ERIC Educational Resources Information Center
Pival, Paul R.; Tunon, Johanna
2001-01-01
Examines three innovative methods tried at Nova Southeastern University for providing quality bibliographic instruction to distance students: one synchronous, one asynchronous, and one that combined features from both synchronous and asynchronous methods of delivering instruction. Topics include compressed video, collaborative groupware, streaming…
Miscellany of Students' Satisfaction in an Asynchronous Learning Environment
ERIC Educational Resources Information Center
Larbi-Siaw, Otu; Owusu-Agyeman, Yaw
2017-01-01
This study investigates the determinants of students' satisfaction in an asynchronous learning environment using seven key considerations: the e-learning environment, student-content interaction, student and student interaction, student-teacher interaction, group cohesion and timely participation, knowledge of Internet usage, and satisfaction. The…
Developing asynchronous online interprofessional education.
Sanborn, Heidi
2016-09-01
For many health programmes, developing interprofessional education (IPE) has been a challenge. Evidence on the best method for design and implementation of IPE has been slow to emerge, with little research on how to best incorporate IPE in the asynchronous online learning environment. This leaves online programmes with no clear guidance when embarking upon an initiative to integrate IPE into the curriculum. One tool that can be effective at guiding the incorporation of IPE across all learning platforms is the Interprofessional Education Collaborative (IPEC) competencies. A project was designed to integrate the nationally defined IPEC competencies throughout an asynchronous, online baccalaureate nursing completion programme. A programme-wide review led to targeted revision of course and unit-level objectives, learning experiences, and assessments based on the IPEC framework. As a result of this effort, the programme curriculum now provides interprofessional learning activities across all courses. This report provides a method for using the IPEC competencies to incorporate IPE within various asynchronous learning assessments, assuring students learn about, with, and from other professions.
Zhou, Keming; Cherra, Salvatore J; Goncharov, Alexandr; Jin, Yishi
2017-05-09
Excitation-inhibition imbalance in neural networks is widely linked to neurological and neuropsychiatric disorders. However, how genetic factors alter neuronal activity, leading to excitation-inhibition imbalance, remains unclear. Here, using the C. elegans locomotor circuit, we examine how altering neuronal activity for varying time periods affects synaptic release pattern and animal behavior. We show that while short-duration activation of excitatory cholinergic neurons elicits a reversible enhancement of presynaptic strength, persistent activation results in asynchronous and reduced cholinergic drive, inducing imbalance between endogenous excitation and inhibition. We find that the neuronal calcium sensor protein NCS-2 is required for asynchronous cholinergic release in an activity-dependent manner and dampens excitability of inhibitory neurons non-cell autonomously. The function of NCS-2 requires its Ca2+ binding and membrane association domains. These results reveal a synaptic mechanism implicating asynchronous release in regulation of excitation-inhibition balance.
High-Throughput Bit-Serial LDPC Decoder LSI Based on Multiple-Valued Asynchronous Interleaving
NASA Astrophysics Data System (ADS)
Onizawa, Naoya; Hanyu, Takahiro; Gaudet, Vincent C.
This paper presents a high-throughput bit-serial low-density parity-check (LDPC) decoder that uses an asynchronous interleaver. Since consecutive log-likelihood message values on the interleaver are similar, node computations are continuously performed by using the most recently arrived messages without significantly affecting bit-error rate (BER) performance. In the asynchronous interleaver, each message's arrival rate is based on the delay due to the wire length, so that the decoding throughput is not restricted by the worst-case latency, which results in a higher average rate of computation. Moreover, the use of a multiple-valued data representation makes it possible to multiplex control signals and data from mutual nodes, thus minimizing the number of handshaking steps in the asynchronous interleaver and eliminating the clock signal entirely. As a result, the decoding throughput becomes 1.3 times faster than that of a bit-serial synchronous decoder under a 90nm CMOS technology, at a comparable BER.
Hu, Guoqing; Pan, Yingling; Zhao, Xin; Yin, Siyao; Zhang, Meng; Zheng, Zheng
2017-12-01
The evolution from asynchronous to synchronous dual-wavelength pulse generation in a passively mode-locked fiber laser is experimentally investigated by tailoring the intracavity dispersion. Through tuning the intracavity-loss-dependent gain profile and the birefringence-induced filter effect, asynchronous dual-wavelength soliton pulses can be generated until the intracavity anomalous dispersion is reduced to ∼8 fs/nm. The transition from asynchronous to synchronous pulse generation is then observed at an elevated pump power in the presence of residual anomalous dispersion, and it is shown that pulses are temporally synchronized at the mode-locker in the cavity. Spectral sidelobes are observed and could be attributed to the four-wave-mixing effect between dual-wavelength pulses at the carbon nanotube mode-locker. These results could provide further insight into the design and realization of such dual-wavelength ultrafast lasers for different applications such as dual-comb metrology as well as better understanding of the inter-pulse interactions in such dual-comb lasers.
Optimization of Particle-in-Cell Codes on RISC Processors
NASA Technical Reports Server (NTRS)
Decyk, Viktor K.; Karmesin, Steve Roy; Boer, Aeint de; Liewer, Paulette C.
1996-01-01
General strategies are developed to optimize particle-in-cell codes written in Fortran for RISC processors, which are commonly used on massively parallel computers. These strategies include data reorganization to improve cache utilization and code reorganization to improve efficiency of arithmetic pipelines.
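One common form of data reorganization in particle-in-cell codes is to reorder particles so that those in the same grid cell sit next to each other in memory, which makes the field data they touch more likely to stay in cache. The abstract does not say which reorganizations were used, so the sketch below is only an assumed example. It uses a one-dimensional grid, a counting sort, and hypothetical names, and is written in Python for illustration although the codes discussed are in Fortran.

```python
def sort_particles_by_cell(x, v, dx, n_cells):
    """Reorder particle positions x and velocities v by grid cell (stable counting sort).

    Assumes 0 <= x[i] < n_cells * dx for every particle.
    """
    cell = [min(int(xi / dx), n_cells - 1) for xi in x]

    # Count particles per cell.
    counts = [0] * n_cells
    for c in cell:
        counts[c] += 1

    # Prefix sum gives the first output slot of each cell.
    offsets = [0] * n_cells
    total = 0
    for c in range(n_cells):
        offsets[c] = total
        total += counts[c]

    # Scatter particles into their cell's slot range, preserving input order.
    x_sorted = [0.0] * len(x)
    v_sorted = [0.0] * len(v)
    for i, c in enumerate(cell):
        j = offsets[c]
        x_sorted[j] = x[i]
        v_sorted[j] = v[i]
        offsets[c] += 1
    return x_sorted, v_sorted


xs, vs = sort_particles_by_cell([2.5, 0.1, 1.7, 0.9, 2.1],
                                [5.0, 1.0, 3.0, 2.0, 4.0],
                                dx=1.0, n_cells=3)
print(xs)  # [0.1, 0.9, 1.7, 2.5, 2.1]
print(vs)  # [1.0, 2.0, 3.0, 5.0, 4.0]
```

A counting sort is used because it runs in time linear in the number of particles, so the reordering can be repeated periodically as particles drift between cells.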
Dynamics of magnetic single domain particles embedded in a viscous liquid
NASA Astrophysics Data System (ADS)
Usadel, K. D.; Usadel, C.
2015-12-01
Kinetic equations for magnetic nanoparticles dispersed in a viscous liquid are developed and analyzed numerically. Depending on the amplitude of an applied oscillatory magnetic field, the particles orient their time averaged anisotropy axis perpendicular to the applied field for low magnetic field amplitudes and nearly parallel to the direction of the field for high amplitudes. The transition between these regions takes place in a narrow field interval. In the low field region, the magnetic moment is locked to some crystal axis and the energy absorption in an oscillatory driving field is dominated by viscous losses associated with particle rotation in the liquid. In the opposite limit, the magnetic moment rotates within the particle while its easy axis, nearly parallel to the external field direction, oscillates. The kinetic equations are generalized to include thermal fluctuations. This leads to a significant increase of the power absorption in the low and intermediate field regions with a pronounced absorption peak as a function of particle size. In the high field region, on the other hand, the inclusion of thermal fluctuations reduces the power absorption. The illustrative numerical calculations presented are performed for magnetic parameters typical for iron oxide.
The Effect of Surface Induced Flows on Bubble and Particle Aggregation
NASA Technical Reports Server (NTRS)
Guelcher, Scott A.; Solomentsev, Yuri E.; Anderson, John L.; Boehmer, Marcel; Sides, Paul J.
1999-01-01
Almost 20 years have elapsed since a phenomenon called "radial specific coalescence" was identified. During studies of electrolytic oxygen evolution from the back side of a vertically oriented, transparent tin oxide electrode in alkaline electrolyte, one of the authors (Sides) observed that large "collector" bubbles appeared to attract smaller bubbles. The bubbles moved parallel to the surface of the electrode, while the electric field was normal to the electrode surface. The phenomenon was reported but not explained. More recently self ordering of latex particles was observed during electrophoretic deposition at low DC voltages likewise on a transparent tin oxide electrode. As in the bubble work, the field was normal to the electrode while the particles moved parallel to it. Fluid convection caused by surface induced flows (SIF) can explain these two apparently different experimental observations: the aggregation of particles on an electrode during electrophoretic deposition, and a radial bubble coalescence pattern on an electrode during electrolytic gas evolution. An externally imposed driving force (the gradient of electrical potential or temperature), interacting with the surface of particles or bubbles very near a planar conducting surface, drives the convection of fluid that causes particles and bubbles to approach each other on the electrode.
GPU surface extraction using the closest point embedding
NASA Astrophysics Data System (ADS)
Kim, Mark; Hansen, Charles
2015-01-01
Isosurface extraction is a fundamental technique used for both surface reconstruction and mesh generation. One method to extract well-formed isosurfaces is a particle system; unfortunately, particle systems can be slow. In this paper, we introduce an enhanced parallel particle system that uses the closest point embedding as the surface representation to speed up the particle system for isosurface extraction. The closest point embedding is used in the Closest Point Method (CPM), a technique that uses a standard three dimensional numerical PDE solver on two dimensional embedded surfaces. To fully take advantage of the closest point embedding, it is coupled with a Barnes-Hut tree code on the GPU. This new technique produces well-formed, conformal unstructured triangular and tetrahedral meshes from labeled multi-material volume datasets. Further, this new parallel implementation of the particle system is faster than any known method for conformal multi-material mesh extraction. The resulting speed-ups gained in this implementation can reduce the time from labeled data to mesh from hours to minutes and benefit users, such as bioengineers, who employ triangular and tetrahedral meshes.
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network’s initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data. PMID:27304987
Asynchronous updates can promote the evolution of cooperation on multiplex networks
NASA Astrophysics Data System (ADS)
Allen, James M.; Hoyle, Rebecca B.
2017-04-01
We study how the choice of updating strategy affects the frequency of cooperation in a game played asynchronously or synchronously across layers in a multiplex network. Updating asynchronously in the public goods game leads to higher frequencies of cooperation compared to synchronous updates. How large this effect is depends on the sensitivity of the game dynamics to changes in the number of cooperators surrounding a player, with the largest effect observed when players' payoffs are small. The discovery of this effect enhances understanding of cooperation on multiplex networks, and demonstrates a new way to maintain cooperation in these systems.
Control strategy for a variable-speed wind energy conversion system
NASA Technical Reports Server (NTRS)
Jacob, A.; Veillette, D.; Rajagopalan, V.
1979-01-01
A control concept for a variable-speed wind energy conversion system is proposed, for which a self-excited asynchronous cage generator is used along with a system of thyristor converters. The control loops are the following: (1) regulation of the entrainment speed as a function of available mechanical energy by acting on the resistance couple of the asynchronous generator; (2) control of electric power delivered to the asynchronous machine, functioning as a motor, for start-up of the vertical axis wind converter; and (3) limitation of the slip value, and by consequence, of the induction currents in the presence of sudden variations of input parameters.
Simulation of particle motion in a closed conduit validated against experimental data
NASA Astrophysics Data System (ADS)
Dolanský, Jindřich
2015-05-01
Motion of a number of spherical particles in a closed conduit is examined by means of both simulation and experiment. The bed of the conduit is covered by stationary spherical particles of the size of the moving particles. The flow is driven by experimentally measured velocity profiles which are inputs of the simulation. Altering input velocity profiles generates various trajectory patterns. The lattice Boltzmann method (LBM) based simulation is developed to study mutual interactions of the flow and the particles. The simulation makes it possible to model both the particle motion and the fluid flow. The entropic LBM is employed to deal with the flow characterized by the high Reynolds number. The entropic modification of the LBM, along with the enhanced refinement of the lattice grid, increases the demands on computational resources. Due to the inherently parallel nature of the LBM, these demands can be handled by employing the Parallel Computing Toolbox (MATLAB) and other transformations enabling usage of the CUDA GPU computing technology. The trajectories of the particles determined within the LBM simulation are validated against data gained from the experiments. The compatibility of the simulation results with the outputs of experimental measurements is evaluated. The accuracy of the applied approach is assessed, and the stability and efficiency of the simulation are also considered.
Particle Number Concentrations for HI-SCALE Field Campaign Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hering, Susanne V
In support of the Holistic Interactions of Shallow Clouds, Aerosols, and Ecosystems (HI-SCALE) project to study new particle formation in the atmosphere, a pair of custom water condensation particle counters were provided to the second intensive field campaign, from mid-August through mid-September 2017, at the U.S. Department of Energy Southern Great Plains Atmospheric Radiation Measurement (ARM) Climate Research Facility observatory. These custom instruments were developed by Aerosol Dynamics, Inc. (Hering et al. 2017) to detect particles into the nanometer size range. Referred to as “versatile water condensation particle counter (vWCPC)”, they are water-based, laminar-flow condensational growth instruments whose lower particle size threshold can be set based on user-selected operating temperatures. For HI-SCALE, the vWCPCs were configured to measure airborne particle number concentrations in the size range from approximately 2 nm to 2 μm. Both were installed in the particle sizing system operated by Chongai Kuang of Brookhaven National Laboratory (BNL). One of these was operated in parallel to a TSI Model 3776, upstream of the mobility particle sizing system, to measure total ambient particle concentrations. The airborne particle concentration data from this “total particle number vWCPC” (Ntot-vWCPC) system has been reported to the ARM database. The data are reported with one-second resolution. The second vWCPC was operated in parallel with the BNL diethylene glycol instrument to count particles downstream of a separate differential mobility size analyzer. Data from this “DMA-vWCPC” system was logged by BNL, and will eventually be provided by that laboratory.
Asynchronous versus Synchronous Learning in Pharmacy Education
ERIC Educational Resources Information Center
Motycka, Carol A.; St. Onge, Erin L.; Williams, Jennifer
2013-01-01
Objective: To better understand the technology being used today in pharmacy education through a review of the current methodologies being employed at various institutions. Also, to discuss the benefits and difficulties of asynchronous and synchronous methodologies, which are being utilized at both traditional and distance education campuses.…
Knowledge Building in Asynchronous Discussion Groups: Going Beyond Quantitative Analysis
ERIC Educational Resources Information Center
Schrire, Sarah
2006-01-01
This contribution examines the methodological challenges involved in defining the collaborative knowledge-building processes occurring in asynchronous discussion and proposes an approach that could advance understanding of these processes. The written protocols that are available to the analyst provide an exact record of the instructional…
Creating and Nurturing Distributed Asynchronous Learning Environments.
ERIC Educational Resources Information Center
Kochtanek, Thomas R.; Hein, Karen K.
2000-01-01
Describes the evolution of a university course from a face-to-face experience to a Web-based asynchronous learning environment. Topics include cognition and learning; distance learning and distributed learning; student learning communities and the traditional classroom; the future as it relates to education and technology; collaborative student…
ERIC Educational Resources Information Center
De Oliveira, Luciana C.; Olesova, Larisa
2013-01-01
This study examined asynchronous online discussions in the online course "English Language Development" to identify themes related to participants' learning about the language and literacy development of English Language Learners when they facilitated online discussions to determine whether the participants developed sufficient…
Creating Asynchronous Online Learning Communities
ERIC Educational Resources Information Center
Kerr, Crystal
2009-01-01
This research project examined how to develop and sustain online, asynchronous learning communities in continuous intake, distance education environments for learners in grades 7 through 10. The study is an action research project that is based upon in-depth, qualitative data. Interviews were conducted with distance education teachers,…
NASA Astrophysics Data System (ADS)
Averkin, Sergey N.; Gatsonis, Nikolaos A.
2018-06-01
An unstructured electrostatic Particle-In-Cell (EUPIC) method is developed on arbitrary tetrahedral grids for simulation of plasmas bounded by arbitrary geometries. The electric potential in EUPIC is obtained on cell vertices from a finite volume Multi-Point Flux Approximation of Gauss' law using the indirect dual cell with Dirichlet, Neumann and external circuit boundary conditions. The resulting matrix equation for the nodal potential is solved with a restarted generalized minimal residual method (GMRES) and an ILU(0) preconditioner algorithm, parallelized using a combination of node coloring and level scheduling approaches. The electric field on vertices is obtained using the gradient theorem applied to the indirect dual cell. The algorithms for injection, particle loading, particle motion, and particle tracking are parallelized for unstructured tetrahedral grids. The algorithms for the potential solver, electric field evaluation, loading, scatter-gather algorithms are verified using analytic solutions for test cases subject to Laplace and Poisson equations. Grid sensitivity analysis examines the L2 and L∞ norms of the relative error in potential, field, and charge density as a function of edge-averaged and volume-averaged cell size. Analysis shows second order of convergence for the potential and first order of convergence for the electric field and charge density. Temporal sensitivity analysis is performed and the momentum and energy conservation properties of the particle integrators in EUPIC are examined. The effects of cell size and timestep on heating, slowing-down and the deflection times are quantified. The heating, slowing-down and the deflection times are found to be almost linearly dependent on number of particles per cell. EUPIC simulations of current collection by cylindrical Langmuir probes in collisionless plasmas show good comparison with previous experimentally validated numerical results. 
These simulations were also used in a parallelization efficiency investigation. Results show that the EUPIC has efficiency of more than 80% when the simulation is performed on a single CPU from a non-uniform memory access node and the efficiency is decreasing as the number of threads further increases. The EUPIC is applied to the simulation of the multi-species plasma flow over a geometrically complex CubeSat in Low Earth Orbit. The EUPIC potential and flowfield distribution around the CubeSat exhibit features that are consistent with previous simulations over simpler geometrical bodies.
Brownian motion as a new probe of wettability.
Mo, Jianyong; Simha, Akarsh; Raizen, Mark G
2017-04-07
Understanding wettability is crucial for optimizing oil recovery, semiconductor manufacturing, pharmaceutical industry, and electrowetting. In this letter, we study the effects of wettability on Brownian motion. We consider the cases of a sphere in an unbounded fluid medium, as well as a sphere placed in the vicinity of a plane wall. For the first case, we show the effects of wettability on the statistical properties of the particles' motion, such as velocity autocorrelation, velocity, and thermal force power spectra over a large range of time scales. We also propose a new method to measure wettability based on the particles' Brownian motion. In addition, we compare the boundary effects on Brownian motion imposed by both no-slip and perfect-slip flat walls. We emphasize the surprising boundary effects on Brownian motion imposed by a perfect-slip wall in the parallel direction, such as a higher particle mobility parallel to a perfect flat wall compared to that in the absence of the wall, as well as compared to a particle near a no-slip flat wall.
Random-subset fitting of digital holograms for fast three-dimensional particle tracking [invited].
Dimiduk, Thomas G; Perry, Rebecca W; Fung, Jerome; Manoharan, Vinothan N
2014-09-20
Fitting scattering solutions to time series of digital holograms is a precise way to measure three-dimensional dynamics of microscale objects such as colloidal particles. However, this inverse-problem approach is computationally expensive. We show that the computational time can be reduced by an order of magnitude or more by fitting to a random subset of the pixels in a hologram. We demonstrate our algorithm on experimentally measured holograms of micrometer-scale colloidal particles, and we show that 20-fold increases in speed, relative to fitting full frames, can be attained while introducing errors in the particle positions of 10 nm or less. The method is straightforward to implement and works for any scattering model. It also enables a parallelization strategy wherein random-subset fitting is used to quickly determine initial guesses that are subsequently used to fit full frames in parallel. This approach may prove particularly useful for studying rare events, such as nucleation, that can only be captured with high frame rates over long times.
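The random-subset idea in the abstract above can be illustrated with a toy example. This is a minimal sketch, not the authors' code: the fringe function `model`, the grid search in `fit_center`, the 64×64 image size, the noise level and the 5% subset fraction are all assumptions chosen for illustration, and a real application would fit a proper scattering model with a nonlinear optimizer.

```python
import math
import random


def model(x, y, cx, cy, k=0.02):
    """Toy stand-in for a scattering model: concentric fringes centred at (cx, cy)."""
    r2 = (x - cx) ** 2 + (y - cy) ** 2
    return 1.0 + 0.5 * math.cos(k * r2)


def make_hologram(size, cx, cy, noise, rng):
    """Synthetic noisy 'hologram' stored as {(x, y): intensity}."""
    return {(x, y): model(x, y, cx, cy) + rng.gauss(0.0, noise)
            for x in range(size) for y in range(size)}


def fit_center(pixels, hologram, candidates):
    """Least-squares grid search that only looks at the given pixels."""
    def cost(c):
        return sum((hologram[p] - model(p[0], p[1], c[0], c[1])) ** 2
                   for p in pixels)
    return min(candidates, key=cost)


rng = random.Random(0)
true_center = (30.0, 34.0)
hologram = make_hologram(64, *true_center, noise=0.05, rng=rng)

all_pixels = list(hologram)
subset = rng.sample(all_pixels, len(all_pixels) // 20)  # random 5% of the pixels

# Candidate centres on a 0.5-pixel grid around the true position.
candidates = [(cx / 2, cy / 2) for cx in range(55, 66) for cy in range(63, 74)]

best_subset = fit_center(subset, hologram, candidates)   # about 20x fewer residuals
best_full = fit_center(all_pixels, hologram, candidates)
print(best_subset, best_full)
```

Because the cost of each model evaluation scales with the number of pixels used, the subset fit does roughly one twentieth of the work of the full-frame fit in this sketch.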
Magnetic orientation of nontronite clay in aqueous dispersions and its effect on water diffusion.
Abrahamsson, Christoffer; Nordstierna, Lars; Nordin, Matias; Dvinskikh, Sergey V; Nydén, Magnus
2015-01-01
The diffusion rate of water in dilute clay dispersions depends on particle concentration, size, shape, aggregation and water-particle interactions. As nontronite clay particles magnetically align parallel to the magnetic field, directional self-diffusion anisotropy can be created within such dispersion. Here we study water diffusion in exfoliated nontronite clay dispersions by diffusion NMR and time-dependent 1H-NMR-imaging profiles. The dispersion clay concentration was varied between 0.3 and 0.7 vol%. After magnetic alignment of the clay particles in these dispersions a maximum difference of 20% was measured between the parallel and perpendicular self-diffusion coefficients in the dispersion with 0.7 vol% clay. A method was developed to measure water diffusion within the dispersion in the absence of a magnetic field (random clay orientation) as this is not possible with standard diffusion NMR. However, no significant difference in self-diffusion coefficient between random and aligned dispersions could be observed.
Adam, Asrul; Mohd Tumari, Mohd Zaidi; Mohamad, Mohd Saberi
2014-01-01
Electroencephalogram (EEG) signal peak detection is widely used in clinical applications. The peak point can be detected using several approaches, including time, frequency, time-frequency, and nonlinear domains depending on various peak features from several models. However, no study has established the importance of each peak feature in contributing to a good and generalized model. In this study, feature selection and classifier parameter estimation based on particle swarm optimization (PSO) are proposed as a framework for peak detection on EEG signals in time domain analysis. Two versions of PSO are used in the study: (1) standard PSO and (2) random asynchronous particle swarm optimization (RA-PSO). The proposed framework tries to find the best combination of all the available features that offers good peak detection and a high classification rate from the results in the conducted experiments. The evaluation results indicate that the accuracy of the peak detection can be improved up to 99.90% and 98.59% for training and testing, respectively, as compared to the framework without feature selection adaptation. Additionally, the proposed framework based on RA-PSO offers a better and more reliable classification rate as compared to standard PSO, as it produces a low-variance model. PMID:25243236
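The difference between the two PSO variants named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `pso`, the `sphere` test objective, and the parameters `w`, `c1`, `c2` are hypothetical, and "random asynchronous" is read here as picking particles at random (with replacement) and letting each one see the newest global best immediately.

```python
import random


def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def pso(f, dim=2, n=10, iters=200, mode="sync", seed=0, w=0.7, c1=1.5, c2=1.5):
    """Minimise f with a basic global-best PSO; returns (best_position, best_value)."""
    if mode not in ("sync", "async"):
        raise ValueError("mode must be 'sync' or 'async'")
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_f = [f(p) for p in pos]
    g = min(range(n), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]

    def move(i):
        # Velocity and position update using the current global best.
        for d in range(dim):
            r1, r2 = rng.random(), rng.random()
            vel[i][d] = (w * vel[i][d]
                         + c1 * r1 * (pbest[i][d] - pos[i][d])
                         + c2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]

    def record(i):
        # Evaluate particle i and update personal and global bests.
        nonlocal gbest, gbest_f
        fi = f(pos[i])
        if fi < pbest_f[i]:
            pbest[i], pbest_f[i] = pos[i][:], fi
        if fi < gbest_f:
            gbest, gbest_f = pos[i][:], fi

    for _ in range(iters):
        if mode == "sync":
            # Standard PSO: every particle moves with the same global best,
            # and the bests are refreshed only after the whole swarm has moved.
            for i in range(n):
                move(i)
            for i in range(n):
                record(i)
        else:
            # Random asynchronous: a randomly chosen particle moves and is
            # evaluated at once, so later updates already see its result.
            for _ in range(n):
                i = rng.randrange(n)
                move(i)
                record(i)
    return gbest, gbest_f


print(pso(sphere, mode="sync")[1], pso(sphere, mode="async")[1])
```

In the asynchronous branch some particles may be updated several times per iteration and others not at all, which is one source of the different search behaviour.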
Funahashi, Junichiro; Tanaka, Hiromitsu; Hirano, Tomoo
2018-01-01
Fast repetitive synaptic transmission depends on efficient exocytosis and retrieval of synaptic vesicles around a presynaptic active zone. However, the functional organization of an active zone and regulatory mechanisms of exocytosis, endocytosis and reconstruction of release-competent synaptic vesicles have not been fully elucidated. By developing a novel visualization method, we attempted to identify the location of exocytosis of a single synaptic vesicle within an active zone and examined movement of synaptic vesicle protein synaptophysin (Syp) after exocytosis. Using cultured hippocampal neurons, we induced formation of active-zone-like membranes (AZLMs) directly adjacent and parallel to a glass surface coated with neuroligin, and imaged Syp fused to super-ecliptic pHluorin (Syp-SEP) after its translocation to the plasma membrane from a synaptic vesicle using total internal reflection fluorescence microscopy (TIRFM). An AZLM showed characteristic molecular and functional properties of a presynaptic active zone. It contained active zone proteins, cytomatrix at the active zone-associated structural protein (CAST), Bassoon, Piccolo, Munc13 and RIM, and showed an increase in intracellular Ca²⁺ concentration upon electrical stimulation. In addition, single-pulse stimulation sometimes induced a transient increase of Syp-SEP signal followed by lateral spread in an AZLM, which was considered to reflect an exocytosis event of a single synaptic vesicle. The diffusion coefficient of Syp-SEP on the presynaptic plasma membrane after the membrane fusion was estimated to be 0.17-0.19 μm²/s, suggesting that Syp-SEP diffused without significant obstruction. Synchronous exocytosis just after the electrical stimulation tended to occur at multiple restricted sites within an AZLM, whereas locations of asynchronous release occurring later after the stimulation tended to be more scattered.
Near real-time digital holographic microscope based on GPU parallel computing
NASA Astrophysics Data System (ADS)
Zhu, Gang; Zhao, Zhixiong; Wang, Huarui; Yang, Yan
2018-01-01
A transmission near real-time digital holographic microscope with in-line and off-axis light path is presented, in which parallel computing technology based on the compute unified device architecture (CUDA) is combined with digital holographic microscopy. Compared to other holographic microscopes, which have to implement reconstruction in multiple focal planes and are time-consuming, the reconstruction speed of the near real-time digital holographic microscope can be greatly improved with the parallel computing technology based on CUDA, so it is especially suitable for measurements of particle fields at the micrometer and nanometer scale. Simulations and experiments show that the proposed transmission digital holographic microscope can accurately measure and display the velocity of a particle field at the micrometer scale, and the average velocity error is lower than 10%. With graphics processing units (GPUs), the computing time for 100 reconstruction planes (512×512 grids) is lower than 120 ms, while it is 4.9 s using the traditional CPU reconstruction method. The reconstruction speed is thus raised by a factor of about 40. In other words, the system can handle holograms at 8.3 frames per second, and near real-time measurement and display of the particle velocity field are realized. Real-time three-dimensional reconstruction of the particle velocity field is expected to be achieved by further optimization of software and hardware. Keywords: digital holographic microscope,
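The speed-up and frame-rate figures quoted in the abstract above follow directly from the two reported timings. The short sketch below reproduces that arithmetic; the function names are hypothetical, and because the GPU time is reported only as an upper bound (under 120 ms), both results are lower bounds.

```python
def speedup(cpu_time_s, gpu_time_s):
    """Ratio of CPU to GPU processing time for the same workload."""
    return cpu_time_s / gpu_time_s


def frames_per_second(time_per_hologram_s):
    """Holograms processed per second at a given per-hologram time."""
    return 1.0 / time_per_hologram_s


cpu_time_s = 4.9     # CPU time for 100 reconstruction planes, from the abstract
gpu_time_s = 0.120   # GPU time for the same 100 planes (reported upper bound)

ratio = speedup(cpu_time_s, gpu_time_s)
rate = frames_per_second(gpu_time_s)
print(f"speed-up of at least {ratio:.1f}x, at least {rate:.1f} holograms per second")
```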
NASA Astrophysics Data System (ADS)
Cauchi, Marija; Assmann, R. W.; Bertarelli, A.; Carra, F.; Lari, L.; Rossi, A.; Mollicone, P.; Sammut, N.
2015-02-01
The correct functioning of a collimation system is crucial to safely and successfully operate high-energy particle accelerators, such as the Large Hadron Collider (LHC). However, the requirements to handle high-intensity beams can be demanding, and accident scenarios must be well studied in order to assess if the collimator design is robust against possible error scenarios. One of the catastrophic, though not very probable, accident scenarios identified within the LHC is an asynchronous beam dump. In this case, one (or more) of the 15 precharged kicker circuits fires out of time with the abort gap, spraying beam pulses onto LHC machine elements before the machine protection system can fire the remaining kicker circuits and bring the beam to the dump. If a proton bunch directly hits a collimator during such an event, severe beam-induced damage such as magnet quenches and other equipment damage might result, with consequent downtime for the machine. This study investigates a number of newly defined jaw error cases, which include angular misalignment errors of the collimator jaw. A numerical finite element method approach is presented in order to precisely evaluate the thermomechanical response of tertiary collimators to beam impact. We identify the most critical and interesting cases, and show that a tilt of the jaw can actually mitigate the effect of an asynchronous dump on the collimators. Relevant collimator damage limits are taken into account, with the aim to identify optimal operational conditions for the LHC.
Fischer, William H.
1984-04-24
A non-binding particle trap to outer sheath contact for use in gas insulated transmission lines having a corrugated outer conductor. The non-binding feature of the contact according to the teachings of the invention is accomplished by having a lever arm rotatably attached to a particle trap by a pivot support axis disposed parallel to the direction of travel of the inner conductor/insulator/particle trap assembly.
Long-time self-diffusion of charged spherical colloidal particles in parallel planar layers.
Contreras-Aburto, Claudio; Báez, César A; Méndez-Alcaraz, José M; Castañeda-Priego, Ramón
2014-06-28
The long-time self-diffusion coefficient, D(L), of charged spherical colloidal particles in parallel planar layers is studied by means of Brownian dynamics computer simulations and mode-coupling theory. All particles (regardless which layer they are located on) interact with each other via the screened Coulomb potential and there is no particle transfer between layers. As a result of the geometrical constraint on particle positions, the simulation results show that D(L) is strongly controlled by the separation between layers. On the basis of the so-called contraction of the description formalism [C. Contreras-Aburto, J. M. Méndez-Alcaraz, and R. Castañeda-Priego, J. Chem. Phys. 132, 174111 (2010)], the effective potential between particles in a layer (the so-called observed layer) is obtained from integrating out the degrees of freedom of particles in the remaining layers. We have shown in a previous work that the effective potential performs well in describing the static structure of the observed layer (loc. cit.). In this work, we find that the D(L) values determined from the simulations of the observed layer, where the particles interact via the effective potential, do not agree with the exact values of D(L). Our findings confirm that even when an effective potential can perform well in describing the static properties, there is no guarantee that it will correctly describe the dynamic properties of colloidal systems.
Liu, Zhou; Shum, Ho Cheung
2013-01-01
In this work, we demonstrate a robust and reliable approach to fabricate multi-compartment particles for cell co-culture studies. By taking advantage of the laminar flow within our microfluidic nozzle, multiple parallel streams of liquids flow towards the nozzle without significant mixing. Afterwards, the multiple parallel streams merge into a single stream, which is sprayed into air, forming monodisperse droplets under an electric field with a high field strength. The resultant multi-compartment droplets are subsequently cross-linked in a calcium chloride solution to form calcium alginate micro-particles with multiple compartments. Each compartment of the particles can be used for encapsulating different types of cells or biological cell factors. These hydrogel particles with cross-linked alginate chains show similarity in the physical and mechanical environment as the extracellular matrix of biological cells. Thus, the multi-compartment particles provide a promising platform for cell studies and co-culture of different cells. In our study, cells are encapsulated in the multi-compartment particles and the viability of cells is quantified using a fluorescence microscope after the cells are stained for a live/dead assay. The high cell viability after encapsulation indicates the cytocompatibility and feasibility of our technique. Our multi-compartment particles have great potential as a platform for studying cell-cell interactions as well as interactions of cells with extracellular factors. PMID:24404050
3D magnetospheric parallel hybrid multi-grid method applied to planet–plasma interactions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leclercq, L., E-mail: ludivine.leclercq@latmos.ipsl.fr; Modolo, R., E-mail: ronan.modolo@latmos.ipsl.fr; Leblanc, F.
2016-03-15
We present a new method to exploit multiple refinement levels within a 3D parallel hybrid model, developed to study planet–plasma interactions. This model is based on the hybrid formalism: ions are kinetically treated whereas electrons are considered as an inertia-less fluid. Generally, ions are represented by numerical particles whose size equals the volume of the cells. Particles that leave a coarse grid subsequently entering a refined region are split into particles whose volume corresponds to the volume of the refined cells. The number of refined particles created from a coarse particle depends on the grid refinement rate. In order to conserve velocity distribution functions and to avoid calculations of average velocities, particles are not coalesced. Moreover, to ensure the constancy of particles' shape function sizes, the hybrid method is adapted to allow refined particles to move within a coarse region. Another innovation of this approach is the method developed to compute grid moments at interfaces between two refinement levels. Indeed, the hybrid method is adapted to accurately account for the special grid structure at the interfaces, avoiding any overlapping grid considerations. Some fundamental test runs were performed to validate our approach (e.g. quiet plasma flow, Alfven wave propagation). Lastly, we also show a planetary application of the model, simulating the interaction between Jupiter's moon Ganymede and the Jovian plasma.
NASA Astrophysics Data System (ADS)
Sandalski, Stou
Smooth particle hydrodynamics is an efficient method for modeling the dynamics of fluids. It is commonly used to simulate astrophysical processes such as binary mergers. We present a newly developed GPU accelerated smooth particle hydrodynamics code for astrophysical simulations. The code is named
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dooley, James H; Lanning, David N
Comminution process of wood veneer to produce wood particles, by feeding wood veneer in a direction of travel substantially normal to grain through a counter rotating pair of intermeshing arrays of cutting discs arrayed axially perpendicular to the direction of veneer travel, wherein the cutting discs have a uniform thickness (Td), to produce wood particles characterized by a length dimension (L) substantially equal to the Td and aligned substantially parallel to grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) substantially equal to the veneer thickness (Tv) and aligned normal to W and L, wherein the W × H dimensions define a pair of substantially parallel end surfaces with end checking between crosscut fibers.
Zhou, Ruhong
2004-05-01
A highly parallel replica exchange method (REM) that couples with a newly developed molecular dynamics algorithm particle-particle particle-mesh Ewald (P3ME)/RESPA has been proposed for efficient sampling of protein folding free energy landscape. The algorithm is then applied to two separate protein systems, beta-hairpin and a designed protein Trp-cage. The all-atom OPLSAA force field with an explicit solvent model is used for both protein folding simulations. Up to 64 replicas of solvated protein systems are simulated in parallel over a wide range of temperatures. The combined trajectories in temperature and configurational space allow a replica to overcome free energy barriers present at low temperatures. These large scale simulations reveal detailed results on folding mechanisms, intermediate state structures, thermodynamic properties and the temperature dependences for both protein systems.
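The exchange step at the heart of a temperature replica exchange method can be sketched as below. This is a minimal illustration using the standard Metropolis swap criterion, min(1, exp[(β_i − β_j)(E_i − E_j)]) with β = 1/(k_B T); the abstract does not spell out its exchange rule, so the criterion, the function names, the alternating-neighbour scheme and the example temperatures and energies are all assumptions.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def swap_probability(t_i, t_j, e_i, e_j):
    """Metropolis acceptance probability for exchanging two replicas.

    t_i, t_j are temperatures in K; e_i, e_j are the potential energies
    (kcal/mol) of the configurations currently held at those temperatures.
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swaps(temps, energies, owner, rng, offset=0):
    """Try exchanges between neighbouring temperatures (k, k + 1).

    owner[k] is the index of the replica currently at temps[k]; accepted
    swaps exchange both the owners and the energies in place.
    """
    for k in range(offset, len(temps) - 1, 2):
        p = swap_probability(temps[k], temps[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            owner[k], owner[k + 1] = owner[k + 1], owner[k]
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return owner


rng = random.Random(1)
temps = [300.0, 310.0, 320.0, 330.0]       # illustrative temperature ladder
energies = [-98.0, -101.0, -95.0, -97.0]   # illustrative energies
owner = attempt_swaps(temps, energies, list(range(4)), rng, offset=0)
print(owner)
```

Alternating `offset` between 0 and 1 on successive exchange attempts lets every neighbouring pair be tried, so a replica can random-walk across the whole temperature range.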
The Necessity of Real-Time: Fact and Fiction in Digital Reference Systems.
ERIC Educational Resources Information Center
Lankes, R. David; Shostack, Pauline
2002-01-01
Discussion of digital reference services and the use of real-time versus asynchronous services such as email focuses on data from the AskERIC digital reference service to demonstrate that asynchronous services are not only useful but may have greater utility than real-time systems. (Author/LRW)
ERIC Educational Resources Information Center
Wu, Zhiwei
2018-01-01
Framed from positioning theory and dynamic systems theory, the paper reports on a naturalistic study involving four Chinese participants and their American peers in an intercultural asynchronous computer-mediated communication (ACMC) activity. Based on the moment-by-moment analysis and triangulation of forum posts, reflective essays, and…
Automated Feedback as a Convergence Tool
ERIC Educational Resources Information Center
Chenoweth, Tim; Corral, Karen; Scott, Kit
2016-01-01
This study evaluates two content delivery options for teaching a programming language to determine whether an asynchronous format can achieve the same learning efficacy as a traditional lecture (face-to-face) format. We use media synchronicity theory as a guide to choose media capabilities to incorporate into an asynchronous tutorial used…
Designing a Web-Based Asynchronous Innovation/Entrepreneurism Course
ERIC Educational Resources Information Center
Ghandforoush, Parviz
2017-01-01
Teaching an online fully asynchronous information technology course that requires students to ideate, build an e-commerce website, and develop an effective business plan involves a well-developed and highly engaging course design. This paper describes the design, development, and implementation of such a course and presents information on…
Asynchronous Assessment in a Large Lecture Marketing Course
ERIC Educational Resources Information Center
Downey, W. Scott; Schetzsle, Stacey
2012-01-01
Asynchronous assessment, which includes quizzes or exams online or outside class, offers marketing educators an opportunity to make more efficient use of class time and to enhance students' learning experiences by giving them more flexibility and choice in their assessment environment. In this paper, we examine the performance difference between…
Principles for Effective Asynchronous Online Instruction in Religious Studies
ERIC Educational Resources Information Center
McGuire, Beverley
2017-01-01
Asynchronous online instruction has become increasingly popular in the field of religious studies. However, despite voluminous research on online learning in general and numerous articles on online theological instruction, there has been little discussion of how to effectively design and deliver online undergraduate courses in religious studies.…
Investigating Asynchronous Online Communication: A Connected Stance Revealed
ERIC Educational Resources Information Center
Wegmann, Susan J.; McCauley, Joyce K.
2014-01-01
This research project explores the effects of altering the structure of discussion board formats to increase students' engagement and participation. This paper will present the findings of a two-university, two-class research project in which asynchronous discussion board entries were analyzed for substance. By using oral discourse analysis…
ERIC Educational Resources Information Center
Majeski, Robin; Stover, Merrily
2007-01-01
Online learning has enjoyed increasing popularity in gerontology. This paper presents instructional strategies grounded in Fink's (2003) theory of significant learning designed for the completely asynchronous online gerontology classroom. It links these components with the development of mastery learning goals and provides specific guidelines for…
Developing a Successful Asynchronous Online Extension Program for Forest Landowners
ERIC Educational Resources Information Center
Zobrist, Kevin W.
2014-01-01
Asynchronous online Extension classes can reach a wide audience, are convenient for the learner, and minimize ongoing demands on instructor time. However, producing such classes takes significant effort up front. Advance planning and good communication with contributors are essential to success. Considerations include delivery platforms, content…
Cultural Influences on Chinese Students' Asynchronous Online Learning in a Canadian University
ERIC Educational Resources Information Center
Zhao, Naxin; McDougall, Douglas
2008-01-01
This study explored six Chinese graduate students' asynchronous online learning in a large urban Canadian university. Individual interviews in Mandarin elicited their perceptions of online learning, their participation in it, and the cultural factors that influenced their experiences. In general, the participants had a positive attitude towards…
Language Use in Asynchronous Computer-Mediated Communication in Taiwan
ERIC Educational Resources Information Center
Huang, Daphne Li-jung
2009-01-01
This paper describes how Chinese-English bilinguals in Taiwan use their languages in asynchronous computer-mediated communication, specifically, via Bulletin Board System (BBS) and email. The main data includes two types: emails collected from a social network and postings collected from two BBS websites. By examining patterns of language choice…
Students' Use of Asynchronous Discussions for Academic Discourse Socialization
ERIC Educational Resources Information Center
Beckett, Gulbahar H.; Amaro-Jimenez, Carla; Beckett, Kelvin S.
2010-01-01
Our universities are becoming increasingly diverse at the same time as online asynchronous discussions (OADs) are emerging as the most important forum for computer mediated communication (CMC) in distance education. But there is a shortage of studies that explore how graduate students from different ethnic, linguistic and cultural backgrounds use…
ERIC Educational Resources Information Center
Saltarelli, Andrew John
2012-01-01
Previous research suggests asynchronous online computer-mediated communication (CMC) has deleterious effects on certain cooperative learning pedagogies (e.g., constructive controversy), but the processes underlying this effect and how it may be ameliorated remain unclear. This study tests whether asynchronous CMC thwarts belongingness needs…
Synchronous versus Asynchronous CMC and Transfer to Japanese Oral Performance
ERIC Educational Resources Information Center
Hirotani, Maki
2009-01-01
This study investigated the effects of synchronous and asynchronous CMC (computer-mediated communication) on the development of linguistic features of learners' speech in Japanese. Using learners from fourth-semester Japanese classes, the following research questions were examined: (a) Does CMC have positive effects on the development of oral…
A Group Intelligence-Based Asynchronous Argumentation Learning-Assistance Platform
ERIC Educational Resources Information Center
Huang, Chenn-Jung; Chang, Shun-Chih; Chen, Heng-Ming; Tseng, Jhe-Hao; Chien, Sheng-Yuan
2016-01-01
Structured argumentation support environments have been built and used in scientific discourse in the literature. However, to the best of our knowledge, there is no research work in the literature examining whether students' knowledge has grown during learning activities with asynchronous argumentation. In this work, an intelligent computer-supported…
ERIC Educational Resources Information Center
Zhao, Huahui; Sullivan, Kirk P. H.; Mellenius, Ingmarie
2014-01-01
A key reason for using asynchronous computer conferencing in instruction is its potential for supporting collaborative learning. However, few studies have examined collaboration in computer conferencing. This study examined collaboration in six peer review groups within an asynchronous computer conference. Eighteen tertiary students participated…
Asynchronous Group Review of EFL Writing: Interactions and Text Revisions
ERIC Educational Resources Information Center
Saeed, Murad Abdu; Ghazali, Kamila
2017-01-01
The current paper reports an empirical study of asynchronous online group review of argumentative essays among nine English as foreign language (EFL) Arab university learners joining English in their first, second, and third years at the institution. In investigating online interactions, commenting patterns, and how the students facilitate text…
The Role of Beliefs and Motivation in Asynchronous Online Learning in College-Level Classes
ERIC Educational Resources Information Center
Xie, Kui; Huang, Kun
2014-01-01
Epistemic and learning beliefs were found to affect college students' cognitive engagement and study strategies, as well as motivation in classroom settings. However, the relationships between epistemic and learning beliefs, motivation, learning perception, and students' actual learning participation in asynchronous online settings have been…
A Coding Scheme to Analyse the Online Asynchronous Discussion Forums of University Students
ERIC Educational Resources Information Center
Biasutti, Michele
2017-01-01
The current study describes the development of a content analysis coding scheme to examine transcripts of online asynchronous discussion groups in higher education. The theoretical framework comprises the theories regarding knowledge construction in computer-supported collaborative learning (CSCL) based on a sociocultural perspective. The coding…
Relationship of Metacognitive Monitoring with Interaction in an Asynchronous Online Discussion Forum
ERIC Educational Resources Information Center
Topcu, Abdullah
2010-01-01
Monitoring one's own performance accurately is essential for information-processing and self-regulation, which are indispensable in an online learning environment. In this article, the effect of metacognitive monitoring (MM) on interaction in an asynchronous online discussion forum was investigated. Transcripts of this forum, which was integrated…
ERIC Educational Resources Information Center
Arnold, Nike; Ducate, Lara; Lomicka, Lara; Lord, Gillian
2005-01-01
This article examines social presence in virtual asynchronous learning communities among foreign language teachers. We present the findings of two studies investigating cross-institutional asynchronous forums created to engage participants in online dialogues regarding their foreign language teacher preparation experiences in and out of the…
ERIC Educational Resources Information Center
Duncan, Keith; Kenworthy, Amy; McNamara, Ray
2012-01-01
This article examines the relationship between MBA students' performance and participation in two online environments: a synchronous forum (chat room) and an asynchronous forum (discussion board) at an Australian university. The "quality" and "quantity" of students' participation is used to predict their final examination and…
Adding the Human Touch to Asynchronous Online Learning
ERIC Educational Resources Information Center
Glenn, Cynthia Wheatley
2018-01-01
For learners to actively accept responsibility in a virtual classroom platform, it is necessary to provide special motivation extending across the traditional classroom setting into asynchronous online learning. This article explores specific ways to do this that bridge the gap between ground and online students' learning experiences, and how…
Increasing Student Engagement Using Asynchronous Learning
ERIC Educational Resources Information Center
Northey, Gavin; Bucic, Tania; Chylinski, Mathew; Govind, Rahul
2015-01-01
Student engagement is an ongoing concern for educators because of its positive association with deep learning and educational outcomes. This article tests the use of a social networking site (Facebook) as a tool to facilitate asynchronous learning opportunities that complement face-to-face interactions and thereby enable a stronger learning…
NASA Astrophysics Data System (ADS)
Qiang, Ji
2017-10-01
A three-dimensional (3D) Poisson solver with longitudinal periodic and transverse open boundary conditions can have important applications in beam physics of particle accelerators. In this paper, we present a fast, efficient method to solve the Poisson equation using a spectral finite-difference method. This method uses a computational domain that contains the charged particle beam only and has a computational complexity of O(N_u log N_mode), where N_u is the total number of unknowns and N_mode is the maximum number of longitudinal or azimuthal modes. This saves both computational time and memory relative to using an artificial boundary condition in a large extended computational domain. The new 3D Poisson solver is parallelized using a message passing interface (MPI) on multi-processor computers and shows a reasonable parallel performance up to hundreds of processor cores.
Limits on the Efficiency of Event-Based Algorithms for Monte Carlo Neutron Transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romano, Paul K.; Siegel, Andrew R.
The traditional form of parallelism in Monte Carlo particle transport simulations, wherein each individual particle history is considered a unit of work, does not lend itself well to data-level parallelism. Event-based algorithms, which were originally used for simulations on vector processors, may offer a path toward better utilizing data-level parallelism in modern computer architectures. In this study, a simple model is developed for estimating the efficiency of the event-based particle transport algorithm under two sets of assumptions. Data collected from simulations of four reactor problems using OpenMC was then used in conjunction with the models to calculate the speedup due to vectorization as a function of two parameters: the size of the particle bank and the vector width. When each event type is assumed to have constant execution time, the achievable speedup is directly related to the particle bank size. We observed that the bank size generally needs to be at least 20 times greater than the vector size in order to achieve vector efficiency greater than 90%. When the execution times for events are allowed to vary, however, the vector speedup is also limited by differences in execution time for events being carried out in a single event-iteration. For some problems, this implies that vector efficiencies over 50% may not be attainable. While there are many factors impacting performance of an event-based algorithm that are not captured by our model, it nevertheless provides insights into factors that may be limiting in a real implementation.
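The dependence of vector efficiency on bank size and vector width described above can be illustrated with a toy calculation. This is a minimal sketch and not the model from the study: it assumes a single event type with constant cost, a hypothetical fixed `survival` fraction of particles remaining after each event-iteration, and idle lanes only in the last, partially filled vector of each iteration. The function name and parameters are illustrative.

```python
import math


def vector_efficiency(bank_size, vector_width, survival=0.9):
    """Toy estimate of vector efficiency for an event-based loop.

    Assumptions (hypothetical, not taken from the source): one event type
    with constant execution time, and a fixed fraction `survival` of the
    particles remains after each event-iteration.
    """
    if bank_size < 1 or vector_width < 1:
        raise ValueError("bank_size and vector_width must be positive")
    if not 0.0 <= survival < 1.0:
        raise ValueError("survival must be in [0, 1)")

    useful_lanes = 0   # lanes that process a live particle
    issued_lanes = 0   # all lanes issued, including idle ones
    n = bank_size
    while n >= 1:
        useful_lanes += n
        # The last vector of an iteration is only partially filled.
        issued_lanes += math.ceil(n / vector_width) * vector_width
        n = int(n * survival)  # particles left for the next iteration
    return useful_lanes / issued_lanes


if __name__ == "__main__":
    width = 8
    for multiple in (1, 5, 20, 100):
        eff = vector_efficiency(multiple * width, width)
        print(f"bank = {multiple:3d} x width: efficiency = {eff:.3f}")
```

In this sketch the efficiency rises toward 1 as the bank grows relative to the vector width, because the idle lanes in the final partial vector become a smaller share of the work. The thresholds it produces depend entirely on the assumed `survival` value and should not be read as reproducing the figures quoted in the abstract.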
Particle-in-cell simulations on graphic processing units
NASA Astrophysics Data System (ADS)
Ren, C.; Zhou, X.; Li, J.; Huang, M. C.; Zhao, Y.
2014-10-01
We will show our recent progress in using GPUs to accelerate the PIC code OSIRIS [Fonseca et al. LNCS 2331, 342 (2002)]. The OSIRIS parallel structure is retained and the computation-intensive kernels are shipped to GPUs. Algorithms for the kernels are adapted for the GPU, including high-order charge-conserving current deposition schemes with little branching and parallel particle sorting [Kong et al., JCP 230, 1676 (2011)]. These algorithms make efficient use of the GPU shared memory. This work was supported by U.S. Department of Energy under Grant No. DE-FC02-04ER54789 and by NSF under Grant No. PHY-1314734.
Protons and alpha particles in the expanding solar wind: Hybrid simulations
NASA Astrophysics Data System (ADS)
Hellinger, Petr; Trávníček, Pavel M.
2013-09-01
We present results of a two-dimensional hybrid expanding box simulation of a plasma system with three ion populations, beam and core protons, and alpha particles (and fluid electrons), drifting with respect to each other. The expansion with a strictly radial magnetic field leads to a decrease of the ion perpendicular to parallel temperature ratios as well as to an increase of the ratio between the ion relative velocities and the local Alfvén velocity creating a free energy for many different instabilities. The system is most of the time marginally stable with respect to kinetic instabilities mainly due to the ion relative velocities; these instabilities determine the system evolution counteracting some effects of the expansion. Nonlinear evolution of these instabilities leads to large modifications of the ion velocity distribution functions. The beam protons and alpha particles are decelerated with respect to the core protons and all the populations are cooled in the parallel direction and heated in the perpendicular one. On the macroscopic level, the kinetic instabilities cause large departures of the system evolution from the double adiabatic prediction and lead to perpendicular heating and parallel cooling rates which are comparable to the heating rates estimated from the Helios observations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nelson, Andrew F.; Wetzstein, M.; Naab, T.
2009-10-01
We continue our presentation of VINE. In this paper, we begin with a description of relevant architectural properties of the serial and shared memory parallel computers on which VINE is intended to run, and describe their influences on the design of the code itself. We continue with a detailed description of a number of optimizations made to the layout of the particle data in memory and to our implementation of a binary tree used to access that data for use in gravitational force calculations and searches for smoothed particle hydrodynamics (SPH) neighbor particles. We describe the modifications to the code necessary to obtain forces efficiently from special purpose 'GRAPE' hardware, the interfaces required to allow transparent substitution of those forces in the code instead of those obtained from the tree, and the modifications necessary to use both tree and GRAPE together as a fused GRAPE/tree combination. We conclude with an extensive series of performance tests, which demonstrate that the code can be run efficiently and without modification in serial on small workstations or in parallel using the OpenMP compiler directives on large-scale, shared memory parallel machines. We analyze the effects of the code optimizations and estimate that they improve its overall performance by more than an order of magnitude over that obtained by many other tree codes. Scaled parallel performance of the gravity and SPH calculations, together the most costly components of most simulations, is nearly linear up to at least 120 processors on moderate sized test problems using the Origin 3000 architecture, and to the maximum machine sizes available to us on several other architectures. At similar accuracy, performance of VINE, used in GRAPE-tree mode, is approximately a factor of 2 slower than that of VINE, used in host-only mode. Further optimizations of the GRAPE/host communications could improve the speed by as much as a factor of 3, but have not yet been implemented in VINE. Finally, we find that although parallel performance on small problems may reach a plateau beyond which more processors bring no additional speedup, performance never decreases, a factor important for running large simulations on many processors with individual time steps, where only a small fraction of the total particles require updates at any given moment.
NASA Astrophysics Data System (ADS)
Loucaides, N. G.; Georghiou, G. E.; Charalambous, C. D.
2007-04-01
The dielectrophoretic concentration of DNA particles suspended in a solution is investigated in a system of parallel electrodes, where the particles are attracted to the edges of the electrodes by positive dielectrophoresis. The AC electroosmotic motion of the fluid is also considered, as well as the diffusion of the particles, using the solution of the Smoluchowski equation. The results examine the effect of AC electroosmosis in steady state dielectrophoretic concentration of particles, by demonstrating that AC electroosmosis significantly reduces the dielectrophoretic concentration at the edges and moves the particles towards the electrode centres.
NASA Astrophysics Data System (ADS)
Sun, Jicheng; Gao, Xinliang; Lu, Quanming; Chen, Lunjin; Liu, Xu; Wang, Xueyi; Tao, Xin; Wang, Shui
2017-05-01
In this paper, we use a 1-D particle-in-cell (PIC) simulation model consisting of three species (cold electrons, cold ions, and an energetic ion ring) to investigate spectral structures of magnetosonic waves excited by ring distribution protons in the Earth's magnetosphere, and dynamics of charged particles during the excitation of magnetosonic waves. As the wave normal angle decreases, the spectral range of excited magnetosonic waves becomes broader with upper frequency limit extending beyond the lower hybrid resonant frequency, and the discrete spectra tend to merge into a continuous one. This dependence on wave normal angle is consistent with the linear theory. The effects of magnetosonic waves on the background cold plasma populations also vary with wave normal angle. For exactly perpendicular magnetosonic waves (parallel wave number k|| = 0), there is no energization in the parallel direction for both background cold protons and electrons due to the negligible fluctuating electric field component in the parallel direction. In contrast, the perpendicular energization of background plasmas is rather significant, where cold protons follow unmagnetized motion while cold electrons follow drift motion due to wave electric fields. For magnetosonic waves with a finite k||, there exists a nonnegligible parallel fluctuating electric field, leading to a significant and rapid energization in the parallel direction for cold electrons. These cold electrons can also be efficiently energized in the perpendicular direction due to the interaction with the magnetosonic wave fields in the perpendicular direction. However, cold protons can only be heated in the perpendicular direction, which is likely caused by the higher-order resonances with magnetosonic waves. The potential impacts of magnetosonic waves on the energization of the background cold plasmas in the Earth's inner magnetosphere are also discussed in this paper.
NASA Astrophysics Data System (ADS)
Laitinen, Timo; Effenberger, Frederic; Kopp, Andreas; Dalla, Silvia
2018-02-01
Insights into the processes of Solar Energetic Particle (SEP) propagation are essential for understanding how solar eruptions affect the radiation environment of near-Earth space. SEP propagation is influenced by turbulent magnetic fields in the solar wind, resulting in stochastic transport of the particles from their acceleration site to Earth. While the conventional approach for SEP modelling focuses mainly on the transport of particles along the mean Parker spiral magnetic field, multi-spacecraft observations suggest that the cross-field propagation shapes the SEP fluxes at Earth strongly. However, adding cross-field transport of SEPs as spatial diffusion has been shown to be insufficient in modelling the SEP events without use of unrealistically large cross-field diffusion coefficients. Recently, Laitinen et al. [ApJL 773 (2013b); A&A 591 (2016)] demonstrated that the early-time propagation of energetic particles across the mean field direction in turbulent fields is not diffusive, with the particles propagating along meandering field lines. This early-time transport mode results in fast access of the particles across the mean field direction, in agreement with the SEP observations. In this work, we study the propagation of SEPs within the new transport paradigm, and demonstrate the significance of turbulence strength on the evolution of the SEP radiation environment near Earth. We calculate the transport parameters consistently using a turbulence transport model, parametrised by the SEP parallel scattering mean free path at 1 AU, λ∥*, and show that the parallel and cross-field transport are connected, with conditions resulting in slow parallel transport corresponding to wider events. We find a scaling σ_φ,max ∝ (1/λ∥*)^(1/4) for the Gaussian fitting of the longitudinal distribution of maximum intensities. The longitudes with highest intensities are shifted towards the west for strong scattering conditions. Our results emphasise the importance of understanding both the SEP transport and the interplanetary turbulence conditions for modelling and predicting the SEP radiation environment at Earth.
High-speed asynchronous optical sampling for high-sensitivity detection of coherent phonons
NASA Astrophysics Data System (ADS)
Dekorsy, T.; Taubert, R.; Hudert, F.; Schrenk, G.; Bartels, A.; Cerna, R.; Kotaidis, V.; Plech, A.; Köhler, K.; Schmitz, J.; Wagner, J.
2007-12-01
A new optical pump-probe technique is implemented for the investigation of coherent acoustic phonon dynamics in the GHz to THz frequency range which is based on two asynchronously linked femtosecond lasers. Asynchronous optical sampling (ASOPS) provides the performance of an all-optical oscilloscope and allows us to record optically induced lattice dynamics over nanosecond times with femtosecond resolution at scan rates of 10 kHz without any moving part in the set-up. Within 1 minute of data acquisition time, signal-to-noise ratios better than 10^7 are achieved. We present examples of the high-sensitivity detection of coherent phonons in superlattices and of the coherent acoustic vibration of metallic nanoparticles.
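The relation between scan rate, time window, and time step in asynchronous optical sampling can be sketched with a short calculation. It uses the standard timing argument for two pulse trains with slightly different repetition rates: the delay between pump and probe pulses grows by the difference of the two pulse periods on every shot. The 1 GHz repetition rate used in the example is an assumption chosen for illustration and is not stated in the abstract; the function name is hypothetical.

```python
def asops_parameters(rep_rate_hz, offset_hz):
    """Return (scan_rate_hz, time_window_s, time_step_s) for two lasers
    running at rep_rate_hz and rep_rate_hz + offset_hz.

    Illustrative timing relations only; the achievable resolution in a real
    set-up also depends on pulse duration, timing jitter, and detection
    bandwidth, none of which are modelled here.
    """
    if rep_rate_hz <= 0 or offset_hz <= 0:
        raise ValueError("rates must be positive")
    scan_rate = offset_hz                 # the full window is scanned once per beat period
    time_window = 1.0 / rep_rate_hz       # delay range covered by one scan
    # Difference of the two pulse periods: 1/f - 1/(f + df)
    time_step = offset_hz / (rep_rate_hz * (rep_rate_hz + offset_hz))
    return scan_rate, time_window, time_step


if __name__ == "__main__":
    # Assumed values: 1 GHz repetition rate (hypothetical), 10 kHz offset.
    scan, window, step = asops_parameters(1e9, 1e4)
    print(f"scan rate   : {scan:.0f} Hz")
    print(f"time window : {window * 1e9:.2f} ns")
    print(f"time step   : {step * 1e15:.2f} fs")
```

Under these assumed values the window is on the order of a nanosecond and the step is roughly 10 fs, which is consistent in scale with the nanosecond range and femtosecond resolution mentioned above.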
Dual stator winding variable speed asynchronous generator: optimal design and experiments
NASA Astrophysics Data System (ADS)
Tutelea, L. N.; Deaconu, S. I.; Popa, G. N.
2015-06-01
This paper presents a theoretical and experimental study of the behavior of a dual stator winding squirrel cage asynchronous generator (DSWA) in the saturated (non-sinusoidal) regime that arises from variable speed operation. The main aim is to determine the relations for calculating the equivalent parameters of the machine windings, for use in optimal design with a Matlab code. The study is limited to three-phase double stator winding cage-induction generators of small power ratings, the type most commonly used in small adjustable-speed wind or hydro power plants. The tests were carried out using a three-phase asynchronous generator with a rated power of 6 kVA.
Emotional first aid for a suicide crisis: comparison between Telephonic hotline and internet.
Gilat, Itzhak; Shahar, Golan
2007-01-01
The telephone and the internet have become popular sources of psychological help in various types of distress, including a suicide crisis. To gain more insight into the unique features of these media, we compared characteristics of calls to three technologically mediated sources of help that are part of the volunteer-based Israeli Association for Emotional First Aid (ERAN): Telephonic hotline (n = 4426), personal chat (n = 373) and an asynchronous online support group (n = 954). Threats of suicide were much more frequent among participants in the asynchronous support group than the telephone and personal chat. These findings encourage further research into suicide-related interpersonal exchanges in asynchronous online support groups.
Asynchronous emergence by loggerhead turtle (Caretta caretta) hatchlings
NASA Astrophysics Data System (ADS)
Houghton, J. D. R.; Hays, G. C.
2001-03-01
For many decades it has been accepted that marine turtle hatchlings from the same nest generally emerge from the sand together. However, for loggerhead turtles (Caretta caretta) nesting on the Greek Island of Kefalonia, a more asynchronous pattern of emergence has been documented. By placing temperature loggers at the top and bottom of nests laid on Kefalonia during 1998, we examined whether this asynchronous emergence was related to the thermal conditions within nests. Pronounced thermal variation existed not only between, but also within, individual nests. These within-nest temperature differences were related to the patterns of hatchling emergence, with hatchlings from nests displaying large thermal ranges emerging over a longer time-scale than those characterised by more uniform temperatures.
Asynchronous, macrotasked relaxation strategies for the solution of viscous, hypersonic flows
NASA Technical Reports Server (NTRS)
Gnoffo, Peter A.
1991-01-01
A point-implicit, asynchronous macrotasked relaxation of the steady, thin-layer, Navier-Stokes equations is presented. The method employs multidirectional, single-level storage Gauss-Seidel relaxation sweeps, which effectively communicate perturbations across the entire domain in 2n sweeps, where n is the dimension of the domain. In order to enhance convergence the application of relaxation factors to specific components of the Jacobian is examined using a stability analysis of the advection and diffusion equations. Attention is also given to the complications associated with asynchronous multitasking. Solutions are generated for hypersonic flows over blunt bodies in two and three dimensions with chemical reactions, utilizing single-tasked and multitasked relaxation strategies.
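The idea of multidirectional sweeps with single-level storage can be illustrated on a much simpler problem than the one in the abstract. The sketch below is an assumption-laden toy: it applies in-place Gauss-Seidel sweeps to the 1-D Laplace equation rather than the thin-layer Navier-Stokes equations, and the function names are hypothetical. With n = 1, one left-to-right and one right-to-left sweep (2n = 2 sweeps) are enough for a perturbation at either boundary to influence every interior point.

```python
def sweep(u, indices):
    """One in-place (single-level storage) Gauss-Seidel sweep for the
    1-D Laplace equation, visiting interior points in the given order."""
    for i in indices:
        u[i] = 0.5 * (u[i - 1] + u[i + 1])


def two_direction_pass(u):
    """Sweep left-to-right, then right-to-left, updating u in place."""
    interior = range(1, len(u) - 1)
    sweep(u, interior)
    sweep(u, reversed(interior))
    return u


if __name__ == "__main__":
    # Perturbation at the right boundary only.
    u = [0.0] * 9 + [1.0]
    sweep(u, range(1, len(u) - 1))
    print("after forward sweep :", u)   # only the point next to the boundary changes
    sweep(u, reversed(range(1, len(u) - 1)))
    print("after backward sweep:", u)   # every interior point is now nonzero
```

Because each update overwrites the value in place, a sweep carries information only in its own direction of travel, which is why sweeps in both directions along each axis are needed.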
Parallel and Portable Monte Carlo Particle Transport
NASA Astrophysics Data System (ADS)
Lee, S. R.; Cummings, J. C.; Nolen, S. D.; Keen, N. D.
1997-08-01
We have developed a multi-group, Monte Carlo neutron transport code in C++ using object-oriented methods and the Parallel Object-Oriented Methods and Applications (POOMA) class library. This transport code, called MC++, currently computes k and α eigenvalues of the neutron transport equation on a rectilinear computational mesh. It is portable to and runs in parallel on a wide variety of platforms, including MPPs, clustered SMPs, and individual workstations. It contains appropriate classes and abstractions for particle transport and, through the use of POOMA, for portable parallelism. Current capabilities are discussed, along with physics and performance results for several test problems on a variety of hardware, including all three Accelerated Strategic Computing Initiative (ASCI) platforms. Current parallel performance indicates the ability to compute α-eigenvalues in seconds or minutes rather than days or weeks. Current and future work on the implementation of a general transport physics framework (TPF) is also described. This TPF employs modern C++ programming techniques to provide simplified user interfaces, generic STL-style programming, and compile-time performance optimization. Physics capabilities of the TPF will be extended to include continuous energy treatments, implicit Monte Carlo algorithms, and a variety of convergence acceleration techniques such as importance combing.
Flagellar coordination in Chlamydomonas cells held on micropipettes.
Rüffer, U; Nultsch, W
1998-01-01
The two flagella of Chlamydomonas are known to beat synchronously: During breaststroke beating they are generally coordinated in a bilateral way while in shock responses during undulatory beating coordination is mostly parallel [Rüffer and Nultsch, 1995: Botanica Acta 108:169-276]. Analysis of a great number of shock responses revealed that in undulatory beats also periods of bilateral coordination are found and that the coordination type may change several times during a shock response, without concomitant changes of the beat envelope and the beat period. In normal wt cells no coordination changes are found during breaststroke beating, but only short temporary asynchronies: During 2 or 3 normal beats of the cis flagellum, the trans flagellum performs 3 or 4 flat beats with a reduced beat envelope and a smaller beat period, resulting in one additional trans beat. Long periods with flat beats of the same shape and beat period are found in both flagella of the non-phototactic mutant ptx1 and in defective wt 622E cells. During these periods, the coordination is parallel, the two flagella beat alternately. A correlation between normal asynchronous trans beats and the parallel-coordinated beats in the presumably cis defective cells and also the undulatory beats is discussed. In the cis defective cells, a perpetual spontaneous change between parallel beats with small beat periods (higher beat frequency) and bilateral beats with greater beat periods (lower beat frequency) is observed and renders questionable the existence of two different intrinsic beat frequencies of the two flagella cis and trans. Asynchronies occur spontaneously but may also be induced by light changes, either step-up or step-down, but not by both stimuli in turn as breaststroke flagellar photoresponses (BFPRs). Asynchronies are not involved in phototaxis. They are independent of the BFPRs, which are supposed to be the basis of phototaxis. Both types of coordination must be assumed to be regulated internally, involving calcium-sensitive basal-body associated fibrous structures.
ERIC Educational Resources Information Center
Green, Rodney A.; Hughes, Diane L.
2013-01-01
Asynchronous online discussion forums are increasingly common in blended learning environments but the relationship to student learning outcomes has not been reported for anatomy teaching. Forums were monitored in two multicampus anatomy courses; an introductory first year course and a second year physiotherapy-specific course. The forums are…
At a Distance: A Comparative Study of Distance Delivery Modalities for PhD Nursing Students
ERIC Educational Resources Information Center
Black, Andrew G.
2010-01-01
This study sought to ascertain and compare the attitudes and perceptions of PhD nursing students attending their coursework through synchronous and asynchronous means at two different universities. Many studies have been performed comparing both synchronous videoconferencing and asynchronous online education with the traditional classroom, but no…
ERIC Educational Resources Information Center
Schroeder, Shawnda; Baker, Mary; Terras, Katherine; Mahar, Patti; Chiasson, Kari
2016-01-01
This study examined graduate students' desired and experienced levels of connectivity in an online, asynchronous distance degree program. Connectivity was conceptualized as the students' feelings of community and involvement, not their level of access to the Internet. Graduate students enrolled in a distance degree program were surveyed on both…
Three Interaction Patterns on Asynchronous Online Discussion Behaviours: A Methodological Comparison
ERIC Educational Resources Information Center
Jo, I.; Park, Y.; Lee, H.
2017-01-01
An asynchronous online discussion (AOD) is one format of instructional methods that facilitate student-centered learning. In the wealth of AOD research, this study evaluated how students' behavior on AOD influences their academic outcomes. This case study compared the differential analytic methods including web log mining, social network analysis…
Microbial infection affects egg viability and incubation behavior in a tropical passerine.
Mark I. Cook; Steven R. Beissinger; Gary A. Toranzos; Roberto A. Arendt Rodriguez
2004-01-01
Many avian species initiate incubation before clutch completion, which causes eggs to hatch asynchronously. This influences brood competitive dynamics and often results in nestling mortality. The prevailing hypotheses contend that parents incubate early because asynchronous hatching provides fitness benefits to parents or surviving offspring. An alternative idea is...
FIFO Buffer for Asynchronous Data Streams
NASA Technical Reports Server (NTRS)
Bascle, K. P.
1985-01-01
Variable-rate, asynchronous data signals from up to four measuring instruments or other sources combined in first-in/first-out (FIFO) buffer for transmission on single channel. Constructed in complementary metal-oxide-semiconductor (CMOS) logic, buffer consumes low power (only 125 mW at 5 V) and conforms to aerospace standards of reliability and maintainability.