Chu, Henry Shiu-Hung [Idaho Falls, ID]; Lacy, Jeffrey M [Idaho Falls, ID]
2008-04-01
An armor structure includes first and second layers individually containing a plurality of i-beams. Individual i-beams have a pair of longitudinal flanges interconnected by a longitudinal crosspiece and defining opposing longitudinal channels between the pair of flanges. The i-beams within individual of the first and second layers run parallel. The laterally outermost faces of the flanges of adjacent i-beams face one another. One of the longitudinal channels in each of the first and second layers faces one of the longitudinal channels in the other of the first and second layers. The channels of the first layer run parallel with the channels of the second layer. The flanges of the first and second layers overlap with the crosspieces of the other of the first and second layers, and portions of said flanges are received within the facing channels of the i-beams of the other of the first and second layers.
NASA Astrophysics Data System (ADS)
Vivoni, Enrique R.; Mascaro, Giuseppe; Mniszewski, Susan; Fasel, Patricia; Springer, Everett P.; Ivanov, Valeriy Y.; Bras, Rafael L.
2011-10-01
Summary: A major challenge in the use of fully-distributed hydrologic models has been the lack of computational capabilities for high-resolution, long-term simulations in large river basins. In this study, we present the parallel model implementation and real-world hydrologic assessment of the Triangulated Irregular Network (TIN)-based Real-time Integrated Basin Simulator (tRIBS). Our parallelization approach is based on the decomposition of a complex watershed using the channel network as a directed graph. The resulting sub-basin partitioning divides effort among processors and handles hydrologic exchanges across boundaries. Through numerical experiments in a set of nested basins, we quantify parallel performance relative to serial runs for a range of processors, simulation complexities and lengths, and sub-basin partitioning methods, while accounting for inter-run variability on a parallel computing system. In contrast to serial simulations, the parallel model speed-up depends on the variability of hydrologic processes. Load balancing significantly improves parallel speed-up with proportionally faster runs as simulation complexity (domain resolution and channel network extent) increases. The best strategy for large river basins is to combine a balanced partitioning with an extended channel network, with potential savings through a lower TIN resolution. Based on these advances, a wider range of applications for fully-distributed hydrologic models is now possible. This is illustrated through a set of ensemble forecasts that account for precipitation uncertainty derived from a statistical downscaling model.
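The load-balancing idea in this abstract can be illustrated with a small sketch. It is not the tRIBS partitioner: it ignores the directed-graph connectivity of the channel network and simply spreads sub-basins over processors with a greedy longest-first heuristic. The function name, the sub-basin ids, and the use of a per-sub-basin cost (for example, a TIN node count) are all assumptions made for illustration.

```python
import heapq

def balance_subbasins(workloads, n_procs):
    """Assign sub-basins to processors with a greedy longest-first heuristic.

    workloads: dict mapping a (hypothetical) sub-basin id to an estimated
               cost, e.g. its number of TIN nodes.
    Returns one (total_load, [sub-basin ids]) pair per processor.
    """
    # Min-heap of (current load, processor index): the least-loaded
    # processor is always popped first.
    heap = [(0.0, p) for p in range(n_procs)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_procs)]
    # Place the most expensive sub-basins first.
    for basin, cost in sorted(workloads.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)
        assignment[p].append(basin)
        heapq.heappush(heap, (load + cost, p))
    loads = [0.0] * n_procs
    for load, p in heap:
        loads[p] = load
    return [(loads[p], assignment[p]) for p in range(n_procs)]
```

A real partitioner would also have to keep upstream and downstream sub-basins consistent and account for the cost of exchanges across partition boundaries, which this sketch leaves out.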
Scalable parallel communications
NASA Technical Reports Server (NTRS)
Maly, K.; Khanna, S.; Overstreet, C. M.; Mukkamala, R.; Zubair, M.; Sekhar, Y. S.; Foudriat, E. C.
1992-01-01
Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., effect of timeouts), we have performed detailed simulation studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near linear in n, the number of protocol processors, and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low cost solution to providing multiple 100 Mbps on current machines.
In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services and the use of space division multiplexing for the physical channels can provide better reliability than monolithic approaches (it also provides graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users, while other TCPs running in parallel provide high bandwidth service to a single application); and (3) coarse-grain parallelism will be able to incorporate many future improvements from related work (e.g., reduced data movement, fast TCP, fine-grain parallelism) also with near linear speed-ups.
Nikcevic, Irena; Piruska, Aigars; Wehmeyer, Kenneth R; Seliskar, Carl J; Limbach, Patrick A; Heineman, William R
2010-08-01
Parallel separations using CE on a multilane microchip with multiplexed LIF detection are demonstrated. The detection system was developed to simultaneously record data on all channels using an expanded laser beam for excitation, a camera lens to capture emission, and a CCD camera for detection. The detection system enables monitoring of each channel continuously and distinguishing individual lanes without significant crosstalk between adjacent lanes. Multiple analytes can be determined in parallel lanes within a single microchip in a single run, leading to increased sample throughput. The pKa determination of small molecule analytes is demonstrated with the multilane microchip.
Nikcevic, Irena; Piruska, Aigars; Wehmeyer, Kenneth R.; Seliskar, Carl J.; Limbach, Patrick A.; Heineman, William R.
2010-01-01
Parallel separations using capillary electrophoresis on a multilane microchip with multiplexed laser induced fluorescence detection are demonstrated. The detection system was developed to simultaneously record data on all channels using an expanded laser beam for excitation, a camera lens to capture emission, and a CCD camera for detection. The detection system enables monitoring of each channel continuously and distinguishing individual lanes without significant crosstalk between adjacent lanes. Multiple analytes can be analyzed on parallel lanes within a single microchip in a single run, leading to increased sample throughput. The pKa determination of small molecule analytes is demonstrated with the multilane microchip. PMID:20737446
Optofluidic refractive-index sensor in step-index fiber with parallel hollow micro-channel.
Lee, H W; Schmidt, M A; Uebel, P; Tyagi, H; Joly, N Y; Scharrer, M; Russell, P St J
2011-04-25
We present a simple refractive index sensor based on a step-index fiber with a hollow micro-channel running parallel to its core. This channel becomes waveguiding when filled with a liquid of index greater than silica, causing sharp dips to appear in the transmission spectrum at wavelengths where the glass-core mode phase-matches to a mode of the liquid-core. The sensitivity of the dip-wavelengths to changes in liquid refractive index is quantified and the results used to study the dynamic flow characteristics of fluids in narrow channels. Potential applications of this fiber microstructure include measuring the optical properties of liquids, refractive index sensing, biophotonics and studies of fluid dynamics on the nanoscale.
Transmission electron microscopy characterization of a large-pore titanium silicate
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bozhilov, K.N.; Valtchev, V.P.
1993-11-01
The large-pore titanium silicate ETS-10, synthesized with tetramethylammonium, was characterized by means of TEM. The parameters of an orthorhombic unit cell, a = 14.79 Å, b = 14.5 Å, c = 13.06 Å, were determined based on both electron and x-ray diffraction data. A one-dimensional channel structure is proposed, with channels running parallel to [001]. The cations and molecules occupying channel positions display significant positional disorder.
Crashworthiness simulations with DYNA3D
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schauer, D.A.; Hoover, C.G.; Kay, G.J.
1996-04-01
Current progress in parallel algorithm research and applications in vehicle crash simulation is described for the explicit, finite element algorithms in DYNA3D. Problem partitioning methods and parallel algorithms for contact at material interfaces are the two challenging algorithm research problems that are addressed. Two prototype parallel contact algorithms have been developed for treating the cases of local and arbitrary contact. Demonstration problems for local contact are crashworthiness simulations with 222 locally defined contact surfaces and a vehicle/barrier collision modeled with arbitrary contact. A simulation of crash tests conducted for a vehicle impacting a U-channel small sign post embedded in soil has been run on both the serial and parallel versions of DYNA3D. A significant reduction in computational time has been observed when running these problems on the parallel version. However, to achieve maximum efficiency, complex problems must be appropriately partitioned, especially when contact dominates the computation.
A picoliter-volume mixer for microfluidic analytical systems.
He, B; Burke, B J; Zhang, X; Zhang, R; Regnier, F E
2001-05-01
Mixing confluent liquid streams is an important but difficult operation in microfluidic systems. This paper reports the construction and characterization of a 100-pL mixer for liquids transported by electroosmotic flow. Mixing was achieved in a microfabricated device with multiple intersecting channels of varying lengths and a bimodal width distribution. All channels running parallel to the direction of flow were 5 µm in width whereas larger 27-µm-width channels ran back and forth through the parallel channel network at a 45° angle. The channel network composing the mixer was approximately 10 µm deep. It was observed that little mixing of the confluent solvent streams occurred in the 100-µm-wide, 300-µm-long mixer inlet channel where mixing would be achieved almost exclusively by diffusion. In contrast, after passage through the channel network in the approximately 200-µm-length static mixer bed, mixing was complete as determined by confocal microscopy and CCD detection. Theoretical simulations were also performed in an attempt to describe the extent of mixing in microfabricated systems.
Zhu, Hao; Sun, Yan; Rajagopal, Gunaretnam; Mondry, Adrian; Dhar, Pawan
2004-01-01
Background: Many arrhythmias are triggered by abnormal electrical activity at the ionic channel and cell level, and then evolve spatio-temporally within the heart. To understand arrhythmias better and to diagnose them more precisely by their ECG waveforms, a whole-heart model is required to explore the association between the massively parallel activities at the channel/cell level and the integrative electrophysiological phenomena at organ level. Methods: We have developed a method to build large-scale electrophysiological models by using extended cellular automata, and to run such models on a cluster of shared memory machines. We describe here the method, including the extension of a language-based cellular automaton to implement quantitative computing, the building of a whole-heart model with Visible Human Project data, the parallelization of the model on a cluster of shared memory computers with OpenMP and MPI hybrid programming, and a simulation algorithm that links cellular activity with the ECG. Results: We demonstrate that electrical activities at channel, cell, and organ levels can be traced and captured conveniently in our extended cellular automaton system. Examples of some ECG waveforms simulated with a 2-D slice are given to support the ECG simulation algorithm. A performance evaluation of the 3-D model on a four-node cluster is also given. Conclusions: Quantitative multicellular modeling with extended cellular automata is a highly efficient and widely applicable method to weave experimental data at different levels into computational models. This process can be used to investigate complex and collective biological activities that can be described neither by their governing differentiation equations nor by discrete parallel computation. Transparent cluster computing is a convenient and effective method to make time-consuming simulation feasible. Arrhythmias, as a typical case, can be effectively simulated with the methods described. PMID:15339335
Methods for operating parallel computing systems employing sequenced communications
Benner, R.E.; Gustafson, J.L.; Montry, G.R.
1999-08-10
A parallel computing system and method are disclosed having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system. 15 figs.
Methods for operating parallel computing systems employing sequenced communications
Benner, Robert E.; Gustafson, John L.; Montry, Gary R.
1999-01-01
A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
Planned development of a 3D computer based on free-space optical interconnects
NASA Astrophysics Data System (ADS)
Neff, John A.; Guarino, David R.
1994-05-01
Free-space optical interconnection has the potential to provide upwards of a million data channels between planes of electronic circuits. This may result in the planar board and backplane structures of today giving way to 3-D stacks of wafers or multi-chip modules interconnected via channels running perpendicular to the processor planes, thereby eliminating much of the packaging overhead. Three-dimensional packaging is very appealing for tightly coupled fine-grained parallel computing where the need for massive numbers of interconnections is severely taxing the capabilities of the planar structures. This paper describes a coordinated effort by four research organizations to demonstrate an operational fine-grained parallel computer that achieves global connectivity through the use of free-space optical interconnects.
NASA Technical Reports Server (NTRS)
Hockney, George; Lee, Seungwon
2008-01-01
A computer program known as PyPele, originally written as a Python-language extension module of a C++ language program, has been rewritten in pure Python language. The original version of PyPele dispatches and coordinates parallel-processing tasks on cluster computers and provides a conceptual framework for spacecraft-mission-design and -analysis software tools to run in an embarrassingly parallel mode. The original version of PyPele uses SSH (Secure Shell, a set of standards and an associated network protocol for establishing a secure channel between a local and a remote computer) to coordinate parallel processing. Instead of SSH, the present Python version of PyPele uses Message Passing Interface (MPI) [an unofficial de-facto standard language-independent application programming interface for message-passing on a parallel computer] while keeping the same user interface. The use of MPI instead of SSH and the preservation of the original PyPele user interface make it possible for parallel application programs written previously for the original version of PyPele to run on MPI-based cluster computers. As a result, engineers using the previously written application programs can take advantage of embarrassing parallelism without need to rewrite those programs.
NASA Astrophysics Data System (ADS)
MacMackin, C. T.; Wells, A.
2017-12-01
While relatively small in mass, ice shelves play an important role in buttressing ice sheets, slowing their flow into the ocean. As such, an understanding of ice shelf stability is needed for predictions of future sea level rise. Networks of channels have been observed underneath Antarctic ice shelves and are thought to affect their stability. While the origins of channels running parallel to ice flow are thought to be well understood, transverse channels have also been observed and the mechanism for their formation is less clear. It has been suggested that seasonal variations in ice and ocean properties could be a source and we run nonlinear, vertically integrated 1-D simulations of a coupled ice shelf and plume to test this hypothesis. We also examine how these variations might alter the shape of internal radar reflectors within the ice, suggesting a new technique to model their distribution using a vertically integrated model of ice flow. We examine a range of sources for seasonal forcing which might lead to channel formation, finding that variability in subglacial discharge results in small variations of ice thickness. Additional mechanisms would be required to expand these into large transverse channels.
Optimizing ion channel models using a parallel genetic algorithm on graphical processors.
Ben-Shalom, Roy; Aviv, Amit; Razon, Benjamin; Korngreen, Alon
2012-01-01
We have recently shown that we can semi-automatically constrain models of voltage-gated ion channels by combining a stochastic search algorithm with ionic currents measured using multiple voltage-clamp protocols. Although numerically successful, this approach is highly demanding computationally, with optimization on a high performance Linux cluster typically lasting several days. To solve this computational bottleneck we converted our optimization algorithm for work on a graphical processing unit (GPU) using NVIDIA's CUDA. Parallelizing the process on a Fermi graphic computing engine from NVIDIA increased the speed ∼180 times over an application running on an 80 node Linux cluster, considerably reducing simulation times. This application allows users to optimize models for ion channel kinetics on a single, inexpensive, desktop "super computer," greatly reducing the time and cost of building models relevant to neuronal physiology. We also demonstrate that the point of algorithm parallelization is crucial to its performance. We substantially reduced computing time by solving the ODEs (Ordinary Differential Equations) so as to massively reduce memory transfers to and from the GPU. This approach may be applied to speed up other data intensive applications requiring iterative solutions of ODEs. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Lee, A. Y.
1967-01-01
Computer program calculates the steady state fluid distribution, temperature rise, and pressure drop of a coolant, the material temperature distribution of a heat generating solid, and the heat flux distributions at the fluid-solid interfaces. It performs the necessary iterations automatically within the computer, in one machine run.
Method for simultaneous overlapped communications between neighboring processors in a multiple
Benner, Robert E.; Gustafson, John L.; Montry, Gary R.
1991-01-01
A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
NASA Astrophysics Data System (ADS)
Little, Duncan A.; Tennyson, Jonathan; Plummer, Martin; Noble, Clifford J.; Sunderland, Andrew G.
2017-06-01
TIMEDELN implements the time-delay method of determining resonance parameters from the characteristic Lorentzian form displayed by the largest eigenvalues of the time-delay matrix. TIMEDELN constructs the time-delay matrix from input K-matrices and analyses its eigenvalues. This new version implements multi-resonance fitting and may be run serially or as a high performance parallel code with three levels of parallelism. TIMEDELN takes K-matrices from a scattering calculation, either read from a file or calculated on a dynamically adjusted grid, and calculates the time-delay matrix. This is then diagonalized, with the largest eigenvalue representing the longest time-delay experienced by the scattering particle. A resonance shows up as a characteristic Lorentzian form in the time-delay: the programme searches the time-delay eigenvalues for maxima and traces resonances when they pass through different eigenvalues, separating overlapping resonances. It also performs the fitting of the calculated data to the Lorentzian form and outputs resonance positions and widths. Any remaining overlapping resonances can be fitted jointly. The branching ratios of decay into the open channels can also be found. The programme may be run serially or in parallel with three levels of parallelism. The parallel code modules are abstracted from the main physics code and can be used independently.
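For orientation, the Lorentzian form mentioned in this abstract can be written out for the textbook case of a single isolated resonance on a constant background. The symbols below (resonance position E_r, width Γ, background eigenphase δ_bg, time delay q) are notation assumed here, not taken from the TIMEDELN documentation, and the program's actual fitting function may include further terms.

```latex
\delta(E) = \delta_{\mathrm{bg}} + \arctan\!\left(\frac{\Gamma/2}{E_r - E}\right),
\qquad
q(E) = 2\hbar\,\frac{d\delta}{dE}
     = \frac{\hbar\,\Gamma}{(E - E_r)^2 + (\Gamma/2)^2}
```

Under these assumptions the time delay peaks at E = E_r with height 4ħ/Γ, so a fitted peak gives both the position and the width of the resonance.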
Implementing Audio Digital Feedback Loop Using the National Instruments RIO System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, G.; Byrd, J. M.
2006-11-20
Development of systems for high-precision RF distribution and laser synchronization at Berkeley Lab has been ongoing for several years. Successful operation of these systems requires multiple audio bandwidth feedback loops running at relatively high gains. Stable operation of the feedback loops requires careful design of the feedback transfer function. To allow for flexible and compact implementation, we have developed digital feedback loops on the National Instruments Reconfigurable Input/Output (RIO) platform. This platform uses an FPGA and multiple I/Os that can provide eight parallel channels running different filters. We present the design and preliminary experimental results of this system.
Angle performance on optima MDxt
DOE Office of Scientific and Technical Information (OSTI.GOV)
David, Jonathan; Kamenitsa, Dennis
2012-11-06
Angle control on medium current implanters is important due to the high angle-sensitivity of typical medium current implants, such as halo implants. On the Optima MDxt, beam-to-wafer angles are controlled in both the horizontal and vertical directions. In the horizontal direction, the beam angle is measured through six narrow slits, and any angle adjustment is made by electrostatically steering the beam, while cross-wafer beam parallelism is adjusted by changing the focus of the electrostatic parallelizing lens (P-lens). In the vertical direction, the beam angle is measured through a high aspect ratio mask, and any angle adjustment is made by slightly tilting the wafer platen prior to implant. A variety of tests were run to measure the accuracy and repeatability of Optima MDxt's angle control. SIMS profiles of a high energy, channeling sensitive condition show both the cross-wafer angle uniformity, along with the small-angle resolution of the system. Angle repeatability was quantified by running a channeling sensitive implant as a regular monitor over a seven month period and measuring the sheet resistance-to-angle sensitivity. Even though crystal cut error was not controlled for in this case, when attributing all Rs variation to angle changes, the overall angle repeatability was measured as 0.16° (1σ). A separate angle repeatability test involved running a series of V-curve tests over a four month period using low crystal cut wafers selected from the same boule. The results of this test showed the angle repeatability to be <0.1° (1σ).
Aggregating quantum repeaters for the quantum internet
NASA Astrophysics Data System (ADS)
Azuma, Koji; Kato, Go
2017-09-01
The quantum internet holds promise for accomplishing quantum teleportation and unconditionally secure communication freely between arbitrary clients all over the globe, as well as the simulation of quantum many-body systems. For such a quantum internet protocol, a general fundamental upper bound on the obtainable entanglement or secret key has been derived [K. Azuma, A. Mizutani, and H.-K. Lo, Nat. Commun. 7, 13523 (2016), 10.1038/ncomms13523]. Here we consider its converse problem. In particular, we present a universal protocol constructible from any given quantum network, which is based on running quantum repeater schemes in parallel over the network. For arbitrary lossy optical channel networks, our protocol has no scaling gap with the upper bound, even based on existing quantum repeater schemes. In an asymptotic limit, our protocol works as an optimal entanglement or secret-key distribution over any quantum network composed of practical channels such as erasure channels, dephasing channels, bosonic quantum amplifier channels, and lossy optical channels.
Fast Face-Recognition Optical Parallel Correlator Using High Accuracy Correlation Filter
NASA Astrophysics Data System (ADS)
Watanabe, Eriko; Kodate, Kashiko
2005-11-01
We designed and fabricated a fully automatic fast face recognition optical parallel correlator [E. Watanabe and K. Kodate: Appl. Opt. 44 (2005) 5666] based on the VanderLugt principle. The implementation of an as-yet unattained ultra high-speed system was aided by reconfiguring the system to make it suitable for easier parallel processing, as well as by composing a higher accuracy correlation filter and high-speed ferroelectric liquid crystal-spatial light modulator (FLC-SLM). In running trial experiments using this system (dubbed FARCO), we succeeded in acquiring remarkably low error rates of 1.3% for false match rate (FMR) and 2.6% for false non-match rate (FNMR). Given the results of our experiments, the aim of this paper is to examine methods of designing correlation filters and arranging database image arrays for even faster parallel correlation, underlining the issues of calculation technique, quantization bit rate, pixel size and shift from optical axis. The correlation filter has proved its excellent performance and higher precision than classical correlation and joint transform correlator (JTC). Moreover, arrangement of multi-object reference images leads to 10-channel correlation signals, as sharply marked as those of a single channel. This experimental result demonstrates great potential for achieving a processing speed of 10,000 faces/s.
Modeling and investigation of the channeling phenomenon in downdraft stratified gasifiers.
Allesina, Giulio; Pedrazzi, Simone; Tartarini, Paolo
2013-10-01
Downdraft stratified gasifiers seem to be the reactors which are most influenced by loading conditions. Moreover, the larger the reactor is, the higher the possibility to stumble across a channeling phenomenon. This high sensitivity is due to the limited thickness and superficial placement of the flaming pyrolysis layer coupled with the necessity to keep all the zones parallel for a correct running of this kind of gasifier. This study was aimed at modeling and investigating the channeling phenomenon generated by loading condition variations on a 250-kWe nominal power gasification power plant. The experimental campaign showed great variations in most of the plant outputs. These phenomena were modeled on two modified mathematical models obtained from literature. The results of the models confirmed the capability of this approach to predict the channeling phenomena and its dependency on the loading method. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Greenberg, Albert G.; Lubachevsky, Boris D.; Nicol, David M.; Wright, Paul E.
1994-01-01
Fast, efficient parallel algorithms are presented for discrete event simulations of dynamic channel assignment schemes for wireless cellular communication networks. The driving events are call arrivals and departures, in continuous time, to cells geographically distributed across the service area. A dynamic channel assignment scheme decides which call arrivals to accept, and which channels to allocate to the accepted calls, attempting to minimize call blocking while ensuring co-channel interference is tolerably low. Specifically, the scheme ensures that the same channel is used concurrently at different cells only if the pairwise distances between those cells are sufficiently large. Much of the complexity of the system comes from ensuring this separation. The network is modeled as a system of interacting continuous time automata, each corresponding to a cell. To simulate the model, conservative methods are used; i.e., methods in which no errors occur in the course of the simulation and so no rollback or relaxation is needed. Implemented on a 16K processor MasPar MP-1, an elegant and simple technique provides speedups of about 15 times over an optimized serial simulation running on a high speed workstation. A drawback of this technique, typical of conservative methods, is that processor utilization is rather low. To overcome this, new methods were developed that exploit slackness in event dependencies over short intervals of time, thereby raising the utilization to above 50 percent and the speedup over the optimized serial code to about 120 times.
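The co-channel separation rule described in this abstract can be sketched in a few lines. This shows only the admission check for one arriving call, not the parallel conservative simulation the paper presents; the function name, the cell coordinates, and the lowest-free-channel policy are assumptions chosen for illustration.

```python
import math

def try_assign(cell, positions, in_use, n_channels, reuse_dist):
    """Return the lowest channel usable at `cell`, or None if the call is blocked.

    positions: dict cell -> (x, y) coordinates (hypothetical layout).
    in_use:    dict cell -> set of channels currently active there.
    A channel is usable only if no other cell closer than reuse_dist uses it.
    """
    for ch in range(n_channels):
        if ch in in_use[cell]:
            continue  # already carrying a call in this cell
        clash = any(
            ch in in_use[other]
            and math.dist(positions[cell], positions[other]) < reuse_dist
            for other in positions
            if other != cell
        )
        if not clash:
            in_use[cell].add(ch)  # accept the call on this channel
            return ch
    return None  # no channel satisfies the separation rule: call blocked
```

A departure would simply remove the channel from `in_use[cell]`. The checks against neighbouring cells are where the event dependencies between cells come from, which is what makes the parallel simulation non-trivial.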
Stock, S R; Barss, J; Dahl, T; Veis, A; Almer, J D; Carlo, F
2003-05-01
In sea urchin teeth, the keel plays an important structural role, and this paper reports results of microstructural characterization of the keel of Lytechinus variegatus using two noninvasive synchrotron x-ray techniques: x-ray absorption microtomography (microCT) and x-ray diffraction mapping. MicroCT with 14 keV x-rays mapped the spatial distribution of mineral at the 1.3 µm level in a millimeter-sized fragment of a mature portion of the keel. Two rows of low absorption channels (i.e., primary channels) slightly less than 10 µm in diameter were found running linearly from the flange to the base of the keel and parallel to its sides. The primary channels paralleled the oral edge of the keel, and the microCT slices revealed a planar secondary channel leading from each primary channel to the side of the keel. The primary and secondary channels were more or less coplanar and may correspond to the soft tissue between plates of the carinar process. Transmission x-ray diffraction with 80.8 keV x-rays and a 0.1 mm beam mapped the distribution of calcite crystal orientations and the composition Ca₁₋ₓMgₓCO₃ of the calcite. Unlike the variable Mg concentration and highly curved prisms found in the keel of Paracentrotus lividus, a constant Mg content (x = 0.13) and relatively little prism curvature were found in the keel of Lytechinus variegatus.
Data Acquisition System for Multi-Frequency Radar Flight Operations Preparation
NASA Technical Reports Server (NTRS)
Leachman, Jonathan
2010-01-01
A three-channel data acquisition system was developed for the NASA Multi-Frequency Radar (MFR) system. The system is based on a commercial-off-the-shelf (COTS) industrial PC (personal computer) and two dual-channel 14-bit digital receiver cards. The decimated complex envelope representations of the three radar signals are passed to the host PC via the PCI bus, and then processed in parallel by multiple cores of the PC CPU (central processing unit). The innovation is this parallelization of the radar data processing using multiple cores of a standard COTS multi-core CPU. The data processing portion of the data acquisition software was built using autonomous program modules or threads, which can run simultaneously on different cores. A master program module calculates the optimal number of processing threads, launches them, and continually supplies each with data. The benefit of this new parallel software architecture is that COTS PCs can be used to implement increasingly complex processing algorithms on an increasing number of radar range gates and data rates. As new PCs become available with higher numbers of CPU cores, the software will automatically utilize the additional computational capacity.
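The master/worker arrangement described above (a master module picks a thread count, launches the threads, and keeps feeding them data) might look roughly like the following sketch. It is not the MFR software: the per-gate mean-power computation, the function names, and the use of `os.cpu_count()` to choose the thread count are assumptions for illustration. Note also that in CPython the global interpreter lock limits multi-core speedup for pure-Python work, so the sketch shows the structure rather than the performance.

```python
import os
import queue
import threading

def worker(tasks, results):
    """Process blocks until the master sends a None sentinel."""
    while True:
        item = tasks.get()
        if item is None:
            break
        gate, samples = item
        # Hypothetical processing step: mean power of the complex envelope.
        power = sum(abs(s) ** 2 for s in samples) / len(samples)
        results.put((gate, power))

def run_master(blocks, n_threads=None):
    """Launch one worker per core (assumed policy) and feed them data blocks."""
    n_threads = n_threads or os.cpu_count() or 1
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for block in blocks:
        tasks.put(block)
    for _ in threads:
        tasks.put(None)  # one stop sentinel per worker
    for t in threads:
        t.join()
    return dict(results.get() for _ in range(len(blocks)))

# Two range gates, each with a short block of complex samples.
powers = run_master([(0, [3 + 4j, 0j]), (1, [1 + 0j])], n_threads=2)
```

Because the thread count is chosen at run time, the same code would use more workers on a machine with more cores, which is the property the abstract highlights.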
Martínez-Zarzuela, Mario; Gómez, Carlos; Díaz-Pernas, Francisco Javier; Fernández, Alberto; Hornero, Roberto
2013-10-01
Cross-Approximate Entropy (Cross-ApEn) is a useful measure to quantify the statistical dissimilarity of two time series. In spite of the advantage of Cross-ApEn over its one-dimensional counterpart (Approximate Entropy), only a few studies have applied it to biomedical signals, mainly due to its high computational cost. In this paper, we propose a fast GPU-based implementation of Cross-ApEn that makes its use feasible over large amounts of multidimensional data. The scheme is fully scalable and thus maximizes the use of the GPU regardless of the number of neural signals being processed. The approach consists of processing many trials or epochs simultaneously, regardless of their origin. In the case of MEG data, these trials can come from different input channels or subjects. The proposed implementation achieves an average speedup greater than 250× against a parallel CPU version running on a six-core processor. A dataset of 30 subjects containing 148 MEG channels (49 epochs of 1024 samples per channel) can be analyzed using our development in about 30 min. The same processing takes 5 days on six cores and 15 days when running on a single core. The speedup is much larger when compared to a basic sequential MATLAB® implementation, which would need 58 days per subject. To our knowledge, this is the first contribution of Cross-ApEn measure computation using GPUs. This study demonstrates that this hardware is, to date, the best option for the signal processing of biomedical data with Cross-ApEn. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
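For orientation, a plain sequential sketch of Cross-ApEn is given below, following the commonly cited definition (template length `m`, tolerance `r`, Chebyshev distance between templates). It is not the GPU implementation from the paper. The parameter names, the assumption that both series have equal length and are already normalized, and the choice to raise an error when a template has no match are all illustrative assumptions.

```python
import math

def _phi(u, v, m, r):
    """Average log-fraction of length-m templates of v that lie within
    tolerance r (Chebyshev distance) of each length-m template of u."""
    n = len(u) - m + 1
    total = 0.0
    for i in range(n):
        template = u[i:i + m]
        matches = sum(
            1 for j in range(n)
            if max(abs(a - b) for a, b in zip(template, v[j:j + m])) <= r
        )
        if matches == 0:
            # Cross-ApEn is undefined when a template finds no match.
            raise ValueError("no matching template; try a larger r")
        total += math.log(matches / n)
    return total / n

def cross_apen(u, v, m=2, r=0.2):
    """Cross-ApEn(m, r) = phi_m - phi_(m+1) for two equal-length series."""
    if len(u) != len(v):
        raise ValueError("series must have the same length")
    return _phi(u, v, m, r) - _phi(u, v, m + 1, r)

# Two identical constant series are perfectly regular with respect to each other.
value = cross_apen([1.0] * 10, [1.0] * 10, m=2, r=0.2)
```

Each iteration of the outer loop over `i` is independent of the others, and so is each trial, which is the kind of structure a massively parallel device can exploit.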
Kinetics of veratridine action on Na channels of skeletal muscle
Sutro, JB
1986-01-01
Veratridine bath-applied to frog muscle makes inactivation of INa incomplete during a depolarizing voltage-clamp pulse and leads to a persistent veratridine-induced Na tail current. During repetitive depolarizations, the size of successive tail currents grows to a plateau and then gradually decreases. When pulsing is stopped, the tail current declines to zero with a time constant of approximately 3 s. Higher rates of stimulation result in a faster build-up of the tail current and a larger maximum value. I propose that veratridine binds only to open channels and, when bound, prevents normal fast inactivation and rapid shutting of the channel on return to rest. Veratridine-modified channels are also subject to a "slow" inactivation during long depolarizations or extended pulse trains. At rest, veratridine unbinds with a time constant of approximately 3 s. Three tests confirm these hypotheses: (a) the time course of the development of veratridine-induced tail currents parallels a running time integral of gNa during the pulse; (b) inactivating prepulses reduce the ability to evoke tails, and the voltage dependence of this reduction parallels the voltage dependence of h infinity; (c) chloramine-T, N-bromoacetamide, and scorpion toxin, agents that decrease inactivation in Na channels, each greatly enhance the tail currents and alter the time course of the appearance of the tails as predicted by the hypothesis. Veratridine-modified channels shut during hyperpolarizations from -90 mV and reopen on repolarization to -90 mV, a process that resembles normal activation gating. Veratridine appears to bind more rapidly during larger depolarizations. PMID:2419478
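The reported decay of the tail current "with a time constant of approximately 3 s" can be made concrete with a small worked calculation. The sketch assumes a single-exponential decay, I(t) = I0 · exp(−t/τ), which is the usual reading of a single time constant; the function names are hypothetical.

```python
import math

TAU_S = 3.0  # approximate time constant reported in the abstract, in seconds

def tail_fraction_remaining(t_s, tau_s=TAU_S):
    """Fraction of the initial tail current left after t_s seconds,
    assuming single-exponential decay."""
    return math.exp(-t_s / tau_s)

def time_to_fraction(fraction, tau_s=TAU_S):
    """Time in seconds for the tail current to fall to `fraction` of its start value."""
    return -tau_s * math.log(fraction)

after_one_tau = tail_fraction_remaining(3.0)  # exp(-1), roughly 37% remaining
half_time_s = time_to_fraction(0.5)           # tau * ln 2, roughly 2.1 s
```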
Interacting tilt and kink instabilities in repelling current channels
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keppens, R.; Porth, O.; Xia, C., E-mail: rony.keppens@wis.kuleuven.be
2014-11-01
We present a numerical study in resistive magnetohydrodynamics (MHD) where the initial equilibrium configuration contains adjacent, oppositely directed, parallel current channels. Since oppositely directed current channels repel, the equilibrium is liable to an ideal magnetohydrodynamic tilt instability. This tilt evolution, previously studied in planar settings, involves two magnetic islands or flux ropes, which on Alfvénic timescales undergo a combined rotation and separation. This in turn leads to the creation of (near) singular current layers, posing severe challenges to numerical approaches. Using our open-source grid-adaptive MPI-AMRVAC software, we revisit the planar evolution case in compressible MHD, as well as its extension to two-and-a-half-dimensional (2.5D) and full three-dimensional (3D) scenarios. As long as the third dimension can be ignored, pure tilt evolutions result that are hardly affected by out of plane magnetic field components. In all 2.5D runs, our simulations do show secondary tearing type disruptions throughout the near singular current sheets in the far nonlinear saturation regime. In full 3D runs, both current channels can be liable to additional ideal kink deformations. We discuss the effects of having both tilt and kink instabilities acting simultaneously in the violent, reconnection-dominated evolution. In 3D, both the tilt and the kink instabilities can be stabilized by tension forces. As a concrete space plasma application, we argue that interacting tilt-kink instabilities in repelling current channels provide a novel route to initiate solar coronal mass ejections, distinctly different from the currently favored pure kink or torus instability routes.
NASA Technical Reports Server (NTRS)
Hall, Lawrence O.; Bennett, Bonnie H.; Tello, Ivan
1994-01-01
A parallel version of CLIPS 5.1 has been developed to run on Intel Hypercubes. The user interface is the same as that for CLIPS with some added commands to allow for parallel calls. A complete version of CLIPS runs on each node of the hypercube. The system has been instrumented to display the time spent in the match, recognize, and act cycles on each node. Only rule-level parallelism is supported. Parallel commands enable the assertion and retraction of facts to/from remote nodes' working memory. Parallel CLIPS was used to implement a knowledge-based command, control, communications, and intelligence (C(sup 3)I) system to demonstrate the fusion of high-level, disparate sources. We discuss the nature of the information fusion problem, our approach, and implementation. Parallel CLIPS has also been used to run several benchmark parallel knowledge bases such as one to set up a cafeteria. Results from running Parallel CLIPS with parallel knowledge base partitions indicate that significant speed increases, including superlinear in some cases, are possible.
Introduction of the ASGARD Code
NASA Technical Reports Server (NTRS)
Bethge, Christian; Winebarger, Amy; Tiwari, Sanjiv; Fayock, Brian
2017-01-01
ASGARD stands for 'Automated Selection and Grouping of events in AIA Regional Data'. The code is a refinement of the event detection method in Ugarte-Urra & Warren (2014). It is intended to automatically detect and group brightenings ('events') in the AIA EUV channels, to record event parameters, and to find related events over multiple channels. Ultimately, the goal is to automatically determine heating and cooling timescales in the corona and to significantly increase statistics in this respect. The code is written in IDL and requires the SolarSoft library. It is parallelized and can run with multiple CPUs. Input files are regions of interest (ROIs) in time series of AIA images from the JSOC cutout service (http://jsoc.stanford.edu/ajax/exportdata.html). The ROIs need to be tracked, co-registered, and limited in time (typically 12 hours).
76 FR 44279 - Radio Broadcasting Services; Clinchco, VA, and Coal Run, KY
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-25
...] Radio Broadcasting Services; Clinchco, VA, and Coal Run, KY AGENCY: Federal Communications Commission... Station WPKE-FM, Coal Run Kentucky, from Channel 276A to Channel 221C3. DATES: Effective August 1, 2011... 221C3 at Coal Run, Kentucky, are 37-23-57 NL and 82-23-42 WL, and for Channel 276A at Clinchco, Virginia...
Ennajeh, Ines; Zid, Mohamed Faouzi; Driss, Ahmed
2013-01-01
The title compound, lithium/aluminium dimagnesium tetrakis[orthomolybdate(VI)], was prepared by a solid-state reaction route. The crystal structure is built up from MgO6 octahedra and MoO4 tetrahedra sharing corners and edges, forming two types of chains running along [100]. These chains are linked into layers parallel to (010) and finally linked by MoO4 tetrahedra into a three-dimensional framework structure with channels parallel to [001] in which lithium and aluminium cations equally occupy the same position within a distorted trigonal–bipyramidal coordination environment. The title structure is isotypic with LiMgIn(MoO4)3, with the In site becoming an Mg site and the fully occupied Li site a statistically occupied Li/Al site in the title structure. PMID:24426975
Aqueous outflow - a continuum from trabecular meshwork to episcleral veins
Carreon, Teresia; van der Merwe, Elizabeth; Fellman, Ronald L.; Johnstone, Murray; Bhattacharya, Sanjoy K.
2016-01-01
In glaucoma, lowered intraocular pressure (IOP) confers neuroprotection. Elevated IOP characterizes glaucoma and arises from impaired aqueous humor (AH) outflow. Increased resistance in the trabecular meshwork (TM), a filter-like structure essential to regulate AH outflow, may result in the impaired outflow. Flow through the 360° circumference of TM structures may be non-uniform, divided into high and low flow regions, termed as segmental. After flowing through the TM, AH enters Schlemm’s canal (SC), which expresses both blood and lymphatic markers; AH then passes into collector channel entrances (CCE) along the SC external well. From the CCE, AH enters a deep scleral plexus (DSP) of vessels that typically run parallel to SC. From the DSP, intrascleral collector vessels run radially to the scleral surface to connect with AH containing vessels called aqueous veins to discharge AH to blood-containing episcleral veins. However, the molecular mechanisms that maintain homeostatic properties of endothelial cells along the pathways are not well understood. How these molecular events change during aging and in glaucoma pathology remain unresolved. In this review, we propose mechanistic possibilities to explain the continuum of AH outflow control, which originates at the TM and extends through collector channels to the episcleral veins. PMID:28028002
Salko, Robert K.; Schmidt, Rodney C.; Avramova, Maria N.
2014-11-23
This study describes major improvements to the computational infrastructure of the CTF subchannel code so that full-core, pincell-resolved (i.e., one computational subchannel per real bundle flow channel) simulations can now be performed in much shorter run-times, either in stand-alone mode or as part of coupled-code multi-physics calculations. These improvements support the goals of the Department Of Energy Consortium for Advanced Simulation of Light Water Reactors (CASL) Energy Innovation Hub to develop high fidelity multi-physics simulation tools for nuclear energy design and analysis.
Wilson, J Adam; Williams, Justin C
2009-01-01
The clock speeds of modern computer processors have nearly plateaued in the past 5 years. Consequently, neural prosthetic systems that rely on processing large quantities of data in a short period of time face a bottleneck, in that it may not be possible to process all of the data recorded from an electrode array with high channel counts and bandwidth, such as electrocorticographic grids or other implantable systems. Therefore, in this study a method of using the processing capabilities of a graphics card [graphics processing unit (GPU)] was developed for real-time neural signal processing of a brain-computer interface (BCI). The NVIDIA CUDA system was used to offload processing to the GPU, which is capable of running many operations in parallel, potentially greatly increasing the speed of existing algorithms. The BCI system records many channels of data, which are processed and translated into a control signal, such as the movement of a computer cursor. This signal processing chain involves computing a matrix-matrix multiplication (i.e., a spatial filter), followed by calculating the power spectral density on every channel using an auto-regressive method, and finally classifying appropriate features for control. In this study, the first two computationally intensive steps were implemented on the GPU, and the speed was compared to both the current implementation and a central processing unit-based implementation that uses multi-threading. Significant performance gains were obtained with GPU processing: the current implementation processed 1000 channels of 250 ms in 933 ms, while the new GPU method took only 27 ms, an improvement of nearly 35 times.
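The first stage of the signal chain described above, the spatial filter, is a matrix-matrix multiplication of a weight matrix with the channels-by-samples data block. A minimal CPU sketch in plain Python is shown below, together with the speedup arithmetic from the reported timings. The two-channel weights and sample values are made up for illustration, and nothing here reproduces the CUDA implementation.

```python
def spatial_filter(weights, samples):
    """Apply a spatial filter: (n_out x n_ch) weights times (n_ch x n_t) samples.
    Each output channel is a weighted sum of the input channels at every time step."""
    return [
        [sum(w * x for w, x in zip(row, column)) for column in zip(*samples)]
        for row in weights
    ]

# Hypothetical 2-channel example: each output is half the difference of the two inputs.
weights = [[0.5, -0.5],
           [-0.5, 0.5]]
samples = [[1.0, 2.0],   # channel 0 over two time steps
           [3.0, 6.0]]   # channel 1 over two time steps
filtered = spatial_filter(weights, samples)

# Speedup implied by the timings quoted in the abstract (933 ms vs 27 ms).
speedup = 933 / 27
```

Every output element is an independent dot product, which is why this step maps naturally onto hardware that runs many operations in parallel.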
NASA Technical Reports Server (NTRS)
2002-01-01
(Released 29 May 2002) The Science: Today's THEMIS release captures Mangala Fossa. Mangala Fossa is a graben, which in geologic terminology translates into a long parallel to semi-parallel fracture or trough. Grabens are dropped or downthrown areas relative to the rocks on either side and these features are generally longer than they are wide. There are numerous dust devil trails seen in this image. In the lower portion of this image several dust devil tracks can be seen cutting across the upper surface then down the short stubby channel and finally back up and over to the adjacent upper surface. Some dust avalanche streaks on slopes are also visible. The rough material in the upper third of the image contains a portion of the rim of a 90 km diameter crater located in Daedalia Planum. The smooth crater floor has a graben (up to 7 km wide) and channel (2 km wide) incised into its surface. In the middle third and right of this image one can see ripples (possibly fossil dunes) on the crater floor material just above the graben. The floor of Mangala Fossa and the southern crater floor surface also have smaller linear ridges trending from the upper left to lower right. These linear ridges could be either erosional (yardangs) or depositional (dunes) landforms. The lower third of the scene contains a short stubby channel (near the right margin) and lava flow front (lower left). The floor of this channel is fairly smooth with some linear crevasses located along its course. One gets the impression that the channel floor is mantled with some type of indurated material that permits cracks to form in its surface. The Story: In the Daedalia Plains on Mars, the rim of an old eroded crater rises up, a wreck of its former self (see context image at right). From the rough, choppy crater rim (top of the larger THEMIS image), the terrain descends to the almost smooth crater floor, gouged deeply by a trough, a channel, and the occasional dents of small, scattered craters.
The deep trough running from southwest to northeast across the middle of this image is called 'Mangala Fossa.' Mangala Fossa is a graben, a land feature created by tectonic processes that worked to create a depression in the landscape. This graben is a little more than 4 miles wide at its maximum, but like most grabens, is much longer than it is wide. You can see from the context image that it runs across much of the width of the crater. Running southward from the graben (lower right-hand side of the larger THEMIS image) is a branching channel a little over a mile wide. The floor of this channel is fairly smooth with some linear crevasses along its course. These features suggest that the channel floor might be layered with some type of cemented material that permits cracks to form in its surface. Between the rough crater rim and the depressed graben, tiny crackles on the otherwise smooth surface appear. They might be the ripples of fossil dunes, hardened remains from a more active time. The floor of Mangala Fossa and the southern crater floor surface also feature small lines that seem to crease the surface. We know that they are ridges on the surface, but how did they form? Were higher surfaces carved away in grooves by the wind and scouring sand, forming ridges called yardangs? Or were dunes deposited on the smooth, lower terrain? No one knows for sure. Look closely for faint details as well. Do you see the subtle, scalloped pattern that laps at the lower left of the image, almost too muted to be seen? That's the sign of an ancient lava flow that stopped just there. And the shadowy gray streaks? Some are smudges caused by dust avalanches running down the slopes of the channel. Others are the tracks of dust devils that pass across the land, lifting and carrying away brighter dust to reveal the darker surface beneath. 
For a good example of a dust devil track, check out the faint gray line that cuts across the upper part of the channel, just below the point where it meets the graben.
NASA Astrophysics Data System (ADS)
Steinke, R. C.; Ogden, F. L.; Lai, W.; Moreno, H. A.; Pureza, L. G.
2014-12-01
Physics-based watershed models are useful tools for hydrologic studies, water resources management and economic analyses in the contexts of climate, land-use, and water-use changes. This poster presents a parallel implementation of a quasi 3-dimensional, physics-based, high-resolution, distributed water resources model suitable for simulating large watersheds in a massively parallel computing environment. Developing this model is one of the objectives of the NSF EPSCoR RII Track II CI-WATER project, which is joint between Wyoming and Utah EPSCoR jurisdictions. The model, which we call ADHydro, is aimed at simulating important processes in the Rocky Mountain west, including: rainfall and infiltration, snowfall and snowmelt in complex terrain, vegetation and evapotranspiration, soil heat flux and freezing, overland flow, channel flow, groundwater flow, water management and irrigation. Model forcing is provided by the Weather Research and Forecasting (WRF) model, and ADHydro is coupled with the NOAH-MP land-surface scheme for calculating fluxes between the land and atmosphere. The ADHydro implementation uses the Charm++ parallel run time system. Charm++ is based on location transparent message passing between migrateable C++ objects. Each object represents an entity in the model such as a mesh element. These objects can be migrated between processors or serialized to disk allowing the Charm++ system to automatically provide capabilities such as load balancing and checkpointing. Objects interact with each other by passing messages that the Charm++ system routes to the correct destination object regardless of its current location. This poster discusses the algorithms, communication patterns, and caching strategies used to implement ADHydro with Charm++. The ADHydro model code will be released to the hydrologic community in late 2014.
Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array
NASA Astrophysics Data System (ADS)
Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul
2008-04-01
This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.
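The work farm pattern mentioned above (a parallel set of workers fed by one input stream and drained into one output stream) can be sketched in ordinary software terms. The version below uses a thread pool from the Python standard library rather than the MPPA's objects and self-synchronizing channels, so it only mirrors the shape of the pattern; `work_farm`, `n_workers`, and the doubling worker are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def work_farm(worker, input_stream, n_workers=4):
    """Distribute items from one input stream over a set of identical workers
    and merge the results into one output stream, preserving input order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        yield from pool.map(worker, input_stream)

def double(block):
    """Stand-in for a real per-block task such as compressing a video tile."""
    return block * 2

output_stream = list(work_farm(double, range(5), n_workers=3))
```

A heterogeneous farm, as the abstract also mentions, would use different worker functions for different items; this sketch covers only the homogeneous case.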
SPEEDES - A multiple-synchronization environment for parallel discrete-event simulation
NASA Technical Reports Server (NTRS)
Steinman, Jeff S.
1992-01-01
Synchronous Parallel Environment for Emulation and Discrete-Event Simulation (SPEEDES) is a unified parallel simulation environment. It supports multiple-synchronization protocols without requiring users to recompile their code. When a SPEEDES simulation runs on one node, all the extra parallel overhead is removed automatically at run time. When the same executable runs in parallel, the user preselects the synchronization algorithm from a list of options. SPEEDES currently runs on UNIX networks and on the California Institute of Technology/Jet Propulsion Laboratory Mark III Hypercube. SPEEDES also supports interactive simulations. Featured in the SPEEDES environment is a new parallel synchronization approach called Breathing Time Buckets. This algorithm uses some of the conservative techniques found in Time Bucket synchronization, along with the optimism that characterizes the Time Warp approach. A mathematical model derived from first principles predicts the performance of Breathing Time Buckets. Along with the Breathing Time Buckets algorithm, this paper discusses the rules for processing events in SPEEDES, describes the implementation of various other synchronization protocols supported by SPEEDES, describes some new ones for the future, discusses interactive simulations, and then gives some performance results.
Correia, Vanda; Araújo, Duarte; Cummins, Alan; Craig, Cathy M
2012-06-01
This study used a virtual, simulated 3 vs. 3 rugby task to investigate whether gaps opening in particular running channels promote different actions by the ball carrier player and whether an effect of rugby expertise is verified. We manipulated emergent gaps in three different locations: Gap 1 in the participant's own running channel, Gap 2 in the first receiver's running channel, and Gap 3 in the second receiver's running channel. Recreational, intermediate, professional, and nonrugby players performed the task. They could (i) run with the ball, (ii) make a short pass, or (iii) make a long pass. All actions were digitally recorded. Results revealed that the emergence of gaps in the defensive line with respect to the participant's own position significantly influenced action selection. Namely, "run" was most often the action performed in Gap 1, "short pass" in Gap 2, and "long pass" in Gap 3 trials. Furthermore, a strong positive relationship between expertise and task achievement was found.
Engbers, Jordan D T; Anderson, Dustin; Asmara, Hadhimulya; Rehak, Renata; Mehaffey, W Hamish; Hameed, Shahid; McKay, Bruce E; Kruskic, Mirna; Zamponi, Gerald W; Turner, Ray W
2012-02-14
Encoding sensory input requires the expression of postsynaptic ion channels to transform key features of afferent input to an appropriate pattern of spike output. Although Ca(2+)-activated K(+) channels are known to control spike frequency in central neurons, Ca(2+)-activated K(+) channels of intermediate conductance (KCa3.1) are believed to be restricted to peripheral neurons. We now report that cerebellar Purkinje cells express KCa3.1 channels, as evidenced through single-cell RT-PCR, immunocytochemistry, pharmacology, and single-channel recordings. Furthermore, KCa3.1 channels coimmunoprecipitate and interact with low voltage-activated Cav3.2 Ca(2+) channels at the nanodomain level to support a previously undescribed transient voltage- and Ca(2+)-dependent current. As a result, subthreshold parallel fiber excitatory postsynaptic potentials (EPSPs) activate Cav3 Ca(2+) influx to trigger a KCa3.1-mediated regulation of the EPSP and subsequent after-hyperpolarization. The Cav3-KCa3.1 complex provides powerful control over temporal summation of EPSPs, effectively suppressing low frequencies of parallel fiber input. KCa3.1 channels thus contribute to a high-pass filter that allows Purkinje cells to respond preferentially to high-frequency parallel fiber bursts characteristic of sensory input.
Sawyer, William C.
1995-01-01
An apparatus for supporting a heating element in a channel formed in a heater base is disclosed. A preferred embodiment includes a substantially U-shaped tantalum member. The U-shape is characterized by two substantially parallel portions of tantalum that each have an end connected to opposite ends of a base portion of tantalum. The parallel portions are each substantially perpendicular to the base portion and spaced apart a distance not larger than a width of the channel and not smaller than a width of a graphite heating element. The parallel portions each have a hole therein, and the centers of the holes define an axis that is substantially parallel to the base portion. An aluminum oxide ceramic retaining pin extends through the holes in the parallel portions and into a hole in a wall of the channel to retain the U-shaped member in the channel and to support the graphite heating element. The graphite heating element is confined by the parallel portions of tantalum, the base portion of tantalum, and the retaining pin. A tantalum tube surrounds the retaining pin between the parallel portions of tantalum.
Sawyer, W.C.
1995-08-15
An apparatus for supporting a heating element in a channel formed in a heater base is disclosed. A preferred embodiment includes a substantially U-shaped tantalum member. The U-shape is characterized by two substantially parallel portions of tantalum that each have an end connected to opposite ends of a base portion of tantalum. The parallel portions are each substantially perpendicular to the base portion and spaced apart a distance not larger than a width of the channel and not smaller than a width of a graphite heating element. The parallel portions each have a hole therein, and the centers of the holes define an axis that is substantially parallel to the base portion. An aluminum oxide ceramic retaining pin extends through the holes in the parallel portions and into a hole in a wall of the channel to retain the U-shaped member in the channel and to support the graphite heating element. The graphite heating element is confined by the parallel portions of tantalum, the base portion of tantalum, and the retaining pin. A tantalum tube surrounds the retaining pin between the parallel portions of tantalum. 6 figs.
Visualization and Tracking of Parallel CFD Simulations
NASA Technical Reports Server (NTRS)
Vaziri, Arsi; Kremenetsky, Mark
1995-01-01
We describe a system for interactive visualization and tracking of a 3-D unsteady computational fluid dynamics (CFD) simulation on a parallel computer. CM/AVS, a distributed, parallel implementation of a visualization environment (AVS), runs on the CM-5 parallel supercomputer. A CFD solver is run as a CM/AVS module on the CM-5. Data communication between the solver, other parallel visualization modules, and a graphics workstation, which is running AVS, is handled by CM/AVS. Partitioning of the visualization task between the CM-5 and the workstation can be done interactively in the visual programming environment provided by AVS. Flow solver parameters can also be altered by programmable interactive widgets. This system partially removes the requirement of storing large solution files at frequent time steps, a characteristic of the traditional 'simulate → store → visualize' post-processing approach.
Building a delta: Interactions between water, sediment, and vegetation in an experimental system
NASA Astrophysics Data System (ADS)
Piliouras, A.; Kim, W.; Carlson, B.
2013-12-01
Vegetation is an important part of morphodynamics in river deltas, but it has not been thoroughly investigated in physical delta models. We conducted a set of experiments in the Sediment Transport and Earth-surface Processes (STEP) Basin at the University of Texas at Austin to examine the effects of vegetation on delta growth and dynamics. One experiment was conducted without vegetation (Run 1), and four (Runs 2-5) were conducted using alfalfa (Medicago sativa) as a proxy for riparian vegetation, one of which included cycles between flood and normal flow discharges (Run 5). Results indicate that vegetation increased sediment trapping on the delta topset, increasing delta slope and decreasing progradation rate as compared to the unvegetated experiment. Vegetation also caused a lack of channelization when the topset reached 20% plant cover, after which progradational delta lobes were no longer evident. Discharge fluctuations in Run 5, however, led to more topset reworking, resulting in lower vegetation density (< 20%) and the persistence of highly incisional channels. Experiments run only at flood stage resulted in consistently net depositional deltas with very little channel incision, regardless of the amount of vegetation. The addition of water and sediment discharge fluctuations in Run 5, however, created a cyclic pattern between periods of topset aggradation and periods of channel incision that were net erosional. We conclude that there is a two-way interaction between the vegetation and the channels through discharge fluctuations that aid in delta growth. (1) During floods, vegetation acts as an efficient sediment trapper on the floodplain to aid in topset aggradation and maintain channel relief. During normal flow, vegetation also stabilizes channel banks, allowing channels to focus their flow and erode sediment from the bed.
(2) During floods, channels transport sediment to the shoreline to create new deposits that can be colonized by vegetation and deliver sediment to the topset to increase vegetation elevation. During normal flow, channels rework the delta topset and remove seeds from occupied flow paths.
Implementation and performance of parallel Prolog interpreter
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wei, S.; Kale, L.V.; Balkrishna, R.
1988-01-01
In this paper, the authors discuss the implementation of a parallel Prolog interpreter on different parallel machines. The implementation is based on the REDUCE--OR process model which exploits both AND and OR parallelism in logic programs. It is machine independent as it runs on top of the chare-kernel--a machine-independent parallel programming system. The authors also give the performance of the interpreter running a diverse set of benchmark programs on parallel machines including shared memory systems (an Alliant FX/8, a Sequent, and a MultiMax) and a non-shared memory system (an Intel iPSC/32 hypercube), in addition to its performance on a multiprocessor simulation system.
Multi-gigabit optical interconnects for next-generation on-board digital equipment
NASA Astrophysics Data System (ADS)
Venet, Norbert; Favaro, Henri; Sotom, Michel; Maignan, Michel; Berthon, Jacques
2017-11-01
Parallel optical interconnects are experimentally assessed as a technology that may offer the high-throughput data communication capabilities required by next-generation on-board digital processing units. An optical backplane interconnect was breadboarded, on the basis of a digital transparent processor that provides flexible connectivity and variable bandwidth in telecom missions with multi-beam antenna coverage. The unit selected for the demonstration required that tens of Gbit/s or more be supported by the backplane. The demonstration made use of commercial parallel optical link modules at 850 nm wavelength, with 12 channels running at up to 2.5 Gbit/s. A flexible optical fibre circuit was developed to route board-to-board connections. It was plugged into the optical transmitter and receiver modules through 12-fibre MPO connectors. BER below 10⁻¹⁴ and optical link budgets in excess of 12 dB were measured, which would make it possible to integrate broadcasting. Integration of the optical backplane interconnect was successfully demonstrated by validating the overall digital processor functionality.
Multi-gigabit optical interconnects for next-generation on-board digital equipment
NASA Astrophysics Data System (ADS)
Venet, Norbert; Favaro, Henri; Sotom, Michel; Maignan, Michel; Berthon, Jacques
2004-06-01
Parallel optical interconnects are experimentally assessed as a technology that may offer the high-throughput data communication capabilities required by next-generation on-board digital processing units. An optical backplane interconnect was breadboarded, on the basis of a digital transparent processor that provides flexible connectivity and variable bandwidth in telecom missions with multi-beam antenna coverage. The unit selected for the demonstration required that tens of Gbit/s or more be supported by the backplane. The demonstration made use of commercial parallel optical link modules at 850 nm wavelength, with 12 channels running at up to 2.5 Gbit/s. A flexible optical fibre circuit was developed to route board-to-board connections. It was plugged into the optical transmitter and receiver modules through 12-fibre MPO connectors. BER below 10⁻¹⁴ and optical link budgets in excess of 12 dB were measured, which would make it possible to integrate broadcasting. Integration of the optical backplane interconnect was successfully demonstrated by validating the overall digital processor functionality.
NASA Astrophysics Data System (ADS)
Kim, Cheolhwan; Kim, Kyu-Jung; Ha, Man Yeong
To investigate the possibility of the portable application of a direct borohydride fuel cell (DBFC), weight reduction of the stack and high stacking of the cells are investigated for practical running conditions. For weight reduction, carbon graphite is adopted as the bipolar plate material even though it has disadvantages in tight stacking, which results in stacking loss from insufficient material strength. For high stacking, it is essential to have a uniform fuel distribution among cells and channels to maintain equal electric load on each cell. In particular, the design of the anode channel is important because active hydrogen generation causes non-uniformity in the fuel flow-field of the cells and channels. To reduce the disadvantages of stacking force margin and fuel maldistribution, an O-ring type sealing system with an internal manifold and a parallel anode channel design is adopted, and the characteristics of a single-cell and a five-cell fuel cell stack are analyzed. By adopting carbon graphite, the stack weight can be reduced by a factor of 4.2, with a 12% performance degradation from the insufficient stacking force. When cells are stacked, the performance exceeds the single-cell performance because the stack temperature increases as the narrow stacking of cells reduces the radiation area.
Silver indium diphosphate, AgInP(2)O(7).
Zouihri, Hafid; Saadi, Mohamed; Jaber, Boujemaa; El Ammari, Lehcen
2010-12-18
Polycrystalline material of the title compound, AgInP(2)O(7), was synthesized by traditional high-temperature solid-state methods and single crystals were grown from the melt of a mixture of AgInP(2)O(7) and B(2)O(3) as flux in a platinum crucible. The structure consists of InO(6) octahedra, which are corner-shared to PO(4) tetrahedra into a three-dimensional network with hexagonal channels running parallel to the c axis. The silver cation, located in the channel, is bonded to seven O atoms of the [InP(2)O(7)] framework with Ag-O distances ranging from 2.370 (2) to 3.015 (2) Å. The P(2)O(7) diphosphate anion is characterized by a P-O-P angle of 137.27 (9)° and a nearly eclipsed conformation. AgInP(2)O(7) is isotypic with the M(I)FeP(2)O(7) (M(I) = Na, K, Rb, Cs and Ag) diphosphate family.
MPI_XSTAR: MPI-based Parallelization of the XSTAR Photoionization Program
NASA Astrophysics Data System (ADS)
Danehkar, Ashkbiz; Nowak, Michael A.; Lee, Julia C.; Smith, Randall K.
2018-02-01
We describe a program for the parallel implementation of multiple runs of XSTAR, a photoionization code that is used to predict the physical properties of an ionized gas from its emission and/or absorption lines. The parallelization program, called MPI_XSTAR, has been developed and implemented in the C++ language by using the Message Passing Interface (MPI) protocol, a conventional standard of parallel computing. We have benchmarked parallel multiprocessing executions of XSTAR, using MPI_XSTAR, against a serial execution of XSTAR, in terms of the parallelization speedup and the computing resource efficiency. Our experience indicates that the parallel execution runs significantly faster than the serial execution; however, the efficiency in terms of computing resource usage decreases as the number of processors used in the parallel computation increases.
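The two benchmark quantities named above are conventionally defined as speedup S = T_serial / T_parallel and efficiency E = S / p for p processors. A minimal sketch of that arithmetic follows; the function names and the timings are hypothetical illustrations, not measurements from MPI_XSTAR.

```python
def speedup(t_serial, t_parallel):
    """Conventional parallel speedup: serial wall time over parallel wall time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Speedup per processor; 1.0 would mean ideal linear scaling."""
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical wall-clock times in seconds, keyed by processor count.
T_SERIAL = 1000.0
timings = {2: 520.0, 8: 160.0, 32: 62.5}

for procs, t_par in sorted(timings.items()):
    s = speedup(T_SERIAL, t_par)
    e = efficiency(T_SERIAL, t_par, procs)
    print(f"p={procs:2d}  speedup={s:5.2f}  efficiency={e:4.2f}")
```

With these made-up timings the speedup keeps growing while the efficiency falls, which is the qualitative pattern the abstract reports.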
Run-time parallelization and scheduling of loops
NASA Technical Reports Server (NTRS)
Saltz, Joel H.; Mirchandaney, Ravi; Crowley, Kay
1991-01-01
Run-time methods are studied to automatically parallelize and schedule iterations of a do loop in certain cases where compile-time information is inadequate. The methods presented involve execution time preprocessing of the loop. At compile-time, these methods set up the framework for performing a loop dependency analysis. At run-time, wavefronts of concurrently executable loop iterations are identified. Using this wavefront information, loop iterations are reordered for increased parallelism. Symbolic transformation rules are used to produce: inspector procedures that perform execution time preprocessing, and executors or transformed versions of source code loop structures. These transformed loop structures carry out the calculations planned in the inspector procedures. Performance results are presented from experiments conducted on the Encore Multimax. These results illustrate that run-time reordering of loop indexes can have a significant impact on performance.
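The inspector/executor idea described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' generated code: the loop body `y[i] = a[i] + y[dep[i]]`, the convention that `dep[i] == -1` means "no dependence", and the function names are all hypothetical. The inspector assigns each iteration a wavefront number from the run-time dependence array; the executor then processes wavefronts in order, and iterations inside one wavefront are independent of each other.

```python
def inspector(dep):
    """Assign a wavefront number to each iteration.

    dep[i] is the earlier iteration whose result iteration i reads,
    or -1 if iteration i reads no earlier result (assumed: dep[i] < i).
    """
    wavefront = [0] * len(dep)
    for i, d in enumerate(dep):
        if d >= 0:
            wavefront[i] = wavefront[d] + 1
    return wavefront


def executor(a, dep, wavefront):
    """Run the loop body wavefront by wavefront."""
    y = [0] * len(a)
    for level in range(max(wavefront) + 1):
        members = [i for i, w in enumerate(wavefront) if w == level]
        # Iterations in one wavefront could run concurrently; reversed order
        # is used here only to show that their relative order does not matter.
        for i in reversed(members):
            y[i] = a[i] + (y[dep[i]] if dep[i] >= 0 else 0)
    return y


def sequential(a, dep):
    """Reference result from the original loop order."""
    y = [0] * len(a)
    for i in range(len(a)):
        y[i] = a[i] + (y[dep[i]] if dep[i] >= 0 else 0)
    return y


a = [1, 2, 3, 4, 5, 6]
dep = [-1, 0, -1, 1, 2, 0]
wf = inspector(dep)
print(wf)
print(executor(a, dep, wf) == sequential(a, dep))
```

Because the wavefront array depends only on `dep`, it can be computed once and reused whenever the loop is executed again with the same dependence structure.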
Behavior of Caulobacter Crescentus Diagnosed Using a 3-Channel Microfluidic Device
NASA Astrophysics Data System (ADS)
Tang, Jay; Morse, Michael; Colin, Remy; Wilson, Laurence
2015-03-01
Many motile microorganisms are able to detect chemical gradients in their surroundings in order to bias their motion towards more favorable conditions. We study the biased motility of Caulobacter crescentus, a singly flagellated bacterium that alternates between forward and backward swimming, driven by its flagellar motor, which switches rotation direction. We observe the swimming patterns of C. crescentus in an oxygen gradient, which is established by flowing atmospheric air and pure nitrogen through a microfluidic device with three parallel channels. In this setup, oxygen diffuses through the PDMS device and the bacterial medium, creating a linear gradient. Using low-magnification, dark-field microscopy, individual cells are tracked over a large field of view, with particular interest in the cells' motion relative to the oxygen gradient. Utilizing observable differences between backward and forward swimming motion, motor switching events can be identified. By analyzing the run time intervals between motor switches as a function of a cell's local oxygen level, we demonstrate that C. crescentus displays aerotactic behavior by extending forward swimming run times while moving up an oxygen gradient, resulting in directed motility towards oxygen sources. Additionally, the motor switching response is sensitive to both the steepness of the gradient experienced and background oxygen levels, with cells exhibiting a logarithmic response to oxygen levels. Work funded by the United States National Science Foundation and by the Rowland Institute at Harvard University.
Design of k-Space Channel Combination Kernels and Integration with Parallel Imaging
Beatty, Philip J.; Chang, Shaorong; Holmes, James H.; Wang, Kang; Brau, Anja C. S.; Reeder, Scott B.; Brittain, Jean H.
2014-01-01
Purpose: In this work, a new method is described for producing local k-space channel combination kernels using a small amount of low-resolution multichannel calibration data. Additionally, this work describes how these channel combination kernels can be combined with local k-space unaliasing kernels produced by the calibration phase of parallel imaging methods such as GRAPPA, PARS and ARC. Methods: Experiments were conducted to evaluate both the image quality and computational efficiency of the proposed method compared to a channel-by-channel parallel imaging approach with image-space sum-of-squares channel combination. Results: Results indicate comparable image quality overall, with some very minor differences seen in reduced field-of-view imaging. It was demonstrated that this method enables a speed up in computation time on the order of 3–16X for 32-channel data sets. Conclusion: The proposed method enables high quality channel combination to occur earlier in the reconstruction pipeline, reducing computational and memory requirements for image reconstruction. PMID:23943602
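The baseline used in the comparison above, image-space sum-of-squares channel combination, takes the square root of the summed squared magnitudes of the per-channel images at each pixel. A minimal sketch of that reference step only (not of the proposed k-space kernel method) follows; the function name and the toy two-channel data are hypothetical.

```python
import math


def sum_of_squares(channel_images):
    """Combine per-channel complex images pixel by pixel.

    channel_images: list of channels, each a flat list of complex pixels
    of equal length. Returns a list of real magnitudes.
    """
    n_pixels = len(channel_images[0])
    combined = []
    for p in range(n_pixels):
        power = sum(abs(channel[p]) ** 2 for channel in channel_images)
        combined.append(math.sqrt(power))
    return combined


# Toy data: two channels, two pixels each.
channels = [
    [3 + 0j, 0 + 0j],
    [0 + 4j, 1 + 0j],
]
print(sum_of_squares(channels))
```

Each channel image must be fully reconstructed before this step, which is why moving the combination earlier in the pipeline, as the abstract proposes, can reduce computation and memory.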
NASA Astrophysics Data System (ADS)
Decyk, Viktor K.; Dauger, Dean E.
We have constructed a parallel cluster consisting of Apple Macintosh G4 computers running both Classic Mac OS as well as the Unix-based Mac OS X, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. Unlike other Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
Microfluidic channel flow cell for simultaneous cryoelectrochemical electron spin resonance.
Wain, Andrew J; Compton, Richard G; Le Roux, Rudolph; Matthews, Sinead; Fisher, Adrian C
2007-03-01
A novel microfluidic electrochemical channel flow cell has been constructed for in situ operation in a cylindrical TE011 resonant ESR cavity under variable temperature conditions. The cell has a U-tube configuration, consisting of an inlet and outlet channel which run parallel and contain evaporated gold film working, pseudo-reference, and counter electrodes. This geometry was employed to permit use in conjunction with variable temperature apparatus which does not allow a flow-through approach. The cell is characterized qualitatively and quantitatively using the one-electron reduction of p-bromonitrobenzene in acetonitrile at room temperature as a model system, and the ESR signal-flow rate response is validated by use of three-dimensional digital simulation of the concentration profile for a stable electrogenerated radical species under hydrodynamic conditions. The cell is then used to obtain ESR spectra for a number of radical species in acetonitrile at 233 K, including the radical anions of m- and p-iodonitrobenzene, o-bromonitrobenzene, and m-nitrobenzyl chloride, the latter three being unstable at room temperature. Spectra are also presented for the radical anion of 2-chloranthraquinone and the crystal violet radical, which display improved resolution at low temperatures.
Winter habitat use by cutthroat trout in the Snake River near Jackson, Wyoming
Harper, D.D.; Farag, A.M.
2004-01-01
Winter habitat use by Yellowstone cutthroat trout Oncorhynchus clarki bouvieri was monitored with radiotelemetry during November-March 1998-2001 in channelized and unaltered sections of the Snake River near Jackson, Wyoming. The use of run and off-channel pool habitat was significantly correlated to water temperature; run use was most frequent when mean water temperature exceeded 1.0°C, and off-channel pool use was greatest when mean water temperature was below 1.0°C. Available habitat was surveyed during winter 1999-2000 and was compared with actual habitat use. This comparison indicated that cutthroat trout avoided riffle habitat, selected deep runs, and strongly selected off-channel pool habitat. Large, deep, off-channel pools with groundwater influence were uncommon in the study area but were frequently selected as over-wintering habitat in the channelized section during all three study years. During 2000-2001, mainstem water temperatures were significantly colder than in 1998-1999 or 1999-2000, and anchor ice was observed more frequently in 2000-2001 than in 1998-1999 or 1999-2000 (on 18 d versus 5 d and 3 d, respectively). Mean water temperatures in off-channel pools were not significantly different among years. Depth and shelf ice were most frequently identified as cover elements in the channelized section. Run habitat was more common and used more frequently upstream of the channelized section. Large woody debris was more common and selected more frequently as cover in the unaltered section than in the channelized section.
View looking SW at brick retaining wall running parallel to ...
View looking SW at brick retaining wall running parallel to Jones Street showing bricked up storage vaults - Central of Georgia Railway, Savannah Repair Shops & Terminal Facilities, Brick Storage Vaults under Jones Street, Bounded by West Broad, Jones, West Boundary & Hull Streets, Savannah, Chatham County, GA
How to Build an AppleSeed: A Parallel Macintosh Cluster for Numerically Intensive Computing
NASA Astrophysics Data System (ADS)
Decyk, V. K.; Dauger, D. E.
We have constructed a parallel cluster consisting of a mixture of Apple Macintosh G3 and G4 computers running the Mac OS, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
Creating a Parallel Version of VisIt for Microsoft Windows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Whitlock, B J; Biagas, K S; Rawson, P L
2011-12-07
VisIt is a popular, free interactive parallel visualization and analysis tool for scientific data. Users can quickly generate visualizations from their data, animate them through time, manipulate them, and save the resulting images or movies for presentations. VisIt was designed from the ground up to work on many scales of computers from modest desktops up to massively parallel clusters. VisIt is comprised of a set of cooperating programs. All programs can be run locally or in client/server mode in which some run locally and some run remotely on compute clusters. The VisIt program most able to harness today's computing power is the VisIt compute engine. The compute engine is responsible for reading simulation data from disk, processing it, and sending results or images back to the VisIt viewer program. In a parallel environment, the compute engine runs several processes, coordinating using the Message Passing Interface (MPI) library. Each MPI process reads some subset of the scientific data and filters the data in various ways to create useful visualizations. By using MPI, VisIt has been able to scale well into the thousands of processors on large computers such as dawn and graph at LLNL. The advent of multicore CPUs has made parallelism the 'new' way to achieve increasing performance. With today's computers having at least 2 cores and in many cases up to 8 and beyond, it is more important than ever to deploy parallel software that can use that computing power not only on clusters but also on the desktop. We have created a parallel version of VisIt for Windows that uses Microsoft's MPI implementation (MSMPI) to process data in parallel on the Windows desktop as well as on a Windows HPC cluster running Microsoft Windows Server 2008. Initial desktop parallel support for Windows was deployed in VisIt 2.4.0. Windows HPC cluster support has been completed and will appear in the VisIt 2.5.0 release. We plan to continue supporting parallel VisIt on Windows so our users will be able to take full advantage of their multicore resources.
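The statement that each MPI process reads some subset of the data can be illustrated with a simple block decomposition. The abstract does not say how VisIt assigns data to processes, so the scheme below is an assumption chosen for illustration, and `block_partition` is a hypothetical helper; ranks are simulated with a plain loop so the sketch runs without an MPI installation.

```python
def block_partition(n_items, n_ranks, rank):
    """Return the contiguous range of item indices owned by one rank.

    Items are split as evenly as possible; the first (n_items % n_ranks)
    ranks each receive one extra item.
    """
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    count = base + (1 if rank < extra else 0)
    return range(start, start + count)


# Simulate 4 ranks sharing 10 data blocks.
N_BLOCKS = 10
N_RANKS = 4
for rank in range(N_RANKS):
    owned = list(block_partition(N_BLOCKS, N_RANKS, rank))
    print(f"rank {rank} reads blocks {owned}")
```

In a real MPI program each process would call the same function with its own rank and the communicator size, so no communication is needed to agree on the assignment.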
Nadkarni, P. M.; Miller, P. L.
1991-01-01
A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations. PMID:1807632
Im, Hyungsoon; Lesuffleur, Antoine; Lindquist, Nathan C.; Oh, Sang-Hyun
2009-01-01
We present nanohole arrays in a gold film integrated with a 6-channel microfluidic chip for parallel measurements of molecular binding kinetics. Surface plasmon resonance effects in the nanohole arrays enable real-time label-free measurements of molecular binding events in each channel, while adjacent negative reference channels can record measurement artifacts such as bulk solution index changes, temperature variations, or changing light absorption in the liquid. Using this platform, streptavidin-biotin specific binding kinetics are measured at various concentrations with negative controls. A high-density microarray of 252 biosensing pixels is also demonstrated with a packing density of 10⁶ sensing elements/cm², which can potentially be coupled with a massively parallel array of microfluidic channels for protein microarray applications. PMID:19284776
Ji, Jim; Wright, Steven
2005-01-01
Parallel imaging using multiple phased-array coils and receiver channels has become an effective approach to high-speed magnetic resonance imaging (MRI). To obtain high spatiotemporal resolution, the k-space is subsampled and later interpolated using multiple channel data. Higher subsampling factors result in faster image acquisition. However, the subsampling factors are upper-bounded by the number of parallel channels. Phase constraints have been previously proposed to overcome this limitation with some success. In this paper, we demonstrate that in certain applications it is possible to obtain acceleration factors potentially up to twice the number of channels by using a real image constraint. Data acquisition and processing methods to manipulate and estimate the image phase information are presented for improving image reconstruction. In-vivo brain MRI experimental results show that accelerations up to 6 are feasible with 4-channel data.
Kowalski, Erik; Li, Jing Xian
2016-11-01
This study investigated the normal and parallel ground reaction forces during downhill and uphill running in habitual forefoot strike and habitual rearfoot strike (RFS) runners. Fifteen habitual forefoot strike and 15 habitual RFS recreational male runners ran at 3 m/s ± 5% during level, uphill and downhill overground running on a ramp mounted at 6° and 9°. Results showed that forefoot strike runners had no visible impact peak in all running conditions, while the impact peaks only decreased during the uphill conditions in RFS runners. Active peaks decreased during the downhill conditions in forefoot strike runners while active loading rates increased during downhill conditions in RFS runners. Compared to the level condition, parallel braking peaks were larger during downhill conditions and parallel propulsive peaks were larger during uphill conditions. Combined with previous biomechanics studies, our findings suggest that forefoot strike running may be an effective strategy to reduce impacts, especially during downhill running. These findings may have further implications towards injury management and prevention.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aoki, Kenji
A read/write head for a magnetic tape includes an elongated chip assembly and a tape running surface formed in the longitudinal direction of the chip assembly. A pair of substantially spaced parallel read/write gap lines for supporting read/write elements extend longitudinally along the tape running surface of the chip assembly. Also, at least one groove is formed on the tape running surface on both sides of each of the read/write gap lines and extends substantially parallel to the read/write gap lines.
NASA Astrophysics Data System (ADS)
Obbade, S.; Dion, C.; Rivenet, M.; Saadi, M.; Abraham, F.
2004-06-01
A new sodium uranyl vanadate Na(UO₂)₄(VO₄)₃ has been synthesized by solid-state reaction and its structure determined from single-crystal X-ray diffraction data. It crystallizes in the tetragonal symmetry with space group I4₁/amd and following cell parameters: a=7.2267(4) Å and c=34.079(4) Å, V=1779.8(2) Å³, Z=4 with ρmes=5.36(3) g/cm³ and ρcal=5.40(2) g/cm³. A full-matrix least-squares refinement on the basis of F² yielded R1=0.028 and wR2=0.056 for 52 parameters with 474 independent reflections with I ≥ 2σ(I) collected on a BRUKER AXS diffractometer with Mo Kα radiation and a CCD detector. The crystal structure is characterized by ²∞[(UO₂)₂(VO₄)] sheets parallel to (001) formed by corner-shared UO₆ distorted octahedra and V(2)O₄ tetrahedra, connected by V(1)O₄ tetrahedra to ¹∞[UO₅]⁴⁻ chains of edge-shared UO₇ pentagonal bipyramids alternately parallel to the a- and b-axis. The resulting three-dimensional framework creates mono-dimensional channels running down the a- and b-axis formed by face-shared oxygen octahedra half occupied by Na. The powder of Li analog compound Li(UO₂)₄(VO₄)₃ has been synthesized by solid-state reaction. The two compounds exhibit high mobility of the alkaline ions within the two-dimensional network of non-intersecting channels.
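The reported cell volume and calculated density can be cross-checked from the cell parameters with two standard relations: V = a²c for a tetragonal cell and ρ = Z·M / (N_A·V). The sketch below is such a consistency check, not part of the original work; the rounded atomic masses and the helper names are assumptions introduced here.

```python
AVOGADRO = 6.02214076e23  # mol^-1

# Rounded standard atomic masses in g/mol (assumed values, not from the abstract).
ATOMIC_MASS = {"Na": 22.990, "U": 238.029, "V": 50.942, "O": 15.999}


def tetragonal_volume(a, c):
    """Unit-cell volume in cubic angstroms for a tetragonal cell (a = b)."""
    return a * a * c


def calculated_density(formula, z, volume_a3):
    """X-ray density in g/cm^3 from formula units per cell and cell volume."""
    molar_mass = sum(ATOMIC_MASS[element] * count for element, count in formula.items())
    volume_cm3 = volume_a3 * 1e-24  # 1 cubic angstrom = 1e-24 cm^3
    return z * molar_mass / (AVOGADRO * volume_cm3)


# Na(UO2)4(VO4)3 contains 4*2 + 3*4 = 20 oxygen atoms per formula unit.
formula = {"Na": 1, "U": 4, "V": 3, "O": 20}
volume = tetragonal_volume(7.2267, 34.079)
density = calculated_density(formula, z=4, volume_a3=volume)
print(f"V = {volume:.1f} A^3, rho_calc = {density:.2f} g/cm^3")
```

Under these assumptions the computed values should land close to the reported V = 1779.8(2) Å³ and ρcal = 5.40(2) g/cm³.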
A parallel finite element simulator for ion transport through three-dimensional ion channel systems.
Tu, Bin; Chen, Minxin; Xie, Yan; Zhang, Linbo; Eisenberg, Bob; Lu, Benzhuo
2013-09-15
A parallel finite element simulator, ichannel, is developed for ion transport through three-dimensional ion channel systems that consist of protein and membrane. The coordinates of heavy atoms of the protein are taken from the Protein Data Bank and the membrane is represented as a slab. The simulator contains two components: a parallel adaptive finite element solver for a set of Poisson-Nernst-Planck (PNP) equations that describe the electrodiffusion process of ion transport, and a mesh generation tool chain for ion channel systems, which is an essential component for the finite element computations. The finite element method has advantages in modeling irregular geometries and complex boundary conditions. We have built a tool chain to get the surface and volume mesh for ion channel systems, which consists of a set of mesh generation tools. The adaptive finite element solver in our simulator is implemented using the parallel adaptive finite element package Parallel Hierarchical Grid (PHG) developed by one of the authors, which provides the capability of doing large scale parallel computations with high parallel efficiency and the flexibility of choosing high order elements to achieve high order accuracy. The simulator is applied to a real transmembrane protein, the gramicidin A (gA) channel protein, to calculate the electrostatic potential, ion concentrations and I - V curve, with which both primitive and transformed PNP equations are studied and their numerical performances are compared. To further validate the method, we also apply the simulator to two other ion channel systems, the voltage dependent anion channel (VDAC) and α-Hemolysin (α-HL). The simulation results agree well with Brownian dynamics (BD) simulation results and experimental results. Moreover, because ionic finite size effects can be included in PNP model now, we also perform simulations using a size-modified PNP (SMPNP) model on VDAC and α-HL. 
It is shown that the size effects in SMPNP can effectively lead to reduced current in the channel, and the results are closer to BD simulation results. Copyright © 2013 Wiley Periodicals, Inc.
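For orientation, the PNP system referred to above is commonly written as a Poisson equation for the electrostatic potential coupled to one Nernst-Planck equation per ion species. The form below is the generic textbook statement; the symbols (permittivity ε, potential φ, fixed charge density ρ_f, concentrations c_i, diffusion coefficients D_i, charges q_i, thermal energy k_B T) are notation assumed here, and the paper's primitive, transformed, and size-modified formulations may differ in detail.

```latex
\begin{align}
-\nabla \cdot \bigl(\epsilon \nabla \phi\bigr) &= \rho_f + \sum_{i} q_i c_i, \\
\frac{\partial c_i}{\partial t} &= \nabla \cdot \Bigl[ D_i \Bigl( \nabla c_i + \frac{q_i}{k_B T}\, c_i \nabla \phi \Bigr) \Bigr].
\end{align}
```

At steady state the left-hand side of the second equation vanishes, and the current through the channel follows from integrating the ionic fluxes over a cross-section, which is how an I-V curve is obtained by repeating the solve at several applied voltages.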
Nematic liquid crystals on sinusoidal channels: the zigzag instability.
Silvestre, Nuno M; Romero-Enrique, Jose M; Telo da Gama, Margarida M
2017-01-11
Substrates which are chemically or topographically patterned induce a variety of liquid crystal textures. The response of the liquid crystal to competing surface orientations, typical of patterned substrates, is determined by the anisotropy of the elastic constants and the interplay of the relevant length scales, such as the correlation length and the surface geometrical parameters. Transitions between different textures, usually with different symmetries, may occur under a wide range of conditions. We use the Landau-de Gennes free energy to investigate the texture of nematics in sinusoidal channels with parallel anchoring bounded by nematic-air interfaces that favour perpendicular (homeotropic) anchoring. In micron-size channels, 5CB was observed to exhibit a non-trivial texture characterized by a disclination line, within the channel, which is broken into a zigzag pattern. Our calculations reveal that when the elastic anisotropy of the nematic does not favour twist distortions, the defect is a straight disclination line that runs along the channel; it breaks into a zigzag pattern with a characteristic period when the twist elastic constant becomes sufficiently small compared to the splay and bend constants. The transition occurs through a twist instability that drives the defect line to rotate from its original position. The interplay between the energetically favourable twist distortions that induce the defect rotation and the liquid crystal anchoring at the surfaces leads to the zigzag pattern. We investigate in detail the dependence of the periodicity of the zigzag pattern on the geometrical parameters of the sinusoidal channels, which, in line with the experimental results, is found to be non-linear.
Run-time parallelization and scheduling of loops
NASA Technical Reports Server (NTRS)
Saltz, Joel H.; Mirchandaney, Ravi; Crowley, Kay
1990-01-01
Run time methods are studied to automatically parallelize and schedule iterations of a do loop in certain cases, where compile-time information is inadequate. The methods presented involve execution time preprocessing of the loop. At compile-time, these methods set up the framework for performing a loop dependency analysis. At run time, wave fronts of concurrently executable loop iterations are identified. Using this wavefront information, loop iterations are reordered for increased parallelism. Symbolic transformation rules are used to produce: inspector procedures that perform execution time preprocessing and executors or transformed versions of source code loop structures. These transformed loop structures carry out the calculations planned in the inspector procedures. Performance results are presented from experiments conducted on the Encore Multimax. These results illustrate that run time reordering of loop indices can have a significant impact on performance. Furthermore, the overheads associated with this type of reordering are amortized when the loop is executed several times with the same dependency structure.
Moody, Katherine Lynn; Hollingsworth, Neal A.; Zhao, Feng; Nielsen, Jon-Fredrik; Noll, Douglas C.; Wright, Steven M.; McDougall, Mary Preston
2014-01-01
Parallel transmit is an emerging technology to address the technical challenges associated with MR imaging at high field strengths. When developing arrays for parallel transmit systems, one of the primary factors to be considered is the mechanism to manage coupling and create independently operating channels. Recent work has demonstrated the use of amplifiers to provide some or all of the channel-to-channel isolation, reducing the need for on-coil decoupling networks in a manner analogous to the use of isolation preamplifiers with receive coils. This paper discusses an eight-channel transmit/receive head array for use with an ultra-low output impedance (ULOI) parallel transmit system. The ULOI amplifiers eliminated the need for a complex lumped element network to decouple the eight-rung array. The design and construction details of the array are discussed in addition to the measurement considerations required for appropriately characterizing an array when using ULOI amplifiers. B1 maps and coupling matrices are used to verify the performance of the system. PMID:25072190
NASA Astrophysics Data System (ADS)
Moody, Katherine Lynn; Hollingsworth, Neal A.; Zhao, Feng; Nielsen, Jon-Fredrik; Noll, Douglas C.; Wright, Steven M.; McDougall, Mary Preston
2014-09-01
Parallel transmit is an emerging technology to address the technical challenges associated with MR imaging at high field strengths. When developing arrays for parallel transmit systems, one of the primary factors to be considered is the mechanism to manage coupling and create independently operating channels. Recent work has demonstrated the use of amplifiers to provide some or all of the channel-to-channel isolation, reducing the need for on-coil decoupling networks in a manner analogous to the use of isolation preamplifiers with receive coils. This paper discusses an eight-channel transmit/receive head array for use with an ultra-low output impedance (ULOI) parallel transmit system. The ULOI amplifiers eliminated the need for a complex lumped element network to decouple the eight-rung array. The design and construction details of the array are discussed in addition to the measurement considerations required for appropriately characterizing an array when using ULOI amplifiers. B1 maps and coupling matrices are used to verify the performance of the system.
IQ imbalance tolerable parallel-channel DMT transmission for coherent optical OFDMA access network
NASA Astrophysics Data System (ADS)
Jung, Sang-Min; Mun, Kyoung-Hak; Jung, Sun-Young; Han, Sang-Kook
2016-12-01
Phase diversity of coherent optical communication provides spectrally efficient higher-order modulation for optical communications. However, in-phase/quadrature (IQ) imbalance in coherent optical communication degrades transmission performance by introducing unwanted signal distortions. In a coherent optical orthogonal frequency division multiple access (OFDMA) passive optical network (PON), IQ imbalance-induced signal distortions degrade transmission performance by interferences of mirror subcarriers, inter-symbol interference (ISI), and inter-channel interference (ICI). We propose parallel-channel discrete multitone (DMT) transmission to mitigate transceiver IQ imbalance-induced signal distortions in coherent orthogonal frequency division multiplexing (OFDM) transmissions. We experimentally demonstrate the effectiveness of parallel-channel DMT transmission compared with that of OFDM transmission in the presence of IQ imbalance.
NASA Astrophysics Data System (ADS)
Higashino, Satoru; Kobayashi, Shoei; Yamagami, Tamotsu
2007-06-01
Higher data transfer rates are demanded of data storage devices as storage capacity increases. To increase the transfer rate, high-speed data processing techniques are required in read-channel devices. Parallel architectures are generally used for high-speed digital processing. We have developed a new Interpolated Timing Recovery (ITR) architecture that achieves a high data transfer rate and a wide capture range in read-channel devices for information storage channels. It facilitates parallel implementation on large-scale-integration (LSI) devices.
DREAM: An Efficient Methodology for DSMC Simulation of Unsteady Processes
NASA Astrophysics Data System (ADS)
Cave, H. M.; Jermy, M. C.; Tseng, K. C.; Wu, J. S.
2008-12-01
A technique called the DSMC Rapid Ensemble Averaging Method (DREAM) for reducing the statistical scatter in the output from unsteady DSMC simulations is introduced. During post-processing by DREAM, the DSMC algorithm is re-run multiple times over a short period before the temporal point of interest thus building up a combination of time- and ensemble-averaged sampling data. The particle data is regenerated several mean collision times before the output time using the particle data generated during the original DSMC run. This methodology conserves the original phase space data from the DSMC run and so is suitable for reducing the statistical scatter in highly non-equilibrium flows. In this paper, the DREAM-II method is investigated and verified in detail. Propagating shock waves at high Mach numbers (Mach 8 and 12) are simulated using a parallel DSMC code (PDSC) and then post-processed using DREAM. The ability of DREAM to obtain the correct particle velocity distribution in the shock structure is demonstrated and the reduction of statistical scatter in the output macroscopic properties is measured. DREAM is also used to reduce the statistical scatter in the results from the interaction of a Mach 4 shock with a square cavity and for the interaction of a Mach 12 shock on a wedge in a channel.
NASA Technical Reports Server (NTRS)
Springer, P.
1993-01-01
This paper discusses how the Cascade-Correlation algorithm was parallelized so that it could be run using the Time Warp Operating System (TWOS). TWOS is a special-purpose operating system designed to run parallel discrete event simulations with maximum efficiency on parallel or distributed computers.
NASA Astrophysics Data System (ADS)
Plaza, Antonio; Plaza, Javier; Paz, Abel
2010-10-01
Latest generation remote sensing instruments (called hyperspectral imagers) are now able to generate hundreds of images, corresponding to different wavelength channels, for the same area on the surface of the Earth. In previous work, we have reported that the scalability of parallel processing algorithms dealing with these high-dimensional data volumes is affected by the amount of data to be exchanged through the communication network of the system. However, large messages are common in hyperspectral imaging applications since processing algorithms are pixel-based, and each pixel vector to be exchanged through the communication network is made up of hundreds of spectral values. Thus, decreasing the amount of data to be exchanged could improve the scalability and parallel performance. In this paper, we propose a new framework based on intelligent utilization of wavelet-based data compression techniques for improving the scalability of a standard hyperspectral image processing chain on heterogeneous networks of workstations. This type of parallel platform is quickly becoming a standard in hyperspectral image processing due to the distributed nature of collected hyperspectral data as well as its flexibility and low cost. Our experimental results indicate that adaptive lossy compression can lead to improvements in the scalability of the hyperspectral processing chain without sacrificing analysis accuracy, even at sub-pixel precision levels.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lichtner, Peter C.; Hammond, Glenn E.; Lu, Chuan
PFLOTRAN solves a system of generally nonlinear partial differential equations describing multi-phase, multicomponent and multiscale reactive flow and transport in porous materials. The code is designed to run on massively parallel computing architectures as well as workstations and laptops (e.g. Hammond et al., 2011). Parallelization is achieved through domain decomposition using the PETSc (Portable Extensible Toolkit for Scientific Computation) libraries for the parallelization framework (Balay et al., 1997). PFLOTRAN has been developed from the ground up for parallel scalability and has been run on up to 2^18 processor cores with problem sizes up to 2 billion degrees of freedom. Written in object oriented Fortran 90, the code requires the latest compilers compatible with Fortran 2003. At the time of this writing this requires gcc 4.7.x, Intel 12.1.x and PGC compilers. As a requirement of running problems with a large number of degrees of freedom, PFLOTRAN allows reading input data that is too large to fit into memory allotted to a single processor core. The current limitation to the problem size PFLOTRAN can handle is the limitation of the HDF5 file format used for parallel IO to 32 bit integers. Noting that 2^32 = 4,294,967,296, this gives an estimate of the maximum problem size that can be currently run with PFLOTRAN. Hopefully this limitation will be remedied in the near future.
NASA Astrophysics Data System (ADS)
Daubner, Tomas; Kizhofer, Jens; Dinulescu, Mircea
2018-06-01
This article describes an experimental investigation in the near field of five parallel plane jets. The study applies 2D Particle Image Velocimetry (PIV) for ventilated and unventilated jets, where ventilated means exiting into a duct with expansion ratio 3.5 and unventilated means exiting to the free atmosphere. Results are presented for Reynolds numbers 1408, 5857 and 10510. The Reynolds number is calculated for the middle channel and is based on the height of the nozzle (channel) equivalent diameter 2h. All characteristic regions of the methodology to describe multiple interacting jets are observed by the PIV measurements: converging, merging and combined. Each of the five parallel channels has an aspect ratio of 25, defined as nozzle width (w) to height (h). The channels have a length of 185 times the channel height, guaranteeing a fully developed velocity profile at the exit from the channel. Spacing between the single plane jets is 3 times the channel height. The near field of multiple mixing jets depends on the outlet nozzle geometry. A blunt nozzle geometry (sudden contraction) was chosen.
NASA Astrophysics Data System (ADS)
Bao, Xiurong; Zhao, Qingchun; Yin, Hongxi; Qin, Jie
2018-05-01
In this paper, an all-optical parallel reservoir computing (RC) system with two channels for optical packet header recognition is proposed and simulated. It is based on a semiconductor ring laser (SRL) with bidirectional light paths. The parallel optical loops are built through the cross-feedback of the bidirectional light paths, where every optical loop can independently recognize each injected optical packet header. Two input signals are mapped and recognized simultaneously by training the all-optical parallel reservoir, which is attributed to the nonlinear states in the laser. Recognition of optical packet headers from 4 bits to 32 bits on both channels is achieved in simulation by optimizing the system parameters, and the optimal recognition error ratio is 0. Since this structure can be combined with a wavelength division multiplexing (WDM) optical packet switching network, the wavelength of each channel of optical packet headers for recognition can be different, and a better recognition result can be obtained.
NASA Astrophysics Data System (ADS)
Garcia, Marga; Alonso, Belén; Tomas Vazquez, Juan; Ercilla, Gemma; Palomino, Desirée; Estrada, Ferran; Fernandez Puga, Ma Carmen; Lopez Gonzalez, Nieves; Roque, Cristina
2014-05-01
The Gulf of Cadiz records the interplay of a variety of sedimentary processes related to the flow of the Mediterranean Outflow Water (MOW) exiting the Mediterranean Sea, with downslope sedimentary processes and the topography of the region. This work presents detailed morphological features of the Guadalquivir Ridge area, based on high resolution bathymetry and very-high resolution seismic profiles (TOPAS) acquired during the MONTERA cruise. The Guadalquivir Ridge is a SW-NE-oriented relief located on the middle slope of the Gulf of Cadiz (8º-7º10' W). It reaches minimum depths at two highs, one at the Guadalquivir Bank, at the western extreme of the ridge (275 m), and a second one close to the eastern extreme (350 m). The ridge is cut by a gap where the Diego Cao contourite moat is incised forming a narrow, 4-5 km wide, SE-NW oriented channel. It delimits two contourite sheeted drifts (SD) at the northern side of the ridge: the Faro SD at the east (~ 600 m water depth) and the Bartolomeo Dias SD, at the west (~750 m water depth). The SD are relatively flat and become progressively shallower in a SE direction towards the Guadalquivir Ridge. At the SE side of the Guadalquivir Ridge depth increases dramatically where the Huelva and Cadiz contourite channels occur. They originate from the direct erosion of the Lower Core of the MOW, running at depths of around 1200 m. The Diego Cao channel is related to the Upper Core, which runs at depths of around 800 m (Ambar and Serra, 2007). High resolution data reveal the existence of a variety of features. Semi-circular scarps, up to tens of km long, occur at the SE side of the Guadalquivir Ridge and at the SW side of the Bartolomeo Dias SD, at the rim of the Diego Cao contourite channel. Scarps occur at depths of 550 to 750 m and form steep steps of tens to hundreds of meters, and in some cases overlap one another at different depths.
The second type of feature is a series of circular to ellipse-shaped depressions identified at the NE side of the Faro SD. Depressions are a few km in diameter and up to 100 m deep, and are aligned parallel to the edge of the SD, close to the rim of the Diego Cao. Finally, a valley-shaped depression has been identified at the N side of the Guadalquivir Bank. It is about 30 km long, with incision depths of up to 200 m, and it runs parallel to the shape of the bank main relief. This work evaluates the relationship of the Lower and Upper cores of the MOW with the existing topography of the Guadalquivir Ridge as the origin of the identified morphologies, which result from the interplay of mass-wasting and contouritic processes. Bibliography: Ambar, I., Serra, N., 2007. Intermediate depth circulation: The importance of MW. Workshop on Circum-Iberia Paleoceanography and Paleoclimate, Peniche, Portugal.
Decision making by superimposing information from parallel cognitive channels
NASA Astrophysics Data System (ADS)
Aityan, Sergey K.
1993-08-01
A theory of decision making with perception through parallel information channels is presented. Decision making is considered a parallel competitive process. Every channel can provide confirmation or rejection of a decision concept. Different channels have different impacts on specific concepts, depending on the goals and individual cognitive features. All concepts are divided into semantic clusters according to the goals and the system defaults. The clusters can be alternative or complementary. 'Winner-take-all' firing of concept nodes takes place within an alternative cluster. Concepts can be independently activated in a complementary cluster. A cognitive channel affects a decision concept by sending an activating or inhibitory signal. The complementary clusters serve to build up complex concepts by superimposing activation received from various channels. Decision making is provided by the alternative clusters. Every active concept in an alternative cluster tends to suppress the competing concepts in the cluster by sending inhibitory signals to the other nodes of the cluster. The model accounts for a time delay in signal transmission between the nodes and explains the decrease in reaction time when information is confirmed by different channels and the increase in reaction time when deceiving information is received from the channels.
1986-11-01
Report Organization. .................... 7 *PART 11: CASE STUDIES .......................... 9 Teton Dam Failure Flood. ...................... 9...channel, (3) Laurel Run Dam , and (4) Stillhouse Hollow Dam . The Laurel Run and Teton case studies involved field data sets from actual dam failures. The...hypothetical prismatic channel case study used the Teton reservoir and dam data but replaced the complex Teton Valley geometry with a prismatic channel
SU-F-SPS-09: Parallel MC Kernel Calculations for VMAT Plan Improvement
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chamberlain, S; Roswell Park Cancer Institute, Buffalo, NY; French, S
Purpose: Adding kernels (small perturbations in leaf positions) to the existing apertures of VMAT control points may improve plan quality. We investigate the calculation of kernel doses using a parallelized Monte Carlo (MC) method. Methods: A clinical prostate VMAT DICOM plan was exported from Eclipse. An arbitrary control point and leaf were chosen, and a modified MLC file was created, corresponding to the leaf position offset by 0.5 cm. The additional dose produced by this 0.5 cm × 0.5 cm kernel was calculated using the DOSXYZnrc component module of BEAMnrc. A range of particle history counts was run (varying from 3 × 10^6 to 3 × 10^7); each job was split among 1, 10, or 100 parallel processes. A particle count of 3 × 10^6 was established as the lower range because it provided the minimal accuracy level. Results: As expected, an increase in particle counts linearly increases run time. For the lowest particle count, the time varied from 30 hours for the single-processor run, to 0.30 hours for the 100-processor run. Conclusion: Parallel processing of MC calculations in the EGS framework significantly decreases time necessary for each kernel dose calculation. Particle counts lower than 1 × 10^6 have too large of an error to output accurate dose for a Monte Carlo kernel calculation. Future work will investigate increasing the number of parallel processes and optimizing run times for multiple kernel calculations.
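The run times reported above can be turned into a parallel speedup and efficiency with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the only inputs taken from the abstract are the 30-hour single-process time, the 0.30-hour time, and the 100 processes.

```python
def speedup_and_efficiency(serial_hours, parallel_hours, n_procs):
    """Return (speedup, efficiency) for one parallel run.

    speedup    = serial time / parallel time
    efficiency = speedup / number of processes (1.0 means ideal scaling)
    """
    speedup = serial_hours / parallel_hours
    efficiency = speedup / n_procs
    return speedup, efficiency


# Figures quoted in the abstract for the lowest particle count.
speedup, efficiency = speedup_and_efficiency(30.0, 0.30, 100)
print(f"speedup = {speedup:.1f}x, efficiency = {efficiency:.2f}")
```

On these figures the split across 100 processes scales essentially ideally, which is what one would hope for when independent particle histories are divided among processes with little communication.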
Parallel 3D Multi-Stage Simulation of a Turbofan Engine
NASA Technical Reports Server (NTRS)
Turner, Mark G.; Topp, David A.
1998-01-01
A 3D multistage simulation of each component of a modern GE Turbofan engine has been made. An axisymmetric view of this engine is presented in the document. This includes a fan, booster rig, high pressure compressor rig, high pressure turbine rig and a low pressure turbine rig. In the near future, all components will be run in a single calculation for a solution of 49 blade rows. The simulation exploits the use of parallel computations by using two levels of parallelism. Each blade row is run in parallel and each blade row grid is decomposed into several domains and run in parallel. 20 processors are used for the 4 blade row analysis. The average passage approach developed by John Adamczyk at NASA Lewis Research Center has been further developed and parallelized. This is APNASA Version A. It is a Navier-Stokes solver using a 4-stage explicit Runge-Kutta time marching scheme with variable time steps and residual smoothing for convergence acceleration. It has an implicit K-E turbulence model which uses an ADI solver to factor the matrix. Between 50 and 100 explicit time steps are solved before a blade row body force is calculated and exchanged with the other blade rows. This outer iteration has been coined a "flip." Efforts have been made to make the solver linearly scaleable with the number of blade rows. Enough flips are run (between 50 and 200) so the solution in the entire machine is not changing. The K-E equations are generally solved every other explicit time step. One of the key requirements in the development of the parallel code was to make the parallel solution exactly (bit for bit) match the serial solution. This has helped isolate many small parallel bugs and guarantee the parallelization was done correctly. The domain decomposition is done only in the axial direction since the number of points axially is much larger than the other two directions. This code uses MPI for message passing. 
The parallel speedup of the solver portion (no I/O or body force calculation) for a grid which has 227 points axially.
a Real-Time Computer Music Synthesis System
NASA Astrophysics Data System (ADS)
Lent, Keith Henry
A real time sound synthesis system has been developed at the Computer Music Center of The University of Texas at Austin. This system consists of several stand alone processors that were constructed jointly with White Instruments in Austin. These processors can be programmed as general purpose computers, but are provided with a number of specialized interfaces including: MIDI, 8 bit parallel, high speed serial, 2 channels analog input (18 bit A/Ds, 48kHz sample rate), and 4 channels analog output (18 bit D/As). In addition, a basic music synthesis language (Music56000) has been written in assembly code. On top of this, a symbolic compiler (PatchWork) has been developed to enable algorithms which run in these processors to be created graphically. And finally, a number of efficient time domain numerical models have been developed to enable the construction, simulation, control, and synthesis of many musical acoustics systems in real time on these processors. Specifically, assembly language models for cylindrical and conical horn sections, dissipative losses, tone holes, bells, and a number of linear and nonlinear boundary conditions have been developed.
Gignac, Lynne M; Mittal, Surbhi; Bangsaruntip, Sarunya; Cohen, Guy M; Sleight, Jeffrey W
2011-12-01
The ability to prepare multiple cross-section transmission electron microscope (XTEM) samples from one XTEM sample of specific sub-10 nm features was demonstrated. Sub-10 nm diameter Si nanowire (NW) devices were initially cross-sectioned using a dual-beam focused ion beam system in a direction running parallel to the device channel. From this XTEM sample, both low- and high-resolution transmission electron microscope (TEM) images were obtained from six separate, specific site Si NW devices. The XTEM sample was then re-sectioned in four separate locations in a direction perpendicular to the device channel: 90° from the original XTEM sample direction. Three of the four XTEM samples were successfully sectioned in the gate region of the device. From these three samples, low- and high-resolution TEM images of the Si NW were taken and measurements of the NW diameters were obtained. This technique demonstrated the ability to obtain high-resolution TEM images in directions 90° from one another of multiple, specific sub-10 nm features that were spaced 1.1 μm apart.
Running Parallel Discrete Event Simulators on Sierra
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barnes, P. D.; Jefferson, D. R.
2015-12-03
In this proposal we consider porting the ROSS/Charm++ simulator and the discrete event models that run under its control so that they run on the Sierra architecture and make efficient use of the Volta GPUs.
Novel molecular targets for kRAS downregulation: promoter G-quadruplexes
2016-11-01
conditions, and described the structure as having mixed parallel/anti-parallel loops of lengths 2:8:10 in the 5’-3’ direction. Using selective small...and anti-parallel loop directionality of lengths 4:10:8 in the 5’–3’ direction, three tetrads stacked, and involving guanines in runs B, C, E, and F...a tri-stacked structure incorporating runs B, C, E and F with intervening loops of 2, 10, and 8 bases in the 5’–3’ direction. G = black circles, C
Riparian vegetation controls on braided stream dynamics
NASA Astrophysics Data System (ADS)
Gran, Karen; Paola, Chris
2001-12-01
Riparian vegetation can significantly influence the morphology of a river, affecting channel geometry and flow dynamics. To examine the effects of riparian vegetation on gravel bed braided streams, we conducted a series of physical experiments at the St. Anthony Falls Laboratory with varying densities of bar and bank vegetation. Water discharge, sediment discharge, and grain size were held constant between runs. For each run, we allowed a braided system to develop, then seeded the flume with alfalfa (Medicago sativa), allowed the seeds to grow, and then continued the run. We collected data on water depth, surface velocity, and bed elevation throughout each run using image-based techniques designed to collect data over a large spatial area with minimal disturbance to the flow. Our results show that the influence of vegetation on overall river patterns varied systematically with the spatial density of plant stems. Vegetation reduced the number of active channels and increased bank stability, leading to lower lateral migration rates, narrower and deeper channels, and increased channel relief. These effects increased with vegetation density. Vegetation influenced flow dynamics, increasing the variance of flow direction in vegetated runs and increasing scour depths through strong downwelling where the flow collided with relatively resistant banks. This oblique bank collision also provides a new mechanism for producing secondary flows. We found it to be more important than the classical curvature-driven mechanism in vegetated runs.
Riparian vegetation controls on channels formed in non-cohesive sediment
NASA Astrophysics Data System (ADS)
Gran, K.; Tal, M.; Paola, C.
2002-05-01
Riparian vegetation can significantly influence the morphology of a river, affecting channel geometry and flow dynamics. In channels formed in non-cohesive material, vegetation is the main source of bank cohesion and could affect the overall behavior of the river, potentially constraining the flow from a multi-thread channel to a single-thread channel. To examine the effects of riparian vegetation on streams formed in non-cohesive material, we conducted a series of physical experiments at the St. Anthony Falls Laboratory. The first set of experiments examines the effects of varying densities of vegetation on braided stream dynamics. Water discharge, sediment discharge, and grain size were held constant. For each run, we allowed a braided system to develop, then halved the discharge, and seeded the flume with alfalfa (Medicago sativa). After ten to fourteen days of growth, we returned the discharge to its original value and continued the run for 30-36 hours. Our results show that the influence of vegetation on the overall river pattern varied systematically with the spatial density of plant stems. The vegetation reduced the number of active channels and increased bank stability, leading to lower lateral migration rates, narrower and deeper channels, and an increase in channel relief. All these effects increased with vegetation density. Vegetation also influenced flow dynamics, increasing the variance of flow direction in the vegetated runs, and increasing scour depths through strong downwelling where the flow collided with relatively resistant banks. This oblique bank collision provides a new mechanism for producing secondary flows. We found these bank collision driven secondary flows to be more important than the classical curvature-driven mechanism in the vegetated runs. The next set of experiments examines more closely how the channel pattern evolves through time, allowing for both channel migration and successive vegetation growth. 
In these on-going experiments, vegetation is reseeded following repeat high flow events, simulating the natural process of vegetation encroachment on the floodplain and channel.
Lee, Jae H.; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T.; Seo, Youngho
2014-01-01
The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximization (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to-program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing the MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains, with the goal of eventually making it usable in a clinical setting. PMID:27081299
Lee, Jae H; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T; Seo, Youngho
2014-11-01
The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximization (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to-program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing the MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains, with the goal of eventually making it usable in a clinical setting.
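For readers unfamiliar with MLEM, the textbook multiplicative update can be sketched in a few lines. This is a serial, pure-Python illustration of the standard algorithm, not the authors' Spark/GraphX implementation; the tiny system matrix and counts are made-up toy values, and the function name `mlem` is an assumption for illustration.

```python
def mlem(A, y, n_iter=50):
    """Textbook MLEM for y ~ Poisson(A x).

    A: system matrix as a list of rows (detector bins x voxels)
    y: measured counts, one per detector bin
    Update: x_j <- x_j / sum_i A_ij * sum_i A_ij * y_i / (A x)_i
    """
    n_bins, n_vox = len(A), len(A[0])
    # Sensitivity of each voxel: column sums of A.
    sens = [sum(A[i][j] for i in range(n_bins)) for j in range(n_vox)]
    x = [1.0] * n_vox  # uniform, strictly positive starting image
    for _ in range(n_iter):
        # Forward projection of the current estimate.
        proj = [sum(A[i][j] * x[j] for j in range(n_vox)) for i in range(n_bins)]
        # Ratio of measured to estimated counts (guard against empty bins).
        ratio = [y[i] / proj[i] if proj[i] > 0 else 0.0 for i in range(n_bins)]
        # Back projection of the ratio, then multiplicative update.
        back = [sum(A[i][j] * ratio[i] for i in range(n_bins)) for j in range(n_vox)]
        x = [x[j] * back[j] / sens[j] for j in range(n_vox)]
    return x


# Toy, noiseless example: two voxels with activities 2 and 3, three bins.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
estimate = mlem(A, y)
print(estimate)
```

The forward and back projections are sparse matrix-vector products, which is the part a graph or sparse linear algebra engine would distribute in a parallel implementation.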
Scalable computing for evolutionary genomics.
Prins, Pjotr; Belhachemi, Dominique; Möller, Steffen; Smant, Geert
2012-01-01
Genomic data analysis in evolutionary biology is becoming so computationally intensive that analysis of multiple hypotheses and scenarios takes too long on a single desktop computer. In this chapter, we discuss techniques for scaling computations through parallelization of calculations, after giving a quick overview of advanced programming techniques. Unfortunately, parallel programming is difficult and requires special software design. The alternative, especially attractive for legacy software, is to introduce poor man's parallelization by running whole programs in parallel as separate processes, using job schedulers. Such pipelines are often deployed on bioinformatics computer clusters. Recent advances in PC virtualization have made it possible to run a full computer operating system, with all of its installed software, on top of another operating system, inside a "box," or virtual machine (VM). Such a VM can flexibly be deployed on multiple computers, in a local network, e.g., on existing desktop PCs, and even in the Cloud, to create a "virtual" computer cluster. Many bioinformatics applications in evolutionary biology can be run in parallel, running processes in one or more VMs. Here, we show how a ready-made bioinformatics VM image, named BioNode, effectively creates a computing cluster, and pipeline, in a few steps. This allows researchers to scale-up computations from their desktop, using available hardware, anytime it is required. BioNode is based on Debian Linux and can run on networked PCs and in the Cloud. Over 200 bioinformatics and statistical software packages, of interest to evolutionary biology, are included, such as PAML, Muscle, MAFFT, MrBayes, and BLAST. Most of these software packages are maintained through the Debian Med project. In addition, BioNode contains convenient configuration scripts for parallelizing bioinformatics software. 
Where Debian Med encourages packaging free and open source bioinformatics software through one central project, BioNode encourages creating free and open source VM images, for multiple targets, through one central project. BioNode can be deployed on Windows, OSX, Linux, and in the Cloud. Next to the downloadable BioNode images, we provide tutorials online, which empower bioinformaticians to install and run BioNode in different environments, as well as information for future initiatives, on creating and building such images.
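The "poor man's parallelization" described above, running whole programs side by side as separate processes, can be sketched with the Python standard library alone. This is a minimal illustration, not BioNode's configuration scripts or a job scheduler: the jobs here are hypothetical stand-ins (tiny Python one-liners) for real command lines such as alignment or phylogeny runs.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(args):
    """Run one whole program as a separate OS process and return its stdout."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Hypothetical stand-in jobs: each one is an independent command line.
jobs = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]

# Threads only wait on the child processes; the work itself runs in
# separate processes, at most two at a time here.
with ThreadPoolExecutor(max_workers=2) as pool:
    outputs = list(pool.map(run_job, jobs))

print(outputs)
```

Because each job is an unmodified program, this approach suits legacy software; a cluster job scheduler plays the role of the thread pool when the jobs are spread over several machines or VMs.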
Active local control of propeller-aircraft run-up noise.
Hodgson, Murray; Guo, Jingnan; Germain, Pierre
2003-12-01
Engine run-ups are part of the regular maintenance schedule at Vancouver International Airport. The noise generated by the run-ups propagates into neighboring communities, disturbing the residents. Active noise control is a potentially cost-effective alternative to passive methods, such as enclosures. Propeller aircraft generate low-frequency tonal noise that is highly compatible with active control. This paper presents a preliminary investigation of the feasibility and effectiveness of controlling run-up noise from propeller aircraft using local active control. Computer simulations for different configurations of multi-channel active-noise-control systems, aimed at reducing run-up noise in adjacent residential areas using a local-control strategy, were performed. These were based on an optimal configuration of a single-channel control system studied previously. The variations of the attenuation and amplification zones with the number of control channels, and with source/control-system geometry, were studied. Here, the aircraft was modeled using one or two sources, with monopole or multipole radiation patterns. Both free-field and half-space conditions were considered: for the configurations studied, results were similar in the two cases. In both cases, large triangular quiet zones, with local attenuations of 10 dB or more, were obtained when nine or more control channels were used. Increases of noise were predicted outside of these areas, but these were minimized as more control channels were employed. By combining predicted attenuations with measured noise spectra, noise levels after implementation of an active control system were estimated.
Minimum envelope roughness pulse design for reduced amplifier distortion in parallel excitation.
Grissom, William A; Kerr, Adam B; Stang, Pascal; Scott, Greig C; Pauly, John M
2010-11-01
Parallel excitation uses multiple transmit channels and coils, each driven by independent waveforms, to afford the pulse designer an additional spatial encoding mechanism that complements gradient encoding. In contrast to parallel reception, parallel excitation requires individual power amplifiers for each transmit channel, which can be cost prohibitive. Several groups have explored the use of low-cost power amplifiers for parallel excitation; however, such amplifiers commonly exhibit nonlinear memory effects that distort radio frequency pulses. This is especially true for pulses with rapidly varying envelopes, which are common in parallel excitation. To overcome this problem, we introduce a technique for parallel excitation pulse design that yields pulses with smoother envelopes. We demonstrate experimentally that pulses designed with the new technique suffer less amplifier distortion than unregularized pulses and pulses designed with conventional regularization.
Speaker Recognition Using Real vs. Synthetic Parallel Data for DNN Channel Compensation
2016-08-18
Speaker Recognition Using Real vs Synthetic Parallel Data for DNN Channel Compensation Fred Richardson, Michael Brandstein, Jennifer Melot and...denoising DNNs has been demonstrated for several speech technologies such as ASR and speaker recognition. This paper compares the use of real ...AVG and POOL min DCFs). In all cases, the telephone channel performance on SRE10 is improved by the denoising DNNs with the real Mixer 1 and 2
Speaker Recognition Using Real vs Synthetic Parallel Data for DNN Channel Compensation
2016-09-08
Speaker Recognition Using Real vs Synthetic Parallel Data for DNN Channel Compensation Fred Richardson, Michael Brandstein, Jennifer Melot and...denoising DNNs has been demonstrated for several speech technologies such as ASR and speaker recognition. This paper compares the use of real ...AVG and POOL min DCFs). In all cases, the telephone channel performance on SRE10 is improved by the denoising DNNs with the real Mixer 1 and 2
Didar, Tohid Fatanat; Tabrizian, Maryam
2012-11-07
Here we present a microfluidic platform to generate multiplex gradients of biomolecules within parallel microfluidic channels, in which a range of multiplex concentration gradients with different profile shapes are simultaneously produced. Nonlinear polynomial gradients were also generated using this device. The gradient generation principle is based on implementing parallel channels, each providing a different hydrodynamic resistance. The generated biomolecule gradients were then covalently functionalized onto the microchannel surfaces. Surface gradients along the channel width were a result of covalent attachments of biomolecules to the surface, which remained functional under high shear stresses (50 dyn/cm²). An IgG antibody conjugated to three different fluorescence dyes (FITC, Cy5 and Cy3) was used to demonstrate the resulting multiplex concentration gradients of biomolecules. The device enabled generation of gradients with up to three different biomolecules in each channel with varying concentration profiles. We were also able to produce 2-dimensional gradients in which biomolecules were distributed along the length and width of the channel. To demonstrate the applicability of the developed design, three different multiplex concentration gradients of REDV and KRSR peptides were patterned along the width of three parallel channels and adhesion of primary human umbilical vein endothelial cells (HUVEC) in each channel was subsequently investigated using a single chip.
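The abstract above relies on parallel channels with different hydrodynamic resistances. As a rough sketch of why that produces different flows per channel, the Python below applies the standard relation Q = ΔP / R to channels that share one pressure drop. The function names, resistance values and pressure are hypothetical illustrations, not values or methods taken from the paper.

```python
def split_flow(pressure_drop, resistances):
    """Flow rate in each channel when all channels share one pressure drop.

    Uses Q_i = dP / R_i; units are whatever consistent set the caller picks.
    """
    if any(r <= 0 for r in resistances):
        raise ValueError("hydrodynamic resistances must be positive")
    return [pressure_drop / r for r in resistances]


def flow_fractions(resistances):
    """Fraction of the total flow carried by each parallel channel."""
    conductances = [1.0 / r for r in resistances]
    total = sum(conductances)
    return [g / total for g in conductances]


# Hypothetical example: three channels with resistances in a 1:2:4 ratio.
fractions = flow_fractions([1.0, 2.0, 4.0])
```

The lower-resistance channel carries the larger share of the flow, which is the lever a designer would adjust to shape each channel's concentration profile.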
A three-dimensional spectral algorithm for simulations of transition and turbulence
NASA Technical Reports Server (NTRS)
Zang, T. A.; Hussaini, M. Y.
1985-01-01
A spectral algorithm for simulating three dimensional, incompressible, parallel shear flows is described. It applies to the channel, to the parallel boundary layer, and to other shear flows with one wall bounded and two periodic directions. Representative applications to the channel and to the heated boundary layer are presented.
Pre-Restoration Geomorphic Characteristics of Minebank Run, Baltimore County, Maryland, 2002-04
Doheny, Edward J.; Starsoneck, Roger J.; Mayer, Paul M.; Striz, Elise A.
2007-01-01
Data collected from 2002 through 2004 were used to assess geomorphic characteristics and geomorphic changes over time in a selected reach of Minebank Run, a small urban watershed near Towson, Maryland, prior to its physical restoration in 2004 and 2005. Longitudinal profiles of the channel bed, water surface, and bank features were developed from field surveys. Changes in cross-section geometry between field surveys were documented. Grain-size distributions for the channel bed and banks were developed from pebble counts and laboratory analyses. Net changes in the elevation of the channel bed over time were documented at selected locations. Rosgen Stream Classification was used to classify the stream channel according to morphological measurements of slope, entrenchment ratio, width-to-depth ratio, sinuosity, and median-particle diameter of the channel materials. An analysis of boundary shear stress in the vicinity of the streamflow-gaging station was conducted by use of hydraulic variables computed from cross-section surveys and slope measurements derived from crest-stage gages in the study reach. Analysis of the longitudinal profiles indicated noticeable changes in the percentage and distribution of riffles, pools, and runs through the study reach between 2002 and 2004. Despite major changes to the channel profile as a result of storm runoff events, the overall slope of the channel bed, water surface, and bank features remained constant at about 1 percent. The cross-sectional surveys showed net increases in cross-sectional area, mean depth, and channel width at several locations between 2002 and 2004, which indicate channel degradation and widening. Two locations were identified where significant amounts of sediment were being stored in the study reach. Data from scour chains identified several locations where maximum scour ranged from 1.0-1.4 feet during storm events. 
Bank retreat varied widely throughout the study reach and ranged from 0.2 feet to as much as 7.9 feet. Sequential measurements of bed elevation in selected locations indicated as much as 2 feet of channel degradation in one location during a storm event in May 2004 and identified pulses of sediment that were gradually transported through the study reach during the monitoring period. Particle-size analyses of channel bed materials indicated a median particle diameter of 20.5 millimeters (coarse gravel) for the study reach, with more than 24 percent being sand particles (greater than 0.062 millimeters). Analyses of bank samples showed finer-grained material composing the channel banks, predominantly silt/clay or a mixture of silt/clay (less than 0.062 millimeters) and very fine to coarse sand. The Minebank Run stream channel was classified as a B4c channel, based on morphological descriptions from the Rosgen Stream Classification System. The B4c classification describes a single-thread stream channel with a moderate entrenchment ratio of 1.4 to 2.2; a width-to-depth ratio greater than 12; moderate sinuosity of 1.2 or greater; a water-surface slope of less than 2 percent; and a median-particle diameter in the gravel range of 2 to 64 millimeters. Analysis of boundary shear stress indicated larger mean velocities and boundary shear stress values for Minebank Run when compared to relations for non-urban B channel types developed by Rosgen. The slope of the regression line for mean velocity versus boundary shear stress at Minebank Run was considerably less than slopes developed by Rosgen for non-urban channel types. This indicates that relatively small increases in mean velocity can result in large increases in boundary shear stress in stream channels with highly developed watersheds, such as Minebank Run.
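The B4c criteria listed in the abstract can be written as a simple check. This is a minimal sketch that encodes only the thresholds stated above; the function name and the example inputs other than the 20.5 mm median diameter and the roughly 1 percent slope are hypothetical, and the full Rosgen system includes further rules that are not modeled here.

```python
def matches_b4c(entrenchment_ratio, width_to_depth, sinuosity,
                slope_percent, d50_mm):
    """Return True if the measurements satisfy the B4c thresholds quoted
    in the abstract (and only those thresholds)."""
    return (
        1.4 <= entrenchment_ratio <= 2.2   # moderate entrenchment
        and width_to_depth > 12            # width-to-depth ratio
        and sinuosity >= 1.2               # moderate sinuosity
        and slope_percent < 2.0            # water-surface slope
        and 2.0 <= d50_mm <= 64.0          # median particle in gravel range
    )


# Slope (about 1 percent) and d50 (20.5 mm) come from the abstract;
# the other three inputs are made-up illustrative values.
is_b4c = matches_b4c(1.8, 15.0, 1.3, 1.0, 20.5)
```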
Fatigue-induced changes in decline running.
Mizrahi, J; Verbitsky, O; Isakov, E
2001-03-01
Study the relation between muscle fatigue during eccentric muscle contractions and kinematics of the legs in downhill running. Decline running on a treadmill was used to acquire data on shock accelerations, muscle activity and kinematics, for comparison with level running. In downhill running, local muscle fatigue is the cause of morphological muscle damage which leads to reduced attenuation of shock accelerations. Fourteen subjects ran on a treadmill above level-running anaerobic threshold speed for 30 min, in level and -4 degrees decline running. The following were monitored: metabolic fatigue by means of respiratory parameters; muscle fatigue of the quadriceps by means of elevation in myoelectric activity; and kinematic parameters including knee and ankle angles and hip vertical excursion by means of computerized videography. Data on shock transmission reported in previous studies were also used. Quadriceps fatigue develops in parallel to an increasing vertical excursion of the hip in the stance phase of running, enabled by larger dorsi flexion of the ankle rather than by increased flexion of the knee. The decrease in shock attenuation can be attributed to quadriceps muscle fatigue in parallel to increased vertical excursion of the hips.
Teaching with a Dual-Channel Classroom Feedback System in the Digital Classroom Environment
ERIC Educational Resources Information Center
Yu, Yuan-Chih
2017-01-01
Teaching with a classroom feedback system can benefit both teaching and learning practices of interactivity. In this paper, we propose a dual-channel classroom feedback system integrated with a back-end e-Learning system. The system consists of learning agents running on the students' computers and a teaching agent running on the instructor's…
Parallelism in integrated fluidic circuits
NASA Astrophysics Data System (ADS)
Bousse, Luc J.; Kopf-Sill, Anne R.; Parce, J. W.
1998-04-01
Many research groups around the world are working on integrated microfluidics. The goal of these projects is to automate and integrate the handling of liquid samples and reagents for measurement and assay procedures in chemistry and biology. Ultimately, it is hoped that this will lead to a revolution in chemical and biological procedures similar to that caused in electronics by the invention of the integrated circuit. The optimal size scale of channels for liquid flow is determined by basic constraints to be somewhere between 10 and 100 micrometers. In larger channels, mixing by diffusion takes too long; in smaller channels, the number of molecules present is so low it makes detection difficult. At Caliper, we are making fluidic systems in glass chips with channels in this size range, based on electroosmotic flow, and fluorescence detection. One application of this technology is rapid assays for drug screening, such as enzyme assays and binding assays. A further challenge in this area is to perform multiple functions on a chip in parallel, without a large increase in the number of inputs and outputs. A first step in this direction is a fluidic serial-to-parallel converter. Fluidic circuits will be shown with the ability to distribute an incoming serial sample stream to multiple parallel channels.
NASA Technical Reports Server (NTRS)
Alario, J. P.; Haslett, R. A.
1986-01-01
Parallel pipes provide high heat flow from small heat exchanger. Six parallel heat pipes extract heat from overlying heat exchanger, forming evaporator. Vapor channel in pipe contains wick that extends into screen tube in liquid channel. Rods in each channel hold wick and screen tube in place. Evaporator compact rather than extended and more compatible with existing heat-exchanger geometries. Prototype six-pipe evaporator only 0.3 m wide and 0.71 m long. With ammonia as working fluid, transports heat to finned condenser at rate of 1,200 W.
Fellner, C; Doenitz, C; Finkenzeller, T; Jung, E M; Rennert, J; Schlaier, J
2009-01-01
Geometric distortions and low spatial resolution are current limitations in functional magnetic resonance imaging (fMRI). The aim of this study was to evaluate if application of parallel imaging or significant reduction of voxel size in combination with a new 32-channel head array coil can reduce those drawbacks at 1.5 T for a simple hand motor task. Therefore, maximum t-values (tmax) in different regions of activation, time-dependent signal-to-noise ratios (SNR(t)) as well as distortions within the precentral gyrus were evaluated. Comparing fMRI with and without parallel imaging in 17 healthy subjects revealed significantly reduced geometric distortions in anterior-posterior direction. Using parallel imaging, tmax only showed a mild reduction (7-11%) although SNR(t) was significantly diminished (25%). In 7 healthy subjects high-resolution (2 x 2 x 2 mm3) fMRI was compared with standard fMRI (3 x 3 x 3 mm3) in a 32-channel coil and with high-resolution fMRI in a 12-channel coil. The new coil yielded a clear improvement for tmax (21-32%) and SNR(t) (51%) in comparison with the 12-channel coil. Geometric distortions were smaller due to the smaller voxel size. Therefore, the reduction in tmax (8-16%) and SNR(t) (52%) in the high-resolution experiment seems to be tolerable with this coil. In conclusion, parallel imaging is an alternative to reduce geometric distortions in fMRI at 1.5 T. Using a 32-channel coil, reduction of the voxel size might be the preferable way to improve spatial accuracy.
Self-balanced modulation and magnetic rebalancing method for parallel multilevel inverters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Hui; Shi, Yanjun
A self-balanced modulation method and a closed-loop magnetic flux rebalancing control method for parallel multilevel inverters. The combination of the two methods provides for balancing of the magnetic flux of the inter-cell transformers (ICTs) of the parallel multilevel inverters without deteriorating the quality of the output voltage. In various embodiments a parallel multi-level inverter modulator is provided, including a multi-channel comparator to generate a multiplexed digitized ideal waveform for a parallel multi-level inverter and a finite state machine (FSM) module coupled to the parallel multi-channel comparator, the FSM module to receive the multiplexed digitized ideal waveform and to generate a pulse width modulated gate-drive signal for each switching device of the parallel multi-level inverter. The system and method provide for optimization of the output voltage spectrum without influencing the magnetic balancing.
Characterization of the Body-to-Body Propagation Channel for Subjects during Sports Activities.
Mohamed, Marshed; Cheffena, Michael; Moldsvor, Arild
2018-02-18
Body-to-body wireless networks (BBWNs) have great potential to find applications in team sports activities, among others. However, successful design of such systems requires a thorough understanding of the communication channel, as the movement of the body components causes time-varying shadowing and fading effects. In this study, we present results of a BBWN measurement campaign during running and cycling activities. Among others, the results indicated the presence of good and bad states, with each state following a specific distribution for the considered propagation scenarios. This motivated the development of a two-state semi-Markov model for simulation of the communication channels. The simulation model was validated against the available measurement data in terms of first and second order statistics and showed good agreement. The first order statistics obtained from the simulation model as well as the measured results were then used to analyze the performance of the BBWN channels under running and cycling activities in terms of capacity and outage probability. Cycling channels showed better performance than running, having higher channel capacity and lower outage probability, regardless of the speed of the subjects involved in the measurement campaign.
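To make the two-state semi-Markov idea concrete, the sketch below generates a good/bad state sequence in which the channel alternates between the two states and the time spent in each state is drawn from a state-specific distribution. The dwell-time distributions and their parameters are placeholders chosen for illustration; the abstract does not say which distributions the authors fitted.

```python
import random


def simulate_states(n_samples, dwell_samplers, seed=0):
    """Generate a good/bad state sequence from a two-state semi-Markov chain.

    With only two states the chain alternates between them, so the model is
    fully described by the dwell-time distribution of each state.
    """
    rng = random.Random(seed)
    state = "good"
    sequence = []
    while len(sequence) < n_samples:
        # Dwell time in samples, at least one sample per visit.
        dwell = max(1, int(round(dwell_samplers[state](rng))))
        sequence.extend([state] * dwell)
        state = "bad" if state == "good" else "good"
    return sequence[:n_samples]


# Hypothetical dwell-time distributions (in samples); not from the paper.
samplers = {
    "good": lambda rng: rng.lognormvariate(3.0, 0.5),
    "bad": lambda rng: rng.expovariate(1.0 / 5.0),
}
states = simulate_states(1000, samplers)
```

A full channel simulator would additionally draw the received signal level within each state from that state's fitted distribution; only the state process is sketched here.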
Multichannel quench-flow microreactor chip for parallel reaction monitoring.
Bula, Wojciech P; Verboom, Willem; Reinhoudt, David N; Gardeniers, Han J G E
2007-12-01
This paper describes a multichannel silicon-glass microreactor which has been utilized to investigate the kinetics of a Knoevenagel condensation reaction under different reaction conditions. The reaction is performed on the chip in four parallel channels under identical conditions but with different residence times. A special topology of the reaction coils overcomes the common problem arising from the difference in pressure drop of parallel channels having different length. The parallelization of reaction coils combined with chemical quenching at specific locations results in a considerable reduction in experimental effort and cost. The system was tested and showed good reproducibility in flow properties and reaction kinetic data generation.
Passing in Command Line Arguments and Parallel Cluster/Multicore Batching in R with batch.
Hoffmann, Thomas J
2011-03-01
It is often useful to rerun a command line R script with some slight change in the parameters used to run it - a new set of parameters for a simulation, a different dataset to process, etc. The R package batch provides a means to pass in multiple command line options, including vectors of values in the usual R format, easily into R. The same script can be set up to run things in parallel via different command line arguments. The R package batch also provides a means to simplify this parallel batching by allowing one to use R and an R-like syntax for arguments to spread a script across a cluster or local multicore/multiprocessor computer, with automated syntax for several popular cluster types. Finally, it provides a means to aggregate the results of multiple processes run on a cluster.
Automatization of hardware configuration for plasma diagnostic system
NASA Astrophysics Data System (ADS)
Wojenski, A.; Pozniak, K. T.; Kasprowicz, G.; Kolasinski, P.; Krawczyk, R. D.; Zabolotny, W.; Linczuk, P.; Chernyshova, M.; Czarski, T.; Malinowski, K.
2016-09-01
Soft X-ray plasma measurement systems are mostly multi-channel, high performance systems. In the case of modular construction, it is necessary to perform sophisticated system discovery in parallel with automatic system configuration. In the paper the structure of the modular system designed for tokamak plasma soft X-ray measurements is described. The concept of the system discovery and further automatic configuration is also presented. The FCS application (FMC/FPGA Configuration Software) is used for running sophisticated system setup with automatic verification of proper configuration. In order to provide flexibility of further system configurations (e.g. user setup), a common communication interface is also described. The approach presented here is related to the automatic system firmware building presented in previous papers. Modular construction and multichannel measurements are key requirements in terms of SXR diagnostics with use of GEM detectors.
Multi-leg heat pipe evaporator
NASA Technical Reports Server (NTRS)
Alario, J. P.; Haslett, R. A. (Inventor)
1986-01-01
A multileg heat pipe evaporator facilitates the use and application of a monogroove heat pipe by providing an evaporation section which is compact in area and structurally more compatible with certain heat exchangers or heat input apparatus. The evaporation section of a monogroove heat pipe is formed by a series of parallel legs having a liquid and a vapor channel and a communicating capillary slot therebetween. The liquid and vapor channels and interconnecting capillary slots of the evaporating section are connected to the condensing section of the heat pipe by a manifold connecting liquid and vapor channels of the parallel evaporation section legs with the corresponding liquid and vapor channels of the condensing section.
Long-term morphological developments of river channels separated by a longitudinal training wall
NASA Astrophysics Data System (ADS)
Le, T. B.; Crosato, A.; Uijttewaal, W. S. J.
2018-03-01
Rivers have been trained for centuries by channel narrowing and straightening. This has caused significant damage to their ecosystems, particularly around the bank areas. We analyze here the possibility of training rivers in a new way, by subdividing their channel into a main channel and an ecological channel with a longitudinal training wall. The effectiveness of longitudinal training walls in achieving this goal and their long-term effects on the river morphology have not been thoroughly investigated yet. In particular, studies that assess the stability of the two parallel channels separated by the training wall are still lacking. This work studies the long-term morphological developments of river channels subdivided by a longitudinal training wall in the presence of steady alternate bars. This type of bar, common in alluvial rivers, alters the flow field and the sediment transport direction and might affect the stability of the bifurcating system. The work comprises both laboratory experiments and numerical simulations (Delft3D). The results show that a system of parallel channels divided by a longitudinal training wall has the tendency to become unstable. An important factor is found to be the location of the upstream termination of the longitudinal wall with respect to a neighboring steady bar. The relative widths of the two parallel channels separated by the wall and variable discharge do not substantially change the final evolution of the system.
Simulation of LHC events on a million threads
NASA Astrophysics Data System (ADS)
Childers, J. T.; Uram, T. D.; LeCompte, T. J.; Papka, M. E.; Benjamin, D. P.
2015-12-01
Demand for Grid resources is expected to double during LHC Run II as compared to Run I; the capacity of the Grid, however, will not double. The HEP community must consider how to bridge this computing gap by targeting larger compute resources and using the available compute resources as efficiently as possible. Argonne's Mira, the fifth fastest supercomputer in the world, can run roughly five times the number of parallel processes that the ATLAS experiment typically uses on the Grid. We ported Alpgen, a serial x86 code, to run as a parallel application under MPI on the Blue Gene/Q architecture. By analysis of the Alpgen code, we reduced the memory footprint to allow running 64 threads per node, utilizing the four hardware threads available per core on the PowerPC A2 processor. Event generation and unweighting, typically run as independent serial phases, are coupled together in a single job in this scenario, reducing intermediate writes to the filesystem. By these optimizations, we have successfully run LHC proton-proton physics event generation at the scale of a million threads, filling two-thirds of Mira.
2014-10-07
is counted as. Per the TDTC, a test bridge with longitudinal and/or lateral symmetry under non-eccentric loading can be considered as 1, 2, or 4...Level Run036 3 MLC70T (tracked) BA Run046 6 AB Run055 9 AB Run060 9 BA Run064 12 BA Run071 15 AB Run155 3 MLC96W (wheeled) AB...Run331 9 AB Run359 15 AB Run430 12 MLC96W (wheeled) BA Run434 12 AB Run447 3 BA Bank Condition: Side Slope, Even Strain Channels High
Plasma Physics Calculations on a Parallel Macintosh Cluster
NASA Astrophysics Data System (ADS)
Decyk, Viktor; Dauger, Dean; Kokelaar, Pieter
2000-03-01
We have constructed a parallel cluster consisting of 16 Apple Macintosh G3 computers running the MacOS, and achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. For large problems where message packets are large and relatively few in number, performance of 50-150 MFlops/node is possible, depending on the problem. This is fast enough that 3D calculations can be routinely done. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. Full details are available on our web site: http://exodus.physics.ucla.edu/appleseed/.
Plasma Physics Calculations on a Parallel Macintosh Cluster
NASA Astrophysics Data System (ADS)
Decyk, Viktor K.; Dauger, Dean E.; Kokelaar, Pieter R.
We have constructed a parallel cluster consisting of 16 Apple Macintosh G3 computers running the MacOS, and achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. For large problems where message packets are large and relatively few in number, performance of 50-150 Mflops/node is possible, depending on the problem. This is fast enough that 3D calculations can be routinely done. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. Full details are available on our web site: http://exodus.physics.ucla.edu/appleseed/.
Parallel pulse processing and data acquisition for high speed, low error flow cytometry
van den Engh, Gerrit J.; Stokdijk, Willem
1992-01-01
A digitally synchronized parallel pulse processing and data acquisition system for a flow cytometer has multiple parallel input channels with independent pulse digitization and FIFO storage buffer. A trigger circuit controls the pulse digitization on all channels. After an event has been stored in each FIFO, a bus controller moves the oldest entry from each FIFO buffer onto a common data bus. The trigger circuit generates an ID number for each FIFO entry, which is checked by an error detection circuit. The system has high speed and low error rate.
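The data flow described above (one trigger, per-channel FIFOs, a bus controller that drains the oldest entries, and an ID check) can be modeled in a few lines of software. This is only an illustrative sketch of that flow, not the patented circuit; the class and method names are hypothetical.

```python
from collections import deque


class PulseAcquisition:
    """Software model of parallel channels, each with its own FIFO buffer."""

    def __init__(self, n_channels):
        self.fifos = [deque() for _ in range(n_channels)]
        self.next_id = 0

    def trigger(self, pulse_heights):
        """One trigger digitizes every channel and tags each FIFO entry
        with the same event ID."""
        if len(pulse_heights) != len(self.fifos):
            raise ValueError("one pulse height per channel is required")
        event_id = self.next_id
        self.next_id += 1
        for fifo, height in zip(self.fifos, pulse_heights):
            fifo.append((event_id, height))
        return event_id

    def read_event(self):
        """Move the oldest entry of every FIFO onto the common bus and
        verify that all entries carry the same event ID."""
        if any(not fifo for fifo in self.fifos):
            return None  # no complete event is buffered yet
        entries = [fifo.popleft() for fifo in self.fifos]
        ids = {event_id for event_id, _ in entries}
        if len(ids) != 1:
            raise RuntimeError("event ID mismatch across channels")
        return ids.pop(), [height for _, height in entries]


acq = PulseAcquisition(3)
acq.trigger([10, 20, 30])
acq.trigger([11, 21, 31])
first = acq.read_event()
```

Because every channel buffers independently, a new trigger can be accepted while earlier events are still waiting for the bus, and the ID check catches any channel that has dropped or duplicated an entry.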
Liu, Chengzhong
2018-02-12
Two systems, the red channel system and the white channel system, are carved or painted on the Laoguanshan wooden figurine of the Benque school. The two systems are horizontally staggered and do not overlap. The red channel system, similar to that of the Shuangbaoshan wooden figurine, has channels but no points. In the white channel system, the running courses of the channels follow the distribution of sensations produced by optional stimulation of the points. The Laoguanshan wooden figurine focuses on illustrating the white channel system and is therefore named the white channel figurine. It is compared with the Shuangbaoshan red channel figurine using examples, such as the running course of the white channel related to the meridian of the heart-transfer-point and the white channel related to the belt vessel linking the lung-transfer-point, stomach-transfer-point and kidney-transfer-point, together with the corresponding photographs. The comparison indicates that the Laoguanshan white channel figurine is a training aid for testing the sensation marching along channel (SMC) caused by transfer-point stimulation. The white channel system is a flexible form of channel. The study aims to observe the QI/SMC reaching the affected area and to contribute to clinical practice. This discovery is not related to the "intermediate link theory" in the Yellow Emperor meridian system.
NASA Astrophysics Data System (ADS)
Wang, Liping; Jiang, Yao; Li, Tiemin
2014-09-01
Parallel kinematic machines have drawn considerable attention and have been widely used in some special fields. However, high precision is still one of the challenges when they are used for advanced machine tools. One of the main reasons is that the kinematic chains of parallel kinematic machines are composed of elongated links that can easily suffer deformations, especially at high speeds and under heavy loads. A 3-RRR parallel kinematic machine is taken as a study object for investigating its accuracy with the consideration of the deformations of its links during the motion process. Based on the dynamic model constructed by the Newton-Euler method, all the inertia loads and constraint forces of the links are computed and their deformations are derived. Then the kinematic errors of the machine are derived with the consideration of the deformations of the links. Through further derivation, the accuracy of the machine is given in a simple explicit expression, which will be helpful to increase the calculating speed. The accuracy of this machine when following a selected circle path is simulated. The influences of magnitude of the maximum acceleration and external loads on the running accuracy of the machine are investigated. The results show that the external loads will deteriorate the accuracy of the machine tremendously when their direction coincides with the direction of the worst stiffness of the machine. The proposed method provides a solution for predicting the running accuracy of the parallel kinematic machines and can also be used in their design optimization as well as selection of suitable running parameters.
Using Parallel Processing for Problem Solving.
1979-12-01
are the basic parallel processing primitive. Different goals of the system can be pursued in parallel by placing them in separate activities...Language primitives are provided for manipulating running activities. Viewpoints are a generalization of context...are the basic parallel processing primitive. Different goals of the system can be pursued in parallel by placing them in separate activities. Language
Pandian, Ramasamy P.; Dolgos, Michelle; Marginean, Camelia; Woodward, Patrick M.; Hammel, P. Chris; Manoharan, Periakaruppan T.; Kuppusamy, Periannan
2009-01-01
The synthesis, structural framework, magnetic and oxygen-sensing properties of a lithium naphthalocyanine (LiNc) radical probe are presented. LiNc was synthesized in the form of a microcrystalline powder using a chemical method and characterized by electron paramagnetic resonance (EPR) spectroscopy, magnetic susceptibility, powder X-ray diffraction analysis, and mass spectrometry. X-ray powder diffraction studies revealed a structural framework that possesses long, hollow channels running parallel to the packing direction. The channels measured approximately 5.0 × 5.4 Å² in the two-dimensional plane perpendicular to the length of the channel, enabling diffusion of oxygen molecules (2.9 × 3.9 Å²) through the channel. The powdered LiNc exhibited a single, sharp EPR line under anoxic conditions, with a peak-to-peak linewidth of 630 mG at room temperature. The linewidth was sensitive to surrounding molecular oxygen, increasing linearly with pO2, with an oxygen sensitivity of 31.2 mG per mmHg. The LiNc microcrystals can be further prepared as nano-sized crystals without loss of their high oxygen-sensing properties. The thermal variation of the magnetic properties of LiNc, such as the EPR linewidth, EPR intensity and magnetic susceptibility, revealed the existence of two different temperature regimes of magnetic coupling and hence differing columnar packing, both being one-dimensional antiferromagnetic chains but with differing magnitudes of exchange coupling constants. At a temperature of ∼50 K, LiNc crystals undergo a reversible phase transition. The high degree of oxygen-sensitivity of micro- and nano-sized crystals of LiNc, combined with excellent stability, should enable precise and accurate measurements of oxygen concentration in biological systems using EPR spectroscopy. PMID:19809598
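The two calibration figures quoted above (630 mG anoxic linewidth at room temperature, 31.2 mG per mmHg sensitivity) are enough for a simple linear conversion between EPR linewidth and pO2. The sketch below assumes the calibration line passes through the anoxic linewidth at pO2 = 0 and stays linear over the range of interest; the function names are hypothetical.

```python
ANOXIC_LINEWIDTH_MG = 630.0        # peak-to-peak linewidth at pO2 = 0 (from the abstract)
SENSITIVITY_MG_PER_MMHG = 31.2     # linewidth broadening per mmHg of oxygen (from the abstract)


def linewidth_from_po2(po2_mmhg):
    """Expected peak-to-peak linewidth (mG) at a given pO2 (mmHg)."""
    return ANOXIC_LINEWIDTH_MG + SENSITIVITY_MG_PER_MMHG * po2_mmhg


def po2_from_linewidth(linewidth_mg):
    """Invert the linear calibration to estimate pO2 (mmHg) from a linewidth."""
    return (linewidth_mg - ANOXIC_LINEWIDTH_MG) / SENSITIVITY_MG_PER_MMHG


# Illustrative round trip at a hypothetical tissue pO2 of 10 mmHg.
lw = linewidth_from_po2(10.0)
```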
Experiences using OpenMP based on Computer Directed Software DSM on a PC Cluster
NASA Technical Reports Server (NTRS)
Hess, Matthias; Jost, Gabriele; Mueller, Matthias; Ruehle, Roland
2003-01-01
In this work we report on our experiences running OpenMP programs on a commodity cluster of PCs running a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message passing counterparts and discuss performance differences.
Parallel Processing Strategies of the Primate Visual System
Nassi, Jonathan J.; Callaway, Edward M.
2009-01-01
Preface Incoming sensory information is sent to the brain along modality-specific channels corresponding to the five senses. Each of these channels further parses the incoming signals into parallel streams to provide a compact, efficient input to the brain. Ultimately, these parallel input signals must be elaborated upon and integrated within the cortex to provide a unified and coherent percept. Recent studies in the primate visual cortex have greatly contributed to our understanding of how this goal is accomplished. Multiple strategies including retinal tiling, hierarchical and parallel processing and modularity, defined spatially and by cell type-specific connectivity, are all used by the visual system to recover the rich detail of our visual surroundings. PMID:19352403
Imaging exhumed lower continental crust in the distal Jequitinhonha basin, Brazil
NASA Astrophysics Data System (ADS)
Loureiro, A.; Schnürle, P.; Klingelhöfer, F.; Afilhado, A.; Pinheiro, J.; Evain, M.; Gallais, F.; Dias, N. A.; Rabineau, M.; Baltzer, A.; Benabdellouahed, M.; Soares, J.; Fuck, R.; Cupertino, J. A.; Viana, A.; Matias, L.; Moulin, M.; Aslanian, D.; Vinicius Aparecido Gomes de Lima, M.; Morvan, L.; Mazé, J. P.; Pierre, D.; Roudaut-Pitel, M.; Rio, I.; Alves, D.; Barros Junior, P.; Biari, Y.; Corela, C.; Crozon, J.; Duarte, J. L.; Ducatel, C.; Falcão, C.; Fernagu, P.; Le Piver, D.; Mokeddem, Z.; Pelleau, P.; Rigoti, C.; Roest, W.; Roudaut, M.; Salsa Team
2018-07-01
Twelve combined wide-angle refraction and coincident multi-channel seismic profiles were acquired in the Jequitinhonha-Camamu-Almada, Jacuípe, and Sergipe-Alagoas basins, NE Brazil, during the SALSA experiment in 2014. Profiles SL11 and SL12 image the Jequitinhonha basin, perpendicular to the coast, with 15 and 11 four-channel ocean-bottom seismometers, respectively. Profile SL10 runs parallel to the coast, crossing profiles SL11 and SL12, imaging the proximal Jequitinhonha and Almada basins with 17 ocean-bottom seismometers. Forward modelling, combined with pre-stack depth migration to increase the horizontal resolution of the velocity models, indicates that sediment thickness varies between 3.3 km and 6.2 km in the distal basin. Crustal thickness at the western edge of the profiles is around 20 km, with velocity gradients indicating a continental origin. It decreases to less than 5 km in the distal basin, with high seismic velocities and gradients that are compatible with neither normal oceanic crust nor exhumed upper mantle. Typical oceanic crust is never imaged along these roughly 200 km long profiles, and we propose that the transitional crust in the Jequitinhonha basin is made of exhumed lower continental crust.
Breuer, Christian; Lucas, Martin; Schütze, Frank-Walter; Claus, Peter
2007-01-01
A multi-criteria optimisation procedure based on genetic algorithms is carried out in search of advanced heterogeneous catalysts for total oxidation. Simple but flexible software routines have been created to be applied within a search space of more than 150,000 individuals. The general catalyst design includes mono-, bi- and trimetallic compositions assembled out of 49 different metals and depleted on an Al2O3 support in up to nine amount levels. As an efficient tool for high-throughput screening and perfectly matched to the requirements of heterogeneous gas phase catalysis - especially for applications technically run in honeycomb structures - the multi-channel monolith reactor is implemented to evaluate the catalyst performances. Out of a multi-component feed-gas, the conversion rates of carbon monoxide (CO) and a model hydrocarbon (HC) are monitored in parallel. In combination with further restrictions to preparation and pre-treatment a primary screening can be conducted, promising to provide results close to technically applied catalysts. Presented are the resulting performances of the optimisation process for the first catalyst generations and the prospect of its auto-adaptation to specified optimisation goals.
Lee, H W; Schmidt, M A; Russell, R F; Joly, N Y; Tyagi, H K; Uebel, P; Russell, P St J
2011-06-20
We report a novel splicing-based pressure-assisted melt-filling technique for creating metallic nanowires in hollow channels in microstructured silica fibers. Wires with diameters as small as 120 nm (typical aspect ratio 50:1) could be realized at a filling pressure of 300 bar. As an example we investigate a conventional single-mode step-index fiber with a parallel gold nanowire (wire diameter 510 nm) running next to the core. Optical transmission spectra show dips at wavelengths where guided surface plasmon modes on the nanowire phase match to the glass core mode. By monitoring the side-scattered light at narrow breaks in the nanowire, the loss could be estimated. Values as low as 0.7 dB/mm were measured at resonance, corresponding to those of an ultra-long-range eigenmode of the glass-core/nanowire system. By thermal treatment the hollow channel could be collapsed controllably, permitting creation of a conical gold nanowire, the optical properties of which could be monitored by side-scattering. The reproducibility of the technique and the high optical quality of the wires suggest applications in fields such as nonlinear plasmonics, near-field scanning optical microscope tips, cylindrical polarizers, optical sensing and telecommunications.
2010-02-01
channels, so the channel gain is known on each realization and used in a coherent matched filter; and (c) Rayleigh channels with noncoherent matched...gain is known on each realization and used in a coherent matched filter (channel model 1A); and (c) Rayleigh channels with noncoherent matched filters...filters, averaged over Rayleigh channel realizations (channel model 1A). (b) Noncoherent matched filters with Rayleigh fading (channel model 3). MSEs are
77 FR 50016 - Drawbridge Operation Regulation; Grassy Sound Channel, Middle Township, NJ
Federal Register 2010, 2011, 2012, 2013, 2014
2012-08-20
... Operation Regulation; Grassy Sound Channel, Middle Township, NJ AGENCY: Coast Guard, DHS. ACTION: Notice of... operating schedule that governs the Grassy Sound Channel (Ocean Drive) Bridge across the Grassy Sound... operating schedule to accommodate ``The Wild Half'' run. The Grassy Sound Channel (Ocean Drive) Bridge...
Comparative evaluation of three heat transfer enhancement strategies in a grooved channel
NASA Astrophysics Data System (ADS)
Herman, C.; Kang, E.
Results of a comparative evaluation of three heat transfer enhancement strategies for forced convection cooling of a parallel plate channel populated with heated blocks, representing electronic components mounted on printed circuit boards, are reported. Heat transfer in the reference geometry, the asymmetrically heated parallel plate channel, is compared with that for the basic grooved channel, and the same geometry enhanced by cylinders and vanes placed above the downstream edge of each heated block. In addition to conventional heat transfer and pressure drop measurements, holographic interferometry combined with high-speed cinematography was used to visualize the unsteady temperature fields in the self-sustained oscillatory flow. The locations of increased heat transfer within one channel periodicity depend on the enhancement technique applied, and were identified by analyzing the unsteady temperature distributions visualized by holographic interferometry. This approach allowed gaining insight into the mechanisms responsible for heat transfer enhancement. Experiments were conducted at moderate flow velocities in the laminar, transitional and turbulent flow regimes. Reynolds numbers were varied in the range Re=200-6500, corresponding to flow velocities from 0.076 to 2.36m/s. Flow oscillations were first observed between Re=1050 and 1320 for the basic grooved channel, and around Re=350 and 450 for the grooved channels equipped with cylinders and vanes, respectively. At Reynolds numbers above the onset of oscillations and in the transitional flow regime, heat transfer rates in the investigated grooved channels exceeded the performance of the reference geometry, the asymmetrically heated parallel plate channel. 
Heat transfer in the grooved channels enhanced with cylinders and vanes showed an increase by a factor of 1.2-1.8 and 1.5-3.5, respectively, when compared to data obtained for the basic grooved channel; however, the accompanying pressure drop penalties also increased significantly.
Lock Acquisition and Sensitivity Analysis of Advanced LIGO Interferometers
NASA Astrophysics Data System (ADS)
Martynov, Denis
The Laser Interferometer Gravitational-Wave Observatory (LIGO) consists of two complex large-scale laser interferometers designed for direct detection of gravitational waves from distant astrophysical sources in the frequency range 10 Hz - 5 kHz. Direct detection of space-time ripples will support Einstein's general theory of relativity and provide invaluable information and new insight into the physics of the Universe. The initial phase of LIGO started in 2002, and since then data were collected during six science runs. Instrument sensitivity improved from run to run due to the effort of the commissioning team. Initial LIGO reached its design sensitivity during the last science run, which ended in October 2010. In parallel with commissioning and data analysis with the initial detector, the LIGO group worked on research and development of the next generation of detectors. The major instrument upgrade from initial to advanced LIGO started in 2010 and lasted until 2014. This thesis describes results of commissioning work done at the LIGO Livingston site from 2013 until 2015 in parallel with and after the installation of the instrument. This thesis also discusses new techniques and tools developed at the 40m prototype including adaptive filtering, estimation of quantization noise in digital filters and design of isolation kits for ground seismometers. The first part of this thesis is devoted to the description of methods for bringing the interferometer into the linear regime, where collection of data becomes possible. States of longitudinal and angular controls of interferometer degrees of freedom during the lock acquisition process and in the low noise configuration are discussed in detail. Once the interferometer is locked and transitioned to the low noise regime, the instrument produces astrophysics data that should be calibrated to units of meters or strain. 
The second part of this thesis describes the online calibration technique set up in both observatories to monitor the quality of the collected data in real time. Sensitivity analysis was done to understand and eliminate noise sources of the instrument. The coupling of noise sources to the gravitational wave channel can be reduced if robust feedforward and optimal feedback control loops are implemented. Static and adaptive feedforward noise cancellation techniques applied to Advanced LIGO interferometers and tested at the 40m prototype are described in the last part of this thesis. Applications of optimal time domain feedback control techniques and estimators to aLIGO control loops are also discussed. Commissioning work is still ongoing at the sites. The first science run of advanced LIGO is planned for September 2015 and will last for 3-4 months. This run will be followed by a set of small instrument upgrades that will be installed on a time scale of a few months. The second science run will start in spring 2016 and last for about six months. Since the current sensitivity of advanced LIGO is already more than a factor of 3 higher than that of the initial detectors and keeps improving on a monthly basis, the upcoming science runs have a good chance of making the first direct detection of gravitational waves.
Scalable load balancing for massively parallel distributed Monte Carlo particle transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Brien, M. J.; Brantley, P. S.; Joy, K. I.
2013-07-01
In order to run computer simulations efficiently on massively parallel computers with hundreds of thousands or millions of processors, care must be taken that the calculation is load balanced across the processors. Examining the workload of every processor leads to an unscalable algorithm, with run time at least as large as O(N), where N is the number of processors. We present a scalable load balancing algorithm, with run time O(log(N)), that involves iterated processor-pair-wise balancing steps, ultimately leading to a globally balanced workload. We demonstrate scalability of the algorithm up to 2 million processors on the Sequoia supercomputer at Lawrence Livermore National Laboratory. (authors)
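The idea of reaching a global balance through O(log(N)) rounds of pairwise steps can be sketched in Python. This is a generic hypercube dimension-exchange scheme, not necessarily the authors' algorithm: it assumes a power-of-two processor count, perfectly divisible work, and simulates all processors in one process. The function name is hypothetical.

```python
def pairwise_balance(loads):
    """Balance loads with log2(N) rounds of pairwise averaging.

    In round d, processor i is paired with processor i XOR 2**d and the two
    split their combined work evenly. Assumes N is a power of two and that
    work is infinitely divisible (floats).
    """
    n = len(loads)
    if n == 0 or n & (n - 1):
        raise ValueError("sketch assumes a power-of-two processor count")
    loads = [float(x) for x in loads]
    rounds = 0
    bit = 1
    while bit < n:
        for i in range(n):
            j = i ^ bit                # partner in this round
            if i < j:                  # handle each pair once
                avg = (loads[i] + loads[j]) / 2.0
                loads[i] = loads[j] = avg
        bit <<= 1
        rounds += 1
    return loads, rounds
```

Each processor only ever talks to one partner per round, so no step inspects the workload of every processor; with eight processors the balance is reached in three rounds.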
Parallel ALLSPD-3D: Speeding Up Combustor Analysis Via Parallel Processing
NASA Technical Reports Server (NTRS)
Fricker, David M.
1997-01-01
The ALLSPD-3D Computational Fluid Dynamics code for reacting flow simulation was run on a set of benchmark test cases to determine its parallel efficiency. These test cases included non-reacting and reacting flow simulations with varying numbers of processors. Also, the tests explored the effects of scaling the simulation with the number of processors in addition to distributing a constant size problem over an increasing number of processors. The test cases were run on a cluster of IBM RS/6000 Model 590 workstations with ethernet and ATM networking plus a shared memory SGI Power Challenge L workstation. The results indicate that the network capabilities significantly influence the parallel efficiency, i.e., a shared memory machine is fastest and ATM networking provides acceptable performance. The limitations of ethernet greatly hamper the rapid calculation of flows using ALLSPD-3D.
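The two kinds of test described above (a constant-size problem spread over more processors, and a problem scaled with the processor count) correspond to what is commonly called strong and weak scaling. A minimal Python sketch of the usual efficiency definitions follows; the definitions are the standard textbook ones and are an assumption here, since the abstract does not state which formulas were used. The function names are hypothetical.

```python
def strong_scaling_efficiency(t_serial, t_parallel, n_procs):
    """Fixed problem size: efficiency = speedup / processors = T1 / (p * Tp)."""
    return t_serial / (n_procs * t_parallel)


def weak_scaling_efficiency(t_one_proc, t_scaled):
    """Problem grows with processor count: efficiency = T1 / Tp.

    t_one_proc is the time for the base problem on one processor and
    t_scaled the time for the p-times-larger problem on p processors.
    """
    return t_one_proc / t_scaled
```

For instance, a run that takes 100 s serially and 30 s on four processors has a strong-scaling efficiency of 100 / (4 × 30), roughly 0.83.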
NASA Technical Reports Server (NTRS)
Gryphon, Coranth D.; Miller, Mark D.
1991-01-01
PCLIPS (Parallel CLIPS) is a set of extensions to the C Language Integrated Production System (CLIPS) expert system language. PCLIPS is intended to provide an environment for the development of more complex, extensive expert systems. Multiple CLIPS expert systems are now capable of running simultaneously on separate processors, or separate machines, thus dramatically increasing the scope of solvable tasks within the expert systems. As a tool for parallel processing, PCLIPS allows for an expert system to add to its fact-base information generated by other expert systems, thus allowing systems to assist each other in solving a complex problem. This allows individual expert systems to be more compact and efficient, and thus run faster or on smaller machines.
Experiences Using OpenMP Based on Compiler Directed Software DSM on a PC Cluster
NASA Technical Reports Server (NTRS)
Hess, Matthias; Jost, Gabriele; Mueller, Matthias; Ruehle, Roland; Biegel, Bryan (Technical Monitor)
2002-01-01
In this work we report on our experiences running OpenMP programs on a commodity cluster of PCs (personal computers) running a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS (NASA Advanced Supercomputing) Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message passing counterparts and discuss performance differences.
Network support for system initiated checkpoints
Chen, Dong; Heidelberger, Philip
2013-01-29
A system, method and computer program product for supporting system initiated checkpoints in parallel computing systems. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity.
NASA Astrophysics Data System (ADS)
Rambo, J. E.; Kim, W.; Miller, K.
2017-12-01
Physical modeling of a delta's evolution can represent how changing the intervals of flood and interflood can alter a delta's fluvial pattern and geometry. Here we present a set of six experimental runs in which sediment and water were discharged at constant rates over each experiment. During the "flood" period, sediment and water were discharged at rates of 0.25 cm³/s and 15 ml/s, respectively, and during the "interflood" period, only water was discharged at 7.5 ml/s. The flood periods were only run for 30 minutes to keep the total volume of sediment constant. Run 0 did not have an interflood period and therefore ran with constant sediment and water discharge for the duration of the experiment. The other five runs had either 5, 10, or 15-min intervals of flood with 5, 10, or 15-min intervals of interflood. The experimental results show that Run 0 had the smallest topset area. This is due to a lack of surface reworking that takes place during interflood periods. Run 1 had 15-minute intervals of flood and 15-minute intervals of interflood, and it had the largest topset area. Additionally, the experiments that had longer intervals of interflood than flood had more elongated delta geometries. Wetted fraction color maps were also created to plot channel locations during each run. The maps show that the runs with longer interflood durations had channels occurring predominantly down the middle with stronger incisions; these runs produced deltas with more elongated geometries. When the interflood duration was even longer, however, strong channels started to occur at multiple locations. This increased interflood period allowed for the entire area over the delta's surface to be reworked, thus reducing the downstream slope and allowing channels to be more mobile laterally. Physical modeling of a delta allows us to predict a delta's resulting geometry given a set of conditions. This insight is especially needed because deltas are home to many human populations and provide habitat for various other species.
Parallel pulse processing and data acquisition for high speed, low error flow cytometry
Engh, G.J. van den; Stokdijk, W.
1992-09-22
A digitally synchronized parallel pulse processing and data acquisition system for a flow cytometer has multiple parallel input channels with independent pulse digitization and FIFO storage buffer. A trigger circuit controls the pulse digitization on all channels. After an event has been stored in each FIFO, a bus controller moves the oldest entry from each FIFO buffer onto a common data bus. The trigger circuit generates an ID number for each FIFO entry, which is checked by an error detection circuit. The system has high speed and low error rate. 17 figs.
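The data path described above (per-channel FIFO buffers, a shared trigger that tags each entry with an ID, and a readout that checks the IDs agree) can be sketched in software. This Python model is only an illustration of the bookkeeping, under the assumption that every channel digitizes every triggered event; the real system is hardware, and all names here are hypothetical.

```python
from collections import deque


def acquire(pulses_per_channel):
    """Simulate triggered digitization into one FIFO per channel.

    pulses_per_channel: one list of digitized values per input channel,
    in arrival order. Each stored entry is (event_id, value).
    """
    fifos = [deque() for _ in pulses_per_channel]
    n_events = min((len(p) for p in pulses_per_channel), default=0)
    for event_id in range(n_events):          # one trigger per event
        for fifo, pulses in zip(fifos, pulses_per_channel):
            fifo.append((event_id, pulses[event_id]))
    return fifos


def read_out(fifos):
    """Move the oldest entry of every FIFO onto a common 'bus', checking IDs."""
    events = []
    while fifos and all(fifos):               # stop when any FIFO is empty
        entries = [fifo.popleft() for fifo in fifos]
        ids = {event_id for event_id, _ in entries}
        if len(ids) != 1:                     # error detection: IDs must match
            raise RuntimeError(f"channel desynchronization, IDs seen: {sorted(ids)}")
        events.append((entries[0][0], [value for _, value in entries]))
    return events
```

If one channel loses an entry, the next readout sees mismatched IDs and fails loudly instead of silently combining values from different events.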
Metal-organic framework assembled from erbium and a tetrapodal polyphosphonic acid organic linker.
Mendes, Ricardo F; Firmino, Ana D G; Tomé, João P C; Almeida Paz, Filipe A
2018-06-01
A three-dimensional metal-organic framework (MOF), poly[[μ6-5'-pentahydrogen [1,1'-biphenyl]-3,3',5,5'-tetrayltetrakis(phosphonato)]erbium(III)] 2.5-hydrate], formulated as [Er(C12H11O12P4)]·2.5H2O or [Er(H5btp)]·2.5H2O (I) and isotypical with a Y3+-based MOF reported previously by our research group [Firmino et al. (2017b). Inorg. Chem. 56, 1193-1208], was constructed based solely on Er3+ and on the polyphosphonic organic linker [1,1'-biphenyl]-3,3',5,5'-tetrakis(phosphonic acid) (H8btp). The present work describes our efforts to introduce lanthanide cations into the flexible network, demonstrating that, on the one hand, the compound can be obtained using three distinct experimental methods, i.e. hydro(solvo)thermal (Hy), microwave-assisted (MW) and one-pot (Op), and, on the other hand, that crystallite size can be approximately fine-tuned according to the method employed. MOF I contains hexacoordinated Er3+ cations which are distributed in a zigzag inorganic chain running parallel to the [100] direction of the unit cell. The chains are, in turn, bridged by the anionic organic linker to form a three-dimensional 6,6-connected binodal network. This connectivity leads to the existence of one-dimensional channels (also running parallel to the [100] direction) filled with disordered and partially occupied water molecules of crystallization which are engaged in O-H...O hydrogen-bonding interactions with the [Er(H5btp)] framework. Additional weak π-π interactions [intercentroid distance = 3.957 (7) Å] exist between aromatic rings, which help to maintain the structural integrity of the network.
Binary zone-plate array for a parallel joint transform correlator applied to face recognition.
Kodate, K; Hashimoto, A; Thapliya, R
1999-05-10
Taking advantage of small aberrations, high efficiency, and compactness, we developed a new, to our knowledge, design procedure for a binary zone-plate array (BZPA) and applied it to a parallel joint transform correlator for the recognition of the human face. Pairs of reference and unknown images of faces are displayed on a liquid-crystal spatial light modulator (SLM), Fourier transformed by the BZPA, intensity recorded on an optically addressable SLM, and inversely Fourier transformed to obtain correlation signals. Consideration of the bandwidth allows the relations among the channel number, the numerical aperture of the zone plates, and the pattern size to be determined. Experimentally a five-channel parallel correlator was implemented and tested successfully with a 100-person database. The design and the fabrication of a 20-channel BZPA for phonetic character recognition are also included.
Channel plate for DNA sequencing
Douthart, R.J.; Crowell, S.L.
1998-01-13
This invention is a channel plate that facilitates data compaction in DNA sequencing. The channel plate has a length, a width and a thickness, and further has a plurality of channels that are parallel. Each channel has a depth partially through the thickness of the channel plate. Additionally an interface edge permits electrical communication across an interface through a buffer to a deposition membrane surface. 15 figs.
Students' Adoption of Course-Specific Approaches to Learning in Two Parallel Courses
ERIC Educational Resources Information Center
Öhrstedt, Maria; Lindfors, Petra
2016-01-01
Research on students' adoption of course-specific approaches to learning in parallel courses is limited and inconsistent. This study investigated second-semester psychology students' levels of deep, surface and strategic approaches in two courses running in parallel within a real-life university setting. The results showed significant differences…
Support for Debugging Automatically Parallelized Programs
NASA Technical Reports Server (NTRS)
Hood, Robert; Jost, Gabriele
2001-01-01
This viewgraph presentation provides information on support sources available for the automatic parallelization of computer programs. CAPTools, a support tool developed at the University of Greenwich, transforms, with user guidance, existing sequential Fortran code into parallel message passing code. Comparison routines are then run for debugging purposes, in essence, ensuring that the code transformation was accurate.
Parallel algorithms for mapping pipelined and parallel computations
NASA Technical Reports Server (NTRS)
Nicol, David M.
1988-01-01
Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm^3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm^2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.
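To make the mapping problem concrete, the following Python sketch solves one simple variant: assigning a chain of m modules, in order, to at most n processors so that the most heavily loaded processor is as light as possible. It uses a textbook binary search over the bottleneck with a greedy feasibility check, which is not the algorithm of this paper and ignores communication costs; it assumes non-negative integer module weights, and the function name is hypothetical.

```python
def min_bottleneck(weights, n_procs):
    """Smallest achievable maximum load when contiguous runs of modules
    are mapped onto at most n_procs processors (integer weights assumed)."""
    if n_procs < 1 or not weights:
        raise ValueError("need at least one processor and one module")

    def fits(limit):
        # Greedily pack modules left to right; open a new processor
        # whenever the next module would exceed the limit.
        used, current = 1, 0
        for w in weights:
            if current + w > limit:
                used += 1
                current = w
            else:
                current += w
        return used <= n_procs

    lo, hi = max(weights), sum(weights)   # bottleneck lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For module weights 1, 2, 3, 4, 5 on two processors, the best contiguous split is (1, 2, 3) and (4, 5), giving a bottleneck of 9.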
Optimization of a new flow design for solid oxide cells using computational fluid dynamics modelling
NASA Astrophysics Data System (ADS)
Duhn, Jakob Dragsbæk; Jensen, Anker Degn; Wedel, Stig; Wix, Christian
2016-12-01
Design of a gas distributor to distribute gas flow into parallel channels for Solid Oxide Cells (SOC) is optimized, with respect to flow distribution, using Computational Fluid Dynamics (CFD) modelling. The CFD model is based on a 3d geometric model and the optimized structural parameters include the width of the channels in the gas distributor and the area in front of the parallel channels. The flow of the optimized design is found to have a flow uniformity index value of 0.978. The effects of deviations from the assumptions used in the modelling (isothermal and non-reacting flow) are evaluated and it is found that a temperature gradient along the parallel channels does not affect the flow uniformity, whereas a temperature difference between the channels does. The impact of the flow distribution on the maximum obtainable conversion during operation is also investigated and the obtainable overall conversion is found to be directly proportional to the flow uniformity. Finally the effect of manufacturing errors is investigated. The design is shown to be robust towards deviations from design dimensions of at least ±0.1 mm which is well within obtainable tolerances.
Optimized Hypervisor Scheduler for Parallel Discrete Event Simulations on Virtual Machine Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoginath, Srikanth B; Perumalla, Kalyan S
2013-01-01
With the advent of virtual machine (VM)-based platforms for parallel computing, it is now possible to execute parallel discrete event simulations (PDES) over multiple virtual machines, in contrast to executing in native mode directly over hardware as has traditionally been done over the past decades. While mature VM-based parallel systems now offer new, compelling benefits such as serviceability, dynamic reconfigurability and overall cost effectiveness, the runtime performance of parallel applications can be significantly affected. In particular, most VM-based platforms are optimized for general workloads, but PDES execution exhibits unique dynamics significantly different from other workloads. Here we first present results from experiments that highlight the gross deterioration of the runtime performance of VM-based PDES simulations when executed using traditional VM schedulers, quantitatively showing the bad scaling properties of the scheduler as the number of VMs is increased. The mismatch is fundamental in nature in the sense that any fairness-based VM scheduler implementation would exhibit this mismatch with PDES runs. We also present a new scheduler optimized specifically for PDES applications, and describe its design and implementation. Experimental results obtained from running PDES benchmarks (PHOLD and vehicular traffic simulations) over VMs show over an order of magnitude improvement in the run time of the PDES-optimized scheduler relative to the regular VM scheduler, with over 20× reduction in run time of simulations using up to 64 VMs. The observations and results are timely in the context of emerging systems such as cloud platforms and VM-based high performance computing installations, highlighting to the community the need for PDES-specific support, and the feasibility of significantly reducing the runtime overhead for scalable PDES on VM platforms.
Parallel computing in genomic research: advances and applications
Ocaña, Kary; de Oliveira, Daniel
2015-01-01
Today’s genomic experiments have to process the so-called “biological big data” that is now reaching the size of Terabytes and Petabytes. To process this huge amount of data, scientists may require weeks or months if they use their own workstations. Parallelism techniques and high-performance computing (HPC) environments can be applied to reduce the total processing time and to ease the management, treatment, and analyses of this data. However, running bioinformatics experiments in HPC environments such as clouds, grids, clusters, and graphics processing units requires expertise from scientists to integrate computational, biological, and mathematical techniques and technologies. Several solutions have already been proposed to allow scientists to process their genomic experiments using HPC capabilities and parallelism techniques. This article presents a systematic review of the literature that surveys the most recently published research involving genomics and parallel computing. Our objective is to gather the main characteristics, benefits, and challenges that can be considered by scientists when running their genomic experiments to benefit from parallelism techniques and HPC capabilities. PMID:26604801
Communication library for run-time visualization of distributed, asynchronous data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rowlan, J.; Wightman, B.T.
1994-04-01
In this paper we present a method for collecting and visualizing data generated by a parallel computational simulation during run time. Data distributed across multiple processes is sent across parallel communication lines to a remote workstation, which sorts and queues the data for visualization. We have implemented our method in a set of tools called PORTAL (for Parallel aRchitecture data-TrAnsfer Library). The tools comprise generic routines for sending data from a parallel program (callable from either C or FORTRAN), a semi-parallel communication scheme currently built upon Unix Sockets, and a real-time connection to the scientific visualization program AVS. Our method is most valuable when used to examine large datasets that can be efficiently generated and do not need to be stored on disk. The PORTAL source libraries, detailed documentation, and a working example can be obtained by anonymous ftp from info.mcs.anl.gov from the file portal.tar.Z from the directory pub/portal.
NASA Astrophysics Data System (ADS)
Wang, Linlin; Wang, Zhenqi; Yu, Shui; Ngia, Ngong Roger
2016-08-01
The Miocene deepwater gravity-flow sedimentary system in Block A of the southwestern part of the Lower Congo Basin was identified and interpreted using high-resolution 3-D seismic, drilling and logging data to reveal development characteristics and main controlling factors. Five types of deepwater gravity-flow sedimentary units have been identified in the Miocene section of Block A, including mass transport, deepwater channel, levee, abandoned channel and sedimentary lobe deposits. Each type of sedimentary unit has distinct external features, internal structures and lateral characteristics in seismic profiles. Mass transport deposits (MTDs) in particular correspond to chaotic low-amplitude reflections with abrupt contacts on both sides. The cross section of deepwater channel deposits in the seismic profile is U- or V-shaped. The channel deposits change in ascending order from low-amplitude, poor-continuity, chaotic filling reflections at the bottom, to high-amplitude, moderate to poor continuity, chaotic or sub-parallel reflections in the middle section and to moderate-weak amplitude, good continuity, parallel or sub-parallel reflections in the upper section. The sedimentary lobes are laterally lobate, which corresponds to high-amplitude, good-continuity, mounded reflection signatures in the seismic profile. Due to sediment flux, faults, and inherited terrain, few mass transport deposits occur in the northeastern part of the study area. The front of MTDs is mainly composed of channel-levee complex deposits, while abandoned-channel and lobe deposits are usually developed in high-curvature channel sections and the channel terminals, respectively. The distribution of deepwater channel, levee, abandoned channel and sedimentary lobe deposits is predominantly controlled by relative sea level fluctuations and to a lesser extent by tectonism and inherited terrain.
NASA Astrophysics Data System (ADS)
Wang, Chunhong; Sun, Fujun; Fu, Zhongyuan; Ding, Zhaoxiang; Wang, Chao; Zhou, Jian; Wang, Jiawen; Tian, Huiping
2017-08-01
In this paper, a photonic crystal (PhC) butt-coupled mini-hexagonal-H1 defect (MHHD) microcavity sensor is proposed. The MHHD microcavity is designed by introducing six mini-holes into the initial H1 defect region. Further, based on a well-designed 1×3 PhC beam splitter and three optimal MHHD microcavity sensors with different lattice constants (a), a 3-channel parallel-connected PhC sensor array on monolithic silicon on insulator (SOI) is proposed. Finite-difference time-domain (FDTD) simulations are performed to demonstrate the high performance of our structures. The simulations show that the quality factor (Q) of our optimal MHHD microcavity exceeds 7×10^4, while the sensitivity (S) reaches up to 233 nm/RIU (RIU = refractive index unit). Thus, a figure of merit (FOM) > 10^4 is obtained for the sensor, which is two orders of magnitude higher than that of previous butt-coupled sensors [1-4]. As for the 3-channel parallel-connected PhC MHHD microcavity sensor array, the FOMs of the three independent MHHD microcavity sensors are 8071, 8250 and 8250, respectively. In addition, the total footprint of the proposed 3-channel parallel-connected PhC sensor array is an ultra-compact 12.5 μm × 31 μm (width × length). Therefore, the proposed high-FOM sensor array is an ideal platform for realizing ultra-compact, highly parallel refractive index (RI) sensing.
NASA Technical Reports Server (NTRS)
Davis, Jeffrey A.; Day, Timothy; Lilly, Roger A.; Taber, Donald B.; Liu, Hua-Kuang
1988-01-01
A new multichannel optical correlator/convolver architecture which uses an acoustooptic light modulator for the input channel and a Semetex magnetooptic spatial light modulator (MOSLM) for the set of parallel reference channels is presented. Details of the anamorphic optical system are discussed. Experimental results illustrate the use of the system as a convolver for performing digital multiplication by analog convolution (DMAC). A limited gray scale capability for data stored by the MOSLM is demonstrated by implementing this DMAC algorithm with trinary logic. Use of the MOSLM allows the number of parallel channels for the convolver to be increased significantly compared with previously reported techniques while retaining the capability for updating both channels at high speeds.
NASA Astrophysics Data System (ADS)
Davis, Jeffrey A.; Day, Timothy; Lilly, Roger A.; Taber, Donald B.; Liu, Hua-Kuang
1988-02-01
We present a new multichannel optical correlator/convolver architecture which uses an acoustooptic light modulator (AOLM) for the input channel and a Semetex magnetooptic spatial light modulator (MOSLM) for the set of parallel reference channels. Details of the anamorphic optical system are discussed. Experimental results illustrate use of the system as a convolver for performing digital multiplication by analog convolution (DMAC). A limited gray scale capability for data stored by the MOSLM is demonstrated by implementing this DMAC algorithm with trinary logic. Use of the MOSLM allows the number of parallel channels for the convolver to be increased significantly compared with previously reported techniques while retaining the capability for updating both channels at high speeds.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gong, Zhenhuan; Boyuka, David; Zou, X
The size and scope of cutting-edge scientific simulations are growing much faster than the I/O and storage capabilities of their run-time environments. The growing gap is exacerbated by exploratory, data-intensive analytics, such as querying simulation data with multivariate, spatio-temporal constraints, which induces heterogeneous access patterns that stress the performance of the underlying storage system. Previous work addresses data layout and indexing techniques to improve query performance for a single access pattern, which is not sufficient for complex analytics jobs. We present PARLO, a parallel run-time layout optimization framework, to achieve multi-level data layout optimization for scientific applications at run-time before data is written to storage. The layout schemes optimize for heterogeneous access patterns with user-specified priorities. PARLO is integrated with ADIOS, a high-performance parallel I/O middleware for large-scale HPC applications, to achieve user-transparent, light-weight layout optimization for scientific datasets. It offers simple XML-based configuration for users to achieve flexible layout optimization without the need to modify or recompile application codes. Experiments show that PARLO improves performance by 2 to 26 times for queries with heterogeneous access patterns compared to state-of-the-art scientific database management systems. Compared to traditional post-processing approaches, its underlying run-time layout optimization achieves a 56% savings in processing time and a reduction in storage overhead of up to 50%. PARLO also exhibits a low run-time resource requirement, while also limiting the performance impact on running applications to a reasonable level.
Massively parallel processor networks with optical express channels
Deri, R.J.; Brooks, E.D. III; Haigh, R.E.; DeGroot, A.J.
1999-08-24
An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination. 3 figs.
Massively parallel processor networks with optical express channels
Deri, Robert J.; Brooks, III, Eugene D.; Haigh, Ronald E.; DeGroot, Anthony J.
1999-01-01
An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination.
Study of Thread Level Parallelism in a Video Encoding Application for Chip Multiprocessor Design
NASA Astrophysics Data System (ADS)
Debes, Eric; Kaine, Greg
2002-11-01
In media applications there is a high level of available thread level parallelism (TLP). In this paper we study the intra TLP in a video encoder. We show that a well-distributed, highly optimized encoder running on a symmetric multiprocessor (SMP) system can run 3.2 times faster on a 4-way SMP machine than on a single processor. The multithreaded encoder running on an SMP system is then used to understand the requirements of a chip multiprocessor (CMP) architecture, which is one possible architectural direction to better exploit TLP. In the framework of this study, we use a software approach to evaluate the dataflow between processors for the video encoder running on an SMP system. An estimation of the dataflow is done with L2 cache miss event counters using the Intel® VTune™ performance analyzer. The experimental measurements are compared to theoretical results.
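The reported 3.2-fold speed-up on a 4-way SMP machine can be put in context with two standard quantities: parallel efficiency and the serial fraction implied by Amdahl's law. The following is a minimal sketch of that arithmetic only. The function names are hypothetical, and the abstract does not say that the authors analyzed their result this way.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal linear speed-up actually achieved."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return speedup / processors


def amdahl_serial_fraction(speedup: float, processors: int) -> float:
    """Serial fraction f implied by Amdahl's law, S = 1 / (f + (1 - f) / N).

    Solving for f gives f = (1/S - 1/N) / (1 - 1/N).
    This assumes Amdahl's model holds, which is an idealization.
    """
    if processors < 2:
        raise ValueError("need at least 2 processors to infer a serial fraction")
    return (1.0 / speedup - 1.0 / processors) / (1.0 - 1.0 / processors)


if __name__ == "__main__":
    # Values taken from the abstract: 3.2x speed-up on a 4-way SMP machine.
    print(parallel_efficiency(3.2, 4))
    print(amdahl_serial_fraction(3.2, 4))
```

Under these assumptions the encoder reaches 80% parallel efficiency, and the implied serial fraction is about one twelfth of the single-processor run time.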
NASA Astrophysics Data System (ADS)
Noh, Young-Chan; Sohn, Byung-Ju; Kim, Yoonjae; Joo, Sangwon; Bell, William; Saunders, Roger
2017-11-01
A new set of Infrared Atmospheric Sounding Interferometer (IASI) channels was re-selected from 314 EUMETSAT channels. In selecting channels, we calculated the impact of the individually added channel on the improvement in the analysis outputs from a one-dimensional variational analysis (1D-Var) for the Unified Model (UM) data assimilation system at the Met Office, using the channel score index (CSI) as a figure of merit. Then, 200 channels were selected in order by counting each individual channel's CSI contribution. Compared with the operationally used 183 channels for the UM at the Met Office, the new set shares 149 channels, while the other 51 channels are new. Also examined is the selection from the entropy reduction method with the same 1D-Var approach. Results suggest that channel selection can be made in a more objective fashion using the proposed CSI method. This is because the most important channels can be selected across the whole IASI observation spectrum. In the experimental trial runs using the UM global assimilation system, the new channels had an overall neutral impact in terms of improvement in forecasts, as compared with results from the operational channels. However, upper-tropospheric moist biases shown in the control run with operational channels were significantly reduced in the experimental trial with the newly selected channels. The reduction of moist biases was mainly due to the additional water vapor channels, which are sensitive to the upper-tropospheric water vapor.
Scalable descriptive and correlative statistics with Titan.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thompson, David C.; Pebay, Philippe Pierre
This report summarizes the existing statistical engines in VTK/Titan and presents the parallel versions thereof which have already been implemented. The ease of use of these parallel engines is illustrated by means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; then, this theoretical property is verified with test runs that demonstrate optimal parallel speed-up with up to 200 processors.
NASA Astrophysics Data System (ADS)
Vrabec, M.; Slavec, P.; Poglajen, S.; Busetti, M.
2012-04-01
We use multibeam and parametric subbottom sonar data, complemented with multichannel and high-resolution single-channel seismic profiles, to investigate sea-bottom morphology and subbottom sediment structure in the south-eastern half of the Gulf of Trieste, northern Adriatic Sea. The study area comprises 180 km² of predominantly flat seabed with the water depth from 20 to 25 m. Pre-Quaternary basement consists of a Mesozoic-Paleogene carbonate platform unit, overlain by Eocene marls and sandstones, covered by up to 300 m thick Quaternary sediments of predominantly continental origin. The uppermost few meters of sediment consist of Holocene fine-grained marine deposits. Structurally, the investigated area belongs to the imbricated rim of the Adriatic microplate and is dissected by several NE-dipping low-angle thrusts with up to several km of displacement. The thrusts are cut by younger NE-SW-trending steeply dipping faults with sinistral and/or normal offset, mapped onshore. The continuation of those faults into the offshore area is suggested by mismatch of thrust structures between parallel seismic profiles. Geodetic data on present-day tectonic activity is controversial. Whereas the Adriatic microplate is currently moving northwards towards Eurasia at the rate of 2-4 mm/yr, the GNSS data show no measurable deformation in the Gulf of Trieste. On the other hand, onshore precise-levelling data suggest localized vertical motions in the range of 1 mm/yr, interpreted as an indication of thrust activity. High-resolution swath bathymetry revealed several current-related erosional and depositional features such as gullies and megadunes with up to 5 m of relief. The most conspicuous seabed morphological features are pre-Holocene river channels preserved in low-erosion submarine environment, which make excellent markers for studying the long-term geomorphological evolution of the area.
The WNW-ESE-trending paleo-Rižana river is characterized by highly sinuous meandering channels. Sequential profiles perpendicular to the river course suggest consistent ~NE-ward lateral shifting of channels, parallel with inclination of the present-day seabed and with the present-day lateral gradient in channel depth. A longitudinal profile of the Rižana river plain revealed downstream increase in elevation of the stream bed, visible both from seabed bathymetry and from vertical position of channel lag deposits in subbottom sonar profiles. These observations suggest post-depositional tectonic tilting of the fluvial sediments that could be related either to activation of NE-dipping thrusts in the pre-Quaternary basement, or to minor anticlinal folding associated with Quaternary transpressional faulting along NW-SE-trending zones, implied from seismic profiles NW-ward of our study area. An enigmatic low-sinuosity channel feature runs along the coastline in the NE-SW direction and crosses the paleo-Rižana channel. Subbottom sonar profiles show asymmetric channel geometry and strong reflectors (channel lag deposits?) at the channel bottom, typical of other documented river channels in the area. This feature is vertically offset by a NE-SW-trending linear morphological flexure that corresponds in location and orientation to the onshore Monte Spaccato fault. Subbottom profiling revealed in several places an abrupt truncation of horizontal reflectors that could be a manifestation of faulting. These indications of Late Quaternary - Holocene tectonic activity may have important implications for seismic hazard in the heavily populated coastal area of the Gulf of Trieste.
Progress towards NASA MODIS and Suomi NPP Cloud Property Data Record Continuity
NASA Astrophysics Data System (ADS)
Platnick, S.; Meyer, K.; Holz, R.; Ackerman, S. A.; Heidinger, A.; Wind, G.; Platnick, S. E.; Wang, C.; Marchant, B.; Frey, R.
2017-12-01
The Suomi NPP VIIRS imager provides an opportunity to extend the 17+ year EOS MODIS climate data record into the next generation operational era. Similar to MODIS, VIIRS provides visible through IR observations at moderate spatial resolution with a 1330 LT equatorial crossing consistent with the MODIS on the Aqua platform. However, unlike MODIS, VIIRS lacks key water vapor and CO2 absorbing channels used for high cloud detection and cloud-top property retrievals. In addition, there is a significant mismatch in the spectral location of the 2.2 μm shortwave-infrared channels used for cloud optical/microphysical retrievals and cloud thermodynamic phase. Given these instrument differences between MODIS EOS and VIIRS S-NPP/JPSS, a merged MODIS-VIIRS cloud record to serve the science community in the coming decades requires different algorithm approaches than those used for MODIS alone. This new approach includes two parallel efforts: (1) Imager-only algorithms with only spectral channels common to VIIRS and MODIS (i.e., eliminate use of MODIS CO2 and NIR/IR water vapor channels). Since the algorithms are run with similar spectral observations, they provide a basis for establishing a continuous cloud data record across the two imagers. (2) Merged imager and sounder measurements (i.e., MODIS-AIRS, VIIRS-CrIS) in lieu of higher-spatial resolution MODIS absorption channels absent on VIIRS. The MODIS-VIIRS continuity algorithm for cloud optical property retrievals leverages heritage algorithms that produce the existing MODIS cloud mask (MOD35), optical and microphysical properties product (MOD06), and the NOAA AWG Cloud Height Algorithm (ACHA). We discuss our progress towards merging the MODIS observational record with VIIRS in order to generate cloud optical property climate data record continuity across the observing systems.
In addition, we summarize efforts to reconcile apparent radiometric biases between analogous imager channels, a critical consideration for obtaining inter-sensor climate data record continuity.
ProperCAD: A portable object-oriented parallel environment for VLSI CAD
NASA Technical Reports Server (NTRS)
Ramkumar, Balkrishna; Banerjee, Prithviraj
1993-01-01
Most parallel algorithms for VLSI CAD proposed to date have one important drawback: they work efficiently only on machines that they were designed for. As a result, algorithms designed to date are dependent on the architecture for which they are developed and do not port easily to other parallel architectures. A new project under way to address this problem is described. A Portable object-oriented parallel environment for CAD algorithms (ProperCAD) is being developed. The objectives of this research are (1) to develop new parallel algorithms that run in a portable object-oriented environment (CAD algorithms using a general purpose platform for portable parallel programming called CARM is being developed and a C++ environment that is truly object-oriented and specialized for CAD applications is also being developed); and (2) to design the parallel algorithms around a good sequential algorithm with a well-defined parallel-sequential interface (permitting the parallel algorithm to benefit from future developments in sequential algorithms). One CAD application that has been implemented as part of the ProperCAD project, flat VLSI circuit extraction, is described. The algorithm, its implementation, and its performance on a range of parallel machines are discussed in detail. It currently runs on an Encore Multimax, a Sequent Symmetry, Intel iPSC/2 and i860 hypercubes, a NCUBE 2 hypercube, and a network of Sun Sparc workstations. Performance data for other applications that were developed are provided: namely test pattern generation for sequential circuits, parallel logic synthesis, and standard cell placement.
Development for SSV on a parallel processing system (PARAGON)
NASA Astrophysics Data System (ADS)
Gothard, Benny M.; Allmen, Mark; Carroll, Michael J.; Rich, Dan
1995-12-01
A goal of the surrogate semi-autonomous vehicle (SSV) program is to have multiple vehicles navigate autonomously and cooperatively with other vehicles. This paper describes the process and tools used in porting UGV/SSV (unmanned ground vehicle) autonomous mobility and target recognition algorithms from a SISD (single instruction single data) processor architecture (i.e., a Sun SPARC workstation running C/UNIX) to a MIMD (multiple instruction multiple data) parallel processor architecture (i.e., PARAGON, a parallel set of i860 processors running C/UNIX). It discusses the gains in performance and the pitfalls of such a venture. It also examines the merits of this processor architecture (based on this conceptual prototyping effort) and programming paradigm to meet the final SSV demonstration requirements.
Parallelization and automatic data distribution for nuclear reactor simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liebrock, L.M.
1997-07-01
Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine cannot run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directly affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed.
Parallel Evolution of Sperm Hyper-Activation Ca2+ Channels
Phadnis, Nitin
2017-01-01
Sperm hyper-activation is a dramatic change in sperm behavior where mature sperm burst into a final sprint in the race to the egg. The mechanism of sperm hyper-activation in many metazoans, including humans, consists of a jolt of Ca2+ into the sperm flagellum via CatSper ion channels. Surprisingly, all nine CatSper genes have been independently lost in several animal lineages. In Drosophila, sperm hyper-activation is performed through the cooption of the polycystic kidney disease 2 (pkd2) Ca2+ channel. The parallels between CatSpers in primates and pkd2 in Drosophila provide a unique opportunity to examine the molecular evolution of the sperm hyper-activation machinery in two independent, nonhomologous calcium channels separated by > 500 million years of divergence. Here, we use a comprehensive phylogenomic approach to investigate the selective pressures on these sperm hyper-activation channels. First, we find that the entire CatSper complex evolves rapidly under recurrent positive selection in primates. Second, we find that pkd2 has parallel patterns of adaptive evolution in Drosophila. Third, we show that this adaptive evolution of pkd2 is driven by its role in sperm hyper-activation. These patterns of selection suggest that the evolution of the sperm hyper-activation machinery is driven by sexual conflict with antagonistic ligands that modulate channel activity. Together, our results add sperm hyper-activation channels to the class of fast evolving reproductive proteins and provide insights into the mechanisms used by the sexes to manipulate sperm behavior. PMID:28810709
Speaker Recognition Using Real vs. Synthetic Parallel Data for DNN Channel Compensation
2016-09-08
Speaker Recognition Using Real vs Synthetic Parallel Data for DNN Channel Compensation Fred Richardson, Michael Brandstein, Jennifer Melot, and...DNNs trained with real Mixer 2 multichannel data perform only slightly better than DNNs trained with synthetic multichannel data for microphone SR on...Mixer 6. Large reductions in pooled error rates of 50% EER and 30% min DCF are achieved using DNNs trained on real Mixer 2 data. Nearly the same
NASA Technical Reports Server (NTRS)
Golomidov, Y. V.; Li, S. K.; Popov, S. A.; Smolov, V. B.
1986-01-01
After a classification and analysis of electronic and optoelectronic switching devices, the design principles and structure of a matrix optical switch are described. The switching and pair-exclusion operations in this type of switch are examined, and a method for the optical switching of communication channels is elaborated. Finally, attention is given to the structural organization of a parallel computer system with a matrix optical switch.
Multi-channel temperature measurement system for automotive battery stack
NASA Astrophysics Data System (ADS)
Lewczuk, Radoslaw; Wojtkowski, Wojciech
2017-08-01
A multi-channel temperature measurement system for monitoring an automotive battery stack is presented in the paper. It is a complete battery temperature measuring system for hybrid/electric vehicles that incorporates multi-channel temperature measurement with digital temperature sensors communicating through 1-Wire buses, an individual 1-Wire bus for each sensor so that measurements are taken in parallel rather than sequentially, and an FPGA device that collects data from the sensors and translates it into CAN bus frames. The CAN bus is used for communication with the car's Battery Management System and relies on an additional CAN bus controller that communicates with the FPGA device through an SPI bus. The described system can measure up to 12 temperatures in parallel and can easily be extended if needed. The structure of the system and its particular devices are described in the paper. Selected results of experimental investigations that show proper operation of the system are presented as well.
Parallel processing spacecraft communication system
NASA Technical Reports Server (NTRS)
Bolotin, Gary S. (Inventor); Donaldson, James A. (Inventor); Luong, Huy H. (Inventor); Wood, Steven H. (Inventor)
1998-01-01
An uplink controlling assembly speeds data processing using a special parallel codeblock technique. A correct start sequence initiates processing of a frame. Two possible start sequences can be used, and the one which is used determines whether data polarity is inverted or non-inverted. Processing continues until uncorrectable errors are found. The frame ends by intentionally sending a block with an uncorrectable error. Each of the codeblocks in the frame has a channel ID. Each channel ID can be separately processed in parallel. This obviates the problem of waiting for error correction processing. If that channel number is zero, however, it indicates that the frame of data represents a critical command only. That data is handled in a special way, independent of the software. Otherwise, the processed data is further handled using special double buffering techniques to avoid problems from overrun. When overrun does occur, the system takes action to lose only the oldest data.
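The per-channel parallelism described above can be sketched in software as follows. This is a hedged illustration, not the patented hardware design: `Codeblock`, `decode_channel`, and `process_frame` are hypothetical names, the "decoding" is a stand-in that only concatenates payloads, and routing channel-zero blocks to a separate path is a simplification of the critical-command handling.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

CRITICAL_CHANNEL = 0  # the abstract reserves channel number zero for critical commands


@dataclass(frozen=True)
class Codeblock:
    channel_id: int
    payload: bytes


def decode_channel(blocks):
    """Stand-in for per-channel error correction: join payloads in arrival order."""
    return b"".join(block.payload for block in blocks)


def process_frame(frame):
    """Group codeblocks by channel ID and decode each channel concurrently.

    Returns (critical_data, decoded_by_channel). Channel-zero blocks bypass
    the parallel pool, loosely mirroring the special critical-command path.
    """
    by_channel = defaultdict(list)
    for block in frame:
        by_channel[block.channel_id].append(block)

    critical_blocks = by_channel.pop(CRITICAL_CHANNEL, [])

    with ThreadPoolExecutor() as pool:
        futures = {
            channel: pool.submit(decode_channel, blocks)
            for channel, blocks in by_channel.items()
        }
        decoded = {channel: future.result() for channel, future in futures.items()}

    return decode_channel(critical_blocks), decoded


if __name__ == "__main__":
    frame = [
        Codeblock(1, b"ab"),
        Codeblock(2, b"x"),
        Codeblock(1, b"cd"),
        Codeblock(0, b"!"),
    ]
    print(process_frame(frame))
```

Because each channel is decoded independently, a slow correction on one channel does not stall the others, which is the benefit the abstract attributes to the channel-ID scheme.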
Multirate-based fast parallel algorithms for 2-D DHT-based real-valued discrete Gabor transform.
Tao, Liang; Kwan, Hon Keung
2012-07-01
Novel algorithms for the multirate and fast parallel implementation of the 2-D discrete Hartley transform (DHT)-based real-valued discrete Gabor transform (RDGT) and its inverse transform are presented in this paper. A 2-D multirate-based analysis convolver bank is designed for the 2-D RDGT, and a 2-D multirate-based synthesis convolver bank is designed for the 2-D inverse RDGT. The parallel channels in each of the two convolver banks have a unified structure and can apply the 2-D fast DHT algorithm to speed up their computations. The computational complexity of each parallel channel is low and is independent of the Gabor oversampling rate. All the 2-D RDGT coefficients of an image are computed in parallel during the analysis process and can be reconstructed in parallel during the synthesis process. The computational complexity and time of the proposed parallel algorithms are analyzed and compared with those of the existing fastest algorithms for 2-D discrete Gabor transforms. The results indicate that the proposed algorithms are the fastest, which makes them attractive for real-time image processing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fang, Chin; Corttrell, R. A.
This Technical Note provides an overview of high-performance parallel Big Data transfers with and without encryption for data in-transit over multiple network channels. It shows that with the parallel approach, it is feasible to carry out high-performance parallel "encrypted" Big Data transfers without serious impact on throughput. Other impacts, such as energy consumption, should still be investigated. It also explains our rationale for using a statistics-based approach for gaining understanding from test results and for improving the system. The presentation is high-level in nature. Nevertheless, at the end we pose some questions and identify potentially fruitful directions for future work.
Blaettler, M; Bruegger, A; Forster, I C; Lehareinger, Y
1988-03-01
The design of an analog interface to a digital audio signal processor (DASP)-video cassette recorder (VCR) system is described. The complete system represents a low-cost alternative to both FM instrumentation tape recorders and multi-channel chart recorders. The interface or DASP input-output unit described in this paper enables the recording and playback of up to 12 analog channels with a maximum of 12 bit resolution and a bandwidth of 2 kHz per channel. Internal control and timing in the recording component of the interface is performed using ROMs which can be reprogrammed to suit different analog-to-digital converter hardware. Improvement in the bandwidth specifications is possible by connecting channels in parallel. A parallel 16 bit data output port is provided for direct transfer of the digitized data to a computer.
Note: optical receiver system for 152-channel magnetoencephalography.
Kim, Jin-Mok; Kwon, Hyukchan; Yu, Kwon-kyu; Lee, Yong-Ho; Kim, Kiwoong
2014-11-01
An optical receiver system comprising 13 serial data restore/synchronizer modules and a single module combiner converted optical 32-bit serial data into 32-bit synchronous parallel data for a computer to acquire 152-channel magnetoencephalography (MEG) signals. A serial data restore/synchronizer module identified the 32-bit channel-voltage bits in the 48-bit streaming serial data and then consecutively reproduced the 32-bit serial data 13 times, operating on a synchronous clock. After one of the 13 reproduced data streams was selected in each module, the module combiner converted it into 32-bit parallel data, which were carried to a 32-port digital input board in a computer. When the receiver system, together with optical transmitters, was applied to 152-channel superconducting quantum interference device sensors, the MEG system maintained a field noise level of 3 fT/√Hz @ 100 Hz at a sample rate of 1 kSample/s per channel.
Fire detection and incidents localization based on public information channels and social media
NASA Astrophysics Data System (ADS)
Thanos, Konstantinos-Georgios; Skroumpelou, Katerina; Rizogiannis, Konstantinos; Kyriazanos, Dimitris M.; Astyakopoulos, Alkiviadis; Thomopoulos, Stelios C. A.
2017-05-01
In this paper a solution is presented that aims to assist the early detection and localization of a fire incident by exploiting crowdsourcing and unofficial civilian online reports. It consists of two components: (a) the potential fire incident detection component and (b) the visualization component. The first component comprises two modules that run in parallel and aim to collect reports posted on public platforms and infer potential fire incident locations. The first module collects the public reports, distinguishes reports that refer to a potential fire incident and stores the corresponding information in a structured way. The second module aggregates all these stored reports and infers a probable fire location, based on the number of reports per area and the time and location of these reports. The result is then passed to a fusion module which combines it with information collected by sensors, if available, in order to provide a more accurate fire event detection capability. The visualization component is a fully operational public information channel which provides accurate and up-to-date information about active and past fires and raises awareness about forest fires and the relevant hazards among citizens. The channel has visualization capabilities for efficiently presenting information regarding detected fire incidents, fire expansion areas, and relevant information such as detecting sensors and reporting origin. The paper concludes with insights for current CONOPS end users with regard to the inclusion of the proposed solution in the current CONOPS of fire detection.
NASA Technical Reports Server (NTRS)
1997-01-01
A color image of part of the Nilosyrtis Mensae region of Mars containing the impact craters Antoniadi and Baldet (south to north) in the lower left corner; north toward top. The scene shows heavily cratered highlands on the south separated from the relatively smooth lowland plains on the northeast corner by a belt of dissected terrain, containing flat-floored valleys, mesas, buttes, and channels. The channels are (left to right) Auqakuh and Huo Hsing Valles; Nili Fossae lie in lower right corner of image. This image is a composite of Viking medium-resolution images in black and white and low-resolution images in color. The image extends from latitude 20 degrees N. to 40 degrees N. and from longitude 280 degrees to 305 degrees. Mercator projection is used below 30 degrees N.; Lambert projection is used above 30 degrees N. The dissected terrain along the highlands/lowlands boundary consists of the flat-floored valleys (mensae) and farther north the small, rounded hills of knobby terrain. Flows on the mensa floors contain striae that run parallel to valley walls; where valleys meet, the striae merge, similar to medial moraines on glaciers. Terraces within the valley hills have been interpreted as either layered rocks or wave terraces. The knobby terrain has been interpreted as remnants of the old, densely cratered highland terrain perhaps eroded by mass wasting. Auqakuh and Huo Hsing Valles and Nili Fossae are fretted channels and linear depressions that likely formed by sapping and mass wasting along lines of structural weakness.
Self-Scheduling Parallel Methods for Multiple Serial Codes with Application to WOPWOP
NASA Technical Reports Server (NTRS)
Long, Lyle N.; Brentner, Kenneth S.
2000-01-01
This paper presents a scheme for efficiently running a large number of serial jobs on parallel computers. Two examples are given of computer programs that run relatively quickly, but often they must be run numerous times to obtain all the results needed. It is very common in science and engineering to have codes that are not massive computing challenges in themselves, but due to the number of instances that must be run, they do become large-scale computing problems. The two examples given here represent common problems in aerospace engineering: aerodynamic panel methods and aeroacoustic integral methods. The first example simply solves many systems of linear equations. This is representative of an aerodynamic panel code where someone would like to solve for numerous angles of attack. The complete code for this first example is included in the appendix so that it can be readily used by others as a template. The second example is an aeroacoustics code (WOPWOP) that solves the Ffowcs Williams Hawkings equation to predict the far-field sound due to rotating blades. In this example, one quite often needs to compute the sound at numerous observer locations, hence parallelization is utilized to automate the noise computation for a large number of observers.
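The self-scheduling idea in this abstract can be sketched as a shared work queue from which idle workers pull the next serial job. This is only an illustration and not the authors' code: the names `solve_case` and `run_self_scheduled`, the 2x2 toy system, and the use of Python threads in place of a message-passing library are all assumptions made for this sketch.

```python
import queue
import threading


def solve_case(alpha):
    """Hypothetical stand-in for one serial job: solve a 2x2 linear system
    whose right-hand side depends on the angle of attack alpha."""
    a, b, c, d = 2.0, 1.0, 1.0, 3.0      # fixed coefficient matrix [[a, b], [c, d]]
    r1, r2 = alpha, 2.0 * alpha          # right-hand side for this case
    det = a * d - b * c
    x = (r1 * d - b * r2) / det          # Cramer's rule
    y = (a * r2 - r1 * c) / det
    return x, y


def run_self_scheduled(cases, n_workers=4):
    """Each worker repeatedly claims the next unprocessed case until none remain."""
    jobs = queue.Queue()
    for case in cases:
        jobs.put(case)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                case = jobs.get_nowait()
            except queue.Empty:
                return                   # queue drained: this worker is done
            value = solve_case(case)
            with lock:
                results[case] = value

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because workers claim jobs one at a time, a slow case delays only the worker that holds it, which is the property that makes this pattern attractive when job run times vary.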
GPU accelerated cell-based adaptive mesh refinement on unstructured quadrilateral grid
NASA Astrophysics Data System (ADS)
Luo, Xisheng; Wang, Luying; Ran, Wei; Qin, Fenghua
2016-10-01
A GPU accelerated inviscid flow solver is developed on an unstructured quadrilateral grid in the present work. For the first time, the cell-based adaptive mesh refinement (AMR) is fully implemented on GPU for the unstructured quadrilateral grid, which greatly reduces the frequency of data exchange between GPU and CPU. Specifically, the AMR is processed with atomic operations to parallelize list operations, and null memory recycling is realized to improve the efficiency of memory utilization. It is found that results obtained by GPUs agree very well with the exact or experimental results in literature. An acceleration ratio of 4 is obtained between the parallel code running on the old GPU GT9800 and the serial code running on E3-1230 V2. With the optimization of configuring a larger L1 cache and adopting Shared Memory based atomic operations on the newer GPU C2050, an acceleration ratio of 20 is achieved. The parallelized cell-based AMR processes have achieved 2x speedup on GT9800 and 18x on Tesla C2050, which demonstrates that parallel running of the cell-based AMR method on GPU is feasible and efficient. Our results also indicate that the new development of GPU architecture benefits the fluid dynamics computing significantly.
Wang, Xinrui; Fitts, Robert H
2017-08-01
Regular exercise training is known to affect the action potential duration (APD) and improve heart function, but involvement of β-adrenergic receptor (β-AR) subtypes and/or the ATP-sensitive K+ (KATP) channel is unknown. To address this, female and male Sprague-Dawley rats were randomly assigned to voluntary wheel-running or control groups; they were anesthetized after 6-8 wk of training, and myocytes were isolated. Exercise training significantly increased APD of apex and base myocytes at 1 Hz and decreased APD at 10 Hz. Ca2+ transient durations reflected the changes in APD, while Ca2+ transient amplitudes were unaffected by wheel running. The nonselective β-AR agonist isoproterenol shortened the myocyte APD, an effect reduced by wheel running. The isoproterenol-induced shortening of APD was largely reversed by the selective β1-AR blocker atenolol, but not the β2-AR blocker ICI 118,551, providing evidence that wheel running reduced the sensitivity of the β1-AR. At 10 Hz, the KATP channel inhibitor glibenclamide prolonged the myocyte APD more in exercise-trained than control rats, implicating a role for this channel in the exercise-induced APD shortening at 10 Hz. A novel finding of this work was the dual importance of altered β1-AR responsiveness and KATP channel function in the training-induced regulation of APD. Of physiological importance to the beating heart, the reduced response to adrenergic agonists would enhance cardiac contractility at resting rates, where sympathetic drive is low, by prolonging APD and Ca2+ influx; during exercise, an increase in KATP channel activity would shorten APD and, thus, protect the heart against Ca2+ overload or inadequate filling. NEW & NOTEWORTHY Our data demonstrated that regular exercise prolonged the action potential and Ca2+ transient durations in myocytes isolated from apex and base regions at 1-Hz and shortened both at 10-Hz stimulation. Novel findings were that wheel running shifted the β-adrenergic receptor agonist dose-response curve rightward compared with controls by reducing β1-adrenergic receptor responsiveness and that, at the high activation rate, myocytes from trained animals showed higher KATP channel function. Copyright © 2017 the American Physiological Society.
Scalable Domain Decomposed Monte Carlo Particle Transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Brien, Matthew Joseph
2013-12-05
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.
Proposed scheme for parallel 10Gb/s VSR system and its verilog HDL realization
NASA Astrophysics Data System (ADS)
Zhou, Yi; Chen, Hongda; Zuo, Chao; Jia, Jiuchun; Shen, Rongxuan; Chen, Xiongbin
2005-02-01
This paper proposes a novel scheme for a 10Gb/s parallel Very Short Reach (VSR) optical communication system. The optimized scheme properly manages the SDH/SONET redundant bytes and adjusts the position of the error detecting bytes and error correction bytes. Compared with the OIF-VSR4-01.0 proposal, the scheme has a coding process module. The SDH/SONET frames in the transmission direction are handled as follows: (1) The Framer-Serdes Interface (FSI) gets the 16×622.08Mb/s STM-64 frame. (2) The STM-64 frame is byte-wise striped across 12 channels, all of which are data channels. During this process, the parity bytes and CRC bytes are generated in a similar way to OIF-VSR4-01.0 and stored in the code process module. (3) The code process module regularly conveys the additional parity bytes and CRC bytes to all 12 data channels. (4) After the 8B/10B coding, the 12 channels are transmitted to the parallel VCSEL array. The receive process is approximately the reverse of the transmit process. By applying this scheme to a 10Gb/s VSR system, the frame size in the VSR system is reduced from 15552×12 bytes to 14040×12 bytes, and the system redundancy is clearly reduced.
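Step (2) and the quoted frame sizes can be illustrated with a short sketch. It assumes that byte-wise striping means a plain round-robin assignment (byte i goes to channel i mod 12), which the abstract does not spell out; parity, CRC and 8B/10B coding are left out, and the function names are hypothetical.

```python
def stripe_bytes(frame, n_channels=12):
    """Round-robin striping (assumed): byte i is sent on channel i mod n_channels."""
    return [frame[ch::n_channels] for ch in range(n_channels)]


def merge_channels(channels):
    """Inverse of stripe_bytes: interleave the channel streams back into one frame."""
    n = len(channels)
    total = sum(len(c) for c in channels)
    out = bytearray(total)
    for ch, data in enumerate(channels):
        out[ch::n] = data                # fill every n-th position for this channel
    return bytes(out)


def frame_size_reduction(old_per_channel=15552, new_per_channel=14040):
    """Fractional frame-size reduction implied by the figures in the abstract."""
    return (old_per_channel - new_per_channel) / old_per_channel
```

With the numbers given in the abstract, `frame_size_reduction()` evaluates to 1512/15552, or roughly 9.7% fewer bytes per frame.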
NASA Astrophysics Data System (ADS)
Steffen, K.; Huff, R. D.; Cullen, N.; Rignot, E.; Stewart, C.; Jenkins, A.
2003-12-01
Petermann Gletscher is the largest and most influential outlet glacier in central northern Greenland. Located at 81 N, 60 W, it drains an area of 71,580 km², with a discharge of 12 cubic km of ice per year into the Arctic Ocean. We finished a second field season in spring 2003 collecting in situ data on local climate, ice velocity, strain rates, ice thickness profiles and bottom melt rates of the floating ice tongue. Last year's finding that large channels, several hundred meters deep, run roughly parallel to the flow direction on the underside of the floating ice tongue has been confirmed. We mapped these channels using ground penetrating radar at 25 MHz frequency and multi-phase radar in profiling mode over half of the glacier's width. In addition, NASA airborne laser altimeter data was collected along and across the glacier for accurate assessment of surface topography. We will present a 3-D model of the floating ice tongue and provide a hypothesis on the origin and mechanism that caused these large ice channels at the bottom of the floating ice tongue. Multi-phase radar point measurements revealed bottom melt rates that exceed all previous estimates. It is worth mentioning that the largest bottom melt rates were not found at the grounding line, as is common on ice shelves in Antarctica. In addition, GPS tidal motion has been measured over one lunar cycle at the flex zone and on the free floating ice tongue, and the result will be compared to historic measurements made at the beginning of last century. The surface climate has been recorded by two automatic weather stations over a 12 month period, and the local climate of this remote region will be presented.
Upgrades and Real Time NTM Control Application of the ECE Radiometer on ASDEX Upgrade
NASA Astrophysics Data System (ADS)
Hicks, N. K.; Suttrop, W.; Behler, K.; Giannone, L.; Manini, A.; Maraschek, M.; Raupp, G.; Reich, M.; Sips, A. C. C.; Stober, J.; Treutterer, W.; ASDEX Upgrade Team; Cirant, S.
2009-04-01
The 60-channel electron cyclotron emission (ECE) radiometer diagnostic on the ASDEX Upgrade tokamak is presently being upgraded to include a 1 MHz sampling rate data acquisition system. This expanded capability allows electron temperature measurements up to 500 kHz (anti-aliasing filter cut-off) with spatial resolution ~1 cm, and will thus provide measurement of plasma phenomena on the MHD timescale, such as neoclassical tearing modes (NTMs). The upgraded and existing systems may be run in parallel for comparison, and some of the first plasma measurements using the two systems together are presented. A particular planned application of the upgraded radiometer is integration into a real-time NTM stabilization loop using targeted deposition of electron cyclotron resonance heating (ECRH). For this loop, it is necessary to determine the locations of the NTM and ECRH deposition using ECE measurements. As the magnetic island of the NTM repeatedly rotates through the ECE line of sight, electron temperature fluctuations at the NTM frequency are observed. The magnetic perturbation caused by the NTM is independently measured using Mirnov coils, and a correlation profile between these magnetic measurements and the ECE data is constructed. The phase difference between ECE oscillations on opposite sides of the island manifests as a zero-crossing of the correlation profile, which determines the NTM location in ECE channel space. To determine the location of ECRH power deposition, the power from a given gyrotron may be modulated at a particular frequency. Correlation analysis of this modulated signal and the ECE data identifies a particular ECE channel associated with the deposition of that gyrotron. Real time equilibrium reconstruction allows the ECE channels to be translated into flux surface and spatial coordinates for use in the feedback loop.
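The localization step described above (a correlation profile across ECE channels whose sign change marks the island) can be sketched as follows. This is a simplified illustration under stated assumptions: a Pearson correlation coefficient is used as the correlation measure, the signals are treated as plain lists of samples, and the function names are hypothetical. The abstract does not specify the actual signal processing.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sample sequences.
    Assumes neither sequence is constant (otherwise the denominator is zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_profile(reference, channels):
    """Correlate one reference signal (e.g. a magnetic pickup) with every channel."""
    return [correlation(reference, ch) for ch in channels]


def zero_crossing(profile):
    """Return the first index i where the profile changes sign between
    channel i and channel i + 1, or None if there is no sign change."""
    for i in range(len(profile) - 1):
        if profile[i] * profile[i + 1] < 0:
            return i
    return None
```

In this toy form, channels on one side of the island oscillate in phase with the reference and channels on the other side in antiphase, so the returned index brackets the island location in channel space.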
High-Throughput Screening of Na(V)1.7 Modulators Using a Giga-Seal Automated Patch Clamp Instrument.
Chambers, Chris; Witton, Ian; Adams, Cathryn; Marrington, Luke; Kammonen, Juha
2016-03-01
Voltage-gated sodium (Na(V)) channels have an essential role in the initiation and propagation of action potentials in excitable cells, such as neurons. Of these channels, Na(V)1.7 has been indicated as a key channel for pain sensation. While extensive efforts have gone into discovering novel Na(V)1.7 modulating compounds for the treatment of pain, none has reached the market yet. In the last two years, new compound screening technologies have been introduced, which may speed up the discovery of such compounds. The Sophion Qube(®) is a next-generation 384-well giga-seal automated patch clamp (APC) screening instrument, capable of testing thousands of compounds per day. By combining high-throughput screening and follow-up compound testing on the same APC platform, it should be possible to accelerate the hit-to-lead stage of ion channel drug discovery and help identify the most interesting compounds faster. Following a period of instrument beta-testing, a Na(V)1.7 high-throughput screen was run with two Pfizer plate-based compound subsets. In total, data were generated for 158,000 compounds at a median success rate of 83%, which can be considered high in APC screening. In parallel, IC50 assay validation and protocol optimization was completed with a set of reference compounds to understand how the IC50 potencies generated on the Qube correlate with data generated on the more established Sophion QPatch(®) APC platform. In summary, the results presented here demonstrate that the Qube provides a comparable but much faster approach to study Na(V)1.7 in a robust and reliable APC assay for compound screening.
Static analysis of the hull plate using the finite element method
NASA Astrophysics Data System (ADS)
Ion, A.
2015-11-01
This paper aims at presenting the static analysis for two levels of a container ship's construction as follows: the first level is at the girder / hull plate and the second level is conducted at the entire strength hull of the vessel. This article will describe the work for the static analysis of a hull plate. We shall use the software package ANSYS Mechanical 14.5. The program is run on a computer with four Intel Xeon X5260 CPU processors at 3.33 GHz and 32 GB of installed memory. In terms of software, the shared memory parallel version of ANSYS refers to running ANSYS across multiple cores on an SMP system. The distributed memory parallel version of ANSYS (Distributed ANSYS) refers to running ANSYS across multiple processors on SMP systems or DMP systems.
NASA Technical Reports Server (NTRS)
Mclyman, W. T.
1981-01-01
Transformer transmits power and digital data across rotating interface. Array has many parallel data channels, each with potential 1 megabaud data rate. Ferrite-cored transformers are spaced along rotor; airgap between them reduces crosstalk.
Massively parallel quantum computer simulator
NASA Astrophysics Data System (ADS)
De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.
2007-01-01
We describe portable software to simulate universal quantum computers on massively parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, an SGI Altix 3700 and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as a benchmark for testing high-end parallel computers.
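The 36-qubit and 1 TB figures are consistent with storing the full state vector, as the following back-of-the-envelope sketch shows. It assumes 16 bytes per complex amplitude (two 8-byte floating-point numbers) and an even split across processors; neither detail is stated in the abstract, and the function names are hypothetical.

```python
def state_vector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed for all 2**n_qubits complex amplitudes.
    16 bytes per amplitude is an assumption (two 8-byte floats)."""
    return (2 ** n_qubits) * bytes_per_amplitude


def bytes_per_processor(n_qubits, n_processors, bytes_per_amplitude=16):
    """Share of the state vector held by each processor, assuming an even split."""
    return state_vector_bytes(n_qubits, bytes_per_amplitude) // n_processors
```

Under these assumptions, 36 qubits need 2^36 × 16 = 2^40 bytes (1 TiB), and spreading that over 4096 processors leaves 2^28 bytes (256 MiB) per processor. Each extra qubit doubles the requirement.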
ERIC Educational Resources Information Center
White, A. S.
1976-01-01
Describes a simple water channel, for use with an overhead projector. It is run from a water tap and may be used for flow visualization experiments, including the effect of streamlining and elementary building aerodynamics. (MLH)
Ferré, Jean-Christophe; Petr, Jan; Bannier, Elise; Barillot, Christian; Gauvrit, Jean-Yves
2012-05-01
To compare 12-channel and 32-channel phased-array coils and to determine the optimal parallel imaging (PI) technique and factor for brain perfusion imaging using Pulsed Arterial Spin labeling (PASL) at 3 Tesla (T). Twenty-seven healthy volunteers underwent 10 different PASL perfusion PICORE Q2TIPS scans at 3T using 12-channel and 32-channel coils without PI and with GRAPPA or mSENSE using factor 2. PI with factor 3 and 4 were used only with the 32-channel coil. Visual quality was assessed using four parameters. Quantitative analyses were performed using temporal noise, contrast-to-noise and signal-to-noise ratios (CNR, SNR). Compared with 12-channel acquisition, the scores for 32-channel acquisition were significantly higher for overall visual quality, lower for noise and higher for SNR and CNR. With the 32-channel coil, artifact compromise achieved the best score with PI factor 2. Noise increased, SNR and CNR decreased with PI factor. However mSENSE 2 scores were not always significantly different from acquisition without PI. For PASL at 3T, the 32-channel coil at 3T provided better quality than the 12-channel coil. With the 32-channel coil, mSENSE 2 seemed to offer the best compromise for decreasing artifacts without significantly reducing SNR, CNR. Copyright © 2012 Wiley Periodicals, Inc.
A comparison of five benchmarks
NASA Technical Reports Server (NTRS)
Huss, Janice E.; Pennline, James A.
1987-01-01
Five benchmark programs were obtained and run on the NASA Lewis CRAY X-MP/24. A comparison was made between the program codes and between the methods for calculating performance figures. Several multitasking jobs were run to gain experience in how parallel performance is measured.
DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors.
Schmollinger, Martin; Nieselt, Kay; Kaufmann, Michael; Morgenstern, Burkhard
2004-09-09
Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic that splits sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. By distributing sub-routines to multiple processors, the running time of DIALIGN can be substantially improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
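Approach (a) relies on the pairwise alignments being independent of one another, which can be sketched as a parallel map over all sequence pairs. The scoring function below is a hypothetical placeholder and is not the DIALIGN algorithm; the function names and the use of a Python thread pool are likewise assumptions for illustration (CPU-bound work in practice would use separate processes or a message-passing library).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def toy_score(pair):
    """Hypothetical placeholder for a pairwise alignment: counts positions
    at which two strings carry the same character. NOT the DIALIGN method."""
    a, b = pair
    return sum(1 for x, y in zip(a, b) if x == y)


def all_pairs_scores(sequences, n_workers=4):
    """Score every unordered pair of sequences, spreading pairs over workers."""
    index_pairs = list(combinations(range(len(sequences)), 2))
    seq_pairs = [(sequences[i], sequences[j]) for i, j in index_pairs]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map returns results in input order, so the output does
        # not depend on which worker handled which pair.
        scores = list(pool.map(toy_score, seq_pairs))
    return dict(zip(index_pairs, scores))
```

For n sequences there are n(n-1)/2 independent pairs, so the amount of parallel work grows quadratically with the number of sequences while the result stays identical for any worker count.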
Scalable Domain Decomposed Monte Carlo Particle Transport
NASA Astrophysics Data System (ADS)
O'Brien, Matthew Joseph
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are:
• Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node.
• Load Balancing: keeps the workload per processor as even as possible so the calculation runs efficiently.
• Global Particle Find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain.
• Visualizing constructive solid geometry, sourcing particles, deciding that particle streaming communication is completed and spatial redecomposition.
These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms up to 2 million MPI processes on the Sequoia supercomputer.
NASA Astrophysics Data System (ADS)
Calafiura, Paolo; Leggett, Charles; Seuster, Rolf; Tsulaia, Vakhtang; Van Gemmeren, Peter
2015-12-01
AthenaMP is a multi-process version of the ATLAS reconstruction, simulation and data analysis framework Athena. By leveraging Linux fork and copy-on-write mechanisms, it allows for sharing of memory pages between event processors running on the same compute node with little to no change in the application code. Originally targeted to optimize the memory footprint of reconstruction jobs, AthenaMP has demonstrated that it can reduce the memory usage of certain configurations of ATLAS production jobs by a factor of 2. AthenaMP has also evolved to become the parallel event-processing core of the recently developed ATLAS infrastructure for fine-grained event processing (Event Service) which allows the running of AthenaMP inside massively parallel distributed applications on hundreds of compute nodes simultaneously. We present the architecture of AthenaMP, various strategies implemented by AthenaMP for scheduling workload to worker processes (for example: Shared Event Queue and Shared Distributor of Event Tokens) and the usage of AthenaMP in the diversity of ATLAS event processing workloads on various computing resources: Grid, opportunistic resources and HPC.
1060-nm VCSEL-based parallel-optical modules for optical interconnects
NASA Astrophysics Data System (ADS)
Nishimura, N.; Nagashima, K.; Kise, T.; Rizky, A. F.; Uemura, T.; Nekado, Y.; Ishikawa, Y.; Nasu, H.
2015-03-01
The capability of mounting a parallel-optical module onto a PCB through a solder-reflow process helps to reduce the number of piece parts, simplify the assembly process, and minimize the footprint for both AOC and on-board applications. We introduce solder-reflow-capable parallel-optical modules employing 1060-nm InGaAs/GaAs VCSELs, which offer the advantages of wider modulation bandwidth, longer transmission distance, and higher reliability. We demonstrate 4-channel parallel optical link performance operated with a 28 Gb/s 2^31-1 PRBS bit stream on each channel and transmitted through a 50-μm-core MMF beyond 500 m. We also introduce a new mounting technology for the parallel-optical module that maintains good coupling and a robust electrical connection between the optical module and a polymer-waveguide-embedded PCB during the solder-reflow process.
One-dimensional acoustic standing waves in rectangular channels for flow cytometry.
Austin Suthanthiraraj, Pearlson P; Piyasena, Menake E; Woods, Travis A; Naivar, Mark A; López, Gabriel P; Graves, Steven W
2012-07-01
Flow cytometry has become a powerful analytical tool for applications ranging from blood diagnostics to high throughput screening of molecular assemblies on microsphere arrays. However, instrument size, expense, throughput, and consumable use limit its use in resource poor areas of the world, as a component in environmental monitoring, and for detection of very rare cell populations. For these reasons, new technologies to improve the size and cost-to-performance ratio of flow cytometry are required. One such technology is the use of acoustic standing waves that efficiently concentrate cells and particles to the center of flow channels for analysis. The simplest form of this method uses one-dimensional acoustic standing waves to focus particles in rectangular channels. We have developed one-dimensional acoustic focusing flow channels that can be fabricated in simple capillary devices or easily microfabricated using photolithography and deep reactive ion etching. Image and video analysis demonstrates that these channels precisely focus single flowing streams of particles and cells for traditional flow cytometry analysis. Additionally, use of standing waves with increasing harmonics and in parallel microfabricated channels is shown to effectively create many parallel focused streams. Furthermore, we present the fabrication of an inexpensive optical platform for flow cytometry in rectangular channels and use of the system to provide precise analysis. The simplicity and low-cost of the acoustic focusing devices developed here promise to be effective for flow cytometers that have reduced size, cost, and consumable use. Finally, the straightforward path to parallel flow streams using one-dimensional multinode acoustic focusing, indicates that simple acoustic focusing in rectangular channels may also have a prominent role in high-throughput flow cytometry. Copyright © 2012 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Li, Husheng; Betz, Sharon M.; Poor, H. Vincent
2007-05-01
This paper examines the performance of decision feedback based iterative channel estimation and multiuser detection in channel coded aperiodic DS-CDMA systems operating over multipath fading channels. First, explicit expressions describing the performance of channel estimation and parallel interference cancellation based multiuser detection are developed. These results are then combined to characterize the evolution of the performance of a system that iterates among channel estimation, multiuser detection and channel decoding. Sufficient conditions for convergence of this system to a unique fixed point are developed.
Active parallel redundancy for electronic integrator-type control circuits
NASA Technical Reports Server (NTRS)
Peterson, R. A.
1971-01-01
Circuit extends concept of redundant feedback control from type-0 to type-1 control systems. Inactive channels are slaves to the active channel; if the latter fails, it is rejected and a slave channel is activated. High reliability and elimination of single-component catastrophic failure are important in closed-loop control systems.
Parallel Evolution of Sperm Hyper-Activation Ca2+ Channels.
Cooper, Jacob C; Phadnis, Nitin
2017-07-01
Sperm hyper-activation is a dramatic change in sperm behavior where mature sperm burst into a final sprint in the race to the egg. The mechanism of sperm hyper-activation in many metazoans, including humans, consists of a jolt of Ca2+ into the sperm flagellum via CatSper ion channels. Surprisingly, all nine CatSper genes have been independently lost in several animal lineages. In Drosophila, sperm hyper-activation is performed through the cooption of the polycystic kidney disease 2 (pkd2) Ca2+ channel. The parallels between CatSpers in primates and pkd2 in Drosophila provide a unique opportunity to examine the molecular evolution of the sperm hyper-activation machinery in two independent, nonhomologous calcium channels separated by > 500 million years of divergence. Here, we use a comprehensive phylogenomic approach to investigate the selective pressures on these sperm hyper-activation channels. First, we find that the entire CatSper complex evolves rapidly under recurrent positive selection in primates. Second, we find that pkd2 has parallel patterns of adaptive evolution in Drosophila. Third, we show that this adaptive evolution of pkd2 is driven by its role in sperm hyper-activation. These patterns of selection suggest that the evolution of the sperm hyper-activation machinery is driven by sexual conflict with antagonistic ligands that modulate channel activity. Together, our results add sperm hyper-activation channels to the class of fast evolving reproductive proteins and provide insights into the mechanisms used by the sexes to manipulate sperm behavior. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Parallel computing for automated model calibration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burke, John S.; Danielson, Gary R.; Schulz, Douglas A.
2002-07-29
Natural resources model calibration is a significant burden on computing and staff resources in modeling efforts. Most assessments must consider multiple calibration objectives (for example magnitude and timing of stream flow peak). An automated calibration process that allows real time updating of data/models, allowing scientists to focus effort on improving models, is needed. We are in the process of building a fully featured multi objective calibration tool capable of processing multiple models cheaply and efficiently using null cycle computing. Our parallel processing and calibration software routines have been written generically, but our focus has been on natural resources model calibration. So far, the natural resources models have been friendly to parallel calibration efforts in that they require no inter-process communication, only need a small amount of input data and only output a small amount of statistical information for each calibration run. A typical auto calibration run might involve running a model 10,000 times with a variety of input parameters and summary statistical output. In the past model calibration has been done against individual models for each data set. The individual model runs are relatively fast, ranging from seconds to minutes. The process was run on a single computer using a simple iterative process. We have completed two Auto Calibration prototypes and are currently designing a more feature rich tool. Our prototypes have focused on running the calibration in a distributed computing cross platform environment. They allow incorporation of "smart" calibration parameter generation (using artificial intelligence processing techniques). Null cycle computing similar to SETI@Home has also been a focus of our efforts. This paper details the design of the latest prototype and discusses our plans for the next revision of the software.
VCSELs for datacom applications
NASA Astrophysics Data System (ADS)
Wipiejewski, Torsten; Wolf, Hans-Dieter; Korte, Lutz; Huber, Wolfgang; Kristen, Guenter; Hoyler, Charlotte; Hedrich, Harald; Kleinbub, Oliver; Albrecht, Tony; Mueller, Juergen; Orth, Andreas; Spika, Zeljko; Lutgen, Stephan; Pflaeging, Hartwig; Harrasser, Joerg; Droegemueller, Karsten; Plickert, Volker; Kuhl, Detlef; Blank, Juergen; Pietsch, Doris; Stange, Herwig; Karstensen, Holger
1999-04-01
The use of oxide-confined VCSELs in datacom applications is demonstrated. The devices exhibit low threshold currents of approximately 3 mA and a low electrical series resistance of about 50 Ω. The emission wavelength is in the 850 nm range. Lifetimes of the devices are several million hours under normal operating conditions. VCSEL arrays are employed in a high-performance parallel optical link called PAROLI™. This optical link provides 12 parallel channels with a total bandwidth exceeding 12 Gbit/s. The VCSELs optimized for the parallel optical link show excellent threshold current uniformity between channels of < 50 µA. The array lifetime drops compared to a single device, but is still larger than 1 million hours.
Processes Leading to Beaded Channels Formation in Central Yakutia
NASA Astrophysics Data System (ADS)
Tarbeeva, A. M.; Lebedeva, L.; Efremov, V. S.; Krylenko, I. V.; Surkov, V. V.
2017-12-01
Beaded channels, consisting of deepened and widened pools connected by narrow runs, are common fluvial forms in permafrost regions. Recent studies have shown that beaded channels are very important for connecting alluvial rivers with headwater lakes, providing fish passage and foraging habitat, and regulating river runoff. Beaded channels are known as typical thermokarst landforms; however, there is no evidence of their origin and formative processes. Geomorphological analyses of beaded channels have been completed in several permafrost regions, including field observations of the Shestakovka River in Central Yakutia. The study aims to identify the modern exogenic processes and formative mechanisms of beaded river channels. We show that the beaded channel of the Shestakovka River forms in perennially frozen sand with low ice content, leading us to hypothesize that thermokarst is not the main process of formation. Due to the significant volume of water, the pools do not freeze over entirely during winter, even under harsh climatic conditions. As a result, lenses of pressurized water remain under the surface ice, underlain by perennially thawed sediments. The presence of thawed sediments under the pools and frozen sediments under the runs leads to uneven thermoerosion of the riverbed during floods, producing the beaded form of the channel. In addition, freezing of pools during winter increases pressure under the ice cover and forms ice mounds, which crack several times during winter and disturb the riverbanks. Many 1st to 3rd order streams have a specific transitional meandering-to-beaded form resembling the shape of unconfined meandering rivers, but consisting of pools and runs. However, such channels exhibit no evidence of present-day erosion of concave banks or sediment accumulation at convex banks, as is typically observed in normally meandering rivers.
Such channel forms indicate that they were formed by greater channel-forming flow discharges in the past. The transition to the beaded channel planform took place only later, presumably as a result of climate change. Reduction of water runoff and freezing over of taliks led to activation of cryogenic processes (thermokarst, uneven thermoerosion, disturbance of riverbanks during the cracking of ice mounds).
Experimental demonstration of subcarrier multiplexed quantum key distribution system.
Mora, José; Ruiz-Alba, Antonio; Amaya, Waldimar; Martínez, Alfonso; García-Muñoz, Víctor; Calvo, David; Capmany, José
2012-06-01
We provide, to our knowledge, the first experimental demonstration of the feasibility of sending several parallel keys by exploiting the technique of subcarrier multiplexing (SCM) widely employed in microwave photonics. This approach brings several advantages, such as high spectral efficiency compatible with the actual secure key rates, the sharing of the faint optical pulse by all the quantum multiplexed channels, which reduces the system complexity, and the possibility of upgrading with wavelength division multiplexing in a two-tier scheme to increase the number of parallel keys. Two independent quantum SCM channels featuring a sifted key rate of 10 kb/s/channel over a link with a quantum bit error rate <2% are reported.
An information theory of image gathering
NASA Technical Reports Server (NTRS)
Fales, Carl L.; Huck, Friedrich O.
1991-01-01
Shannon's mathematical theory of communication is extended to image gathering. Expressions are obtained for the total information that is received with a single image-gathering channel and with parallel channels. It is concluded that the aliased signal components carry information even though these components interfere with the within-passband components in conventional image gathering and restoration, thereby degrading the fidelity and visual quality of the restored image. An examination of the expression for minimum mean-square-error, or Wiener-matrix, restoration from parallel image-gathering channels reveals a method for unscrambling the within-passband and aliased signal components to restore spatial frequencies beyond the sampling passband out to the spatial frequency response cutoff of the optical aperture.
Using the Parallel Computing Toolbox with MATLAB on the Peregrine System
parallel pool took %g seconds.\n', toc)
% "single program multiple data"
spmd
  fprintf('Worker %d says Hello World!\n', labindex)
end
delete(gcp); % close the parallel pool
exit
To run the script on a compute node, create the file helloWorld.sub:
#!/bin/bash
#PBS -l walltime=05:00
#PBS -l nodes=1
#PBS -N
NASA Astrophysics Data System (ADS)
Sullivan, C.; Good, R. G. R.; Binns, A. D.
2017-12-01
Sediment transport processes in streams provide valuable insight into the temporal evolution of planform and bedform geometry. The majority of previous experimental research has focused on bedload transport and corresponding bedform development in rectangular, confined channels, which does not account for planform adjustment processes in streams. In contrast, research conducted with laboratory streams having movable banks can investigate planform development in addition to bedform development, which is more representative of natural streams. The goal of this research is to explore the relationship between bedload transport rates and morphological adjustments in meandering streams. To accomplish this, a series of experimental runs was conducted in a 5.6 m by 1.9 m river basin flume at the University of Guelph to analyze the bedload impacts on bed formations and planform adjustments in response to varying flow conditions. In total, three experimental runs were conducted: two runs using steady-state conditions and one run using unsteady flow conditions in the form of a symmetrical hydrograph implementing quasi-steady-state flow. The runs were performed in a series of time-steps in order to monitor the evolution of the stream morphology and the bedload transport rates. Structure from motion (SfM) was utilized to capture the channel morphology after each time-step, and Agisoft PhotoScan software was used to produce digital elevation models to analyze the morphological evolution of the channel with time. Bedload transport rates were quantified using a sediment catch at the end of the flume. Although total flow volumes were similar for each run, the morphological evolution and bedload transport rates varied between runs. The observed bedload transport rates from the flume are compared with existing bedload transport formulas to assess their accuracy with respect to sediment transport in unconfined meandering channels.
The measured sediment transport rates differed from those predicted by the existing equations, which can be attributed to the sediment characteristics, planform morphology and bed formations. The results from this research provide greater knowledge of morphological processes in natural meandering streams to improve the capabilities of computational modelling and river engineering practice.
Results of closed cycle MHD power generation test with a helium-cesium working fluid
NASA Technical Reports Server (NTRS)
Sovie, R. J.
1977-01-01
The cross-sectional dimensions of the MHD channel in the NASA Lewis closed loop facility were reduced to 3.8 x 11.4 cm. Tests were run in this channel using a helium-cesium working fluid at stagnation pressures of 160,000 N/m2, stagnation temperatures of 2000-2060 K and an entrance Mach number of 0.36. In these tests Faraday open circuit voltages of 200 V were measured, which correspond to a Faraday field of 1750 V/m. Power generation tests were run for different groups of electrode configurations and channel lengths. Hall fields up to 1450 V/m were generated. Power extraction per electrode of 183 W and power densities of 1.7 MW/m3 were obtained. A total power output of 2 kW was generated for tests with 14 electrodes. The power densities obtained in this channel represent a factor of 3 improvement over those previously reported for the M = 0.2 channel.
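The quoted Faraday field can be checked with one line of arithmetic. The sketch below assumes that the open circuit voltage is measured across the 11.4 cm dimension of the channel; the abstract does not state which dimension separates the electrodes, so that choice is an assumption made here.

```python
# Values taken from the abstract.
open_circuit_voltage_v = 200.0   # Faraday open circuit voltage, V
# Assumption: the 11.4 cm side is the electrode separation.
electrode_gap_m = 0.114          # 11.4 cm expressed in metres

# Average field = voltage / separation.
faraday_field_v_per_m = open_circuit_voltage_v / electrode_gap_m

print(f"Faraday field: {faraday_field_v_per_m:.0f} V/m")
```

Under that assumption the result is within about 5 V/m of the 1750 V/m reported, which suggests the figure in the abstract is a rounded value.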
Increasing airport capacity with modified IFR approach procedures for close-spaced parallel runways
DOT National Transportation Integrated Search
2001-01-01
Because of wake turbulence considerations, current instrument approach procedures treat close-spaced (i.e., less than 2,500 feet apart) parallel runways as a single runway. This restriction is designed to assure safety for all aircraft types u...
Parallel Computation of the Regional Ocean Modeling System (ROMS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, P; Song, Y T; Chao, Y
2005-04-05
The Regional Ocean Modeling System (ROMS) is a regional ocean general circulation modeling system solving the free surface, hydrostatic, primitive equations over varying topography. It is free software distributed world-wide for studying both complex coastal ocean problems and the basin-to-global scale ocean circulation. The original ROMS code could only be run on shared-memory systems. With the increasing need to simulate larger model domains with finer resolutions and on a variety of computer platforms, there is a need in the ocean-modeling community to have a ROMS code that can be run on any parallel computer ranging from 10 to hundreds of processors. Recently, we have explored parallelization for ROMS using the MPI programming model. In this paper, an efficient parallelization strategy for such a large-scale scientific software package, based on an existing shared-memory computing model, is presented. In addition, scientific applications and data-performance issues on a couple of SGI systems, including Columbia, the world's third-fastest supercomputer, are discussed.
Besnier, Francois; Glover, Kevin A.
2013-01-01
This software package provides an R-based framework to make use of multi-core computers when running analyses in the population genetics program STRUCTURE. It is especially addressed to those users of STRUCTURE dealing with numerous and repeated data analyses, and who could take advantage of an efficient script to automatically distribute STRUCTURE jobs among multiple processors. It also includes additional functions to divide analyses among combinations of populations within a single data set without the need to manually produce multiple projects, as is currently the case in STRUCTURE. The package consists of two main functions, MPI_structure() and parallel_structure(), as well as an example data file. We compared the computing time for this example data on two computer architectures and showed that the use of the present functions can result in several-fold improvements in computation time. ParallelStructure is freely available at https://r-forge.r-project.org/projects/parallstructure/. PMID:23923012
NASA Astrophysics Data System (ADS)
Gardiner, B. L.; Thomson, D. J.
2006-12-01
Starting with the designs of earlier solar radio telescopes, particularly the one at Bell Labs, Murray Hill, we have built a new instrument. The major differences between this telescope and its predecessors are that it has: 1) parallel low and high gain channels for both polarizations; 2) four additional channels for active interference cancellation; and 3) all eight IF strips terminating in 100 MHz, 14-bit analog-to-digital converters with synchronized sampling. The advantages of such a configuration are: a) The parallel low and high gain channels allow a higher dynamic range without saturating than a single channel. b) Estimating bispectra between the channels gives a sensitive test for saturation in the higher gain channel. c) In the usual case, when both channels are in their linear region, one can use them with a noise injection diode to track the amplifier noise figures. d) With the noise diode off, the two channels can be used in a mode similar to remote reference. As the telescope is operating in a small city, we anticipate that more than 90% of the measurements will be contaminated by various communications signals and impulsive noise. Thus all the signal processing will build on various robust statistical procedures that have proven effective in other applications. The best mode of operating the four active interference cancelling channels is still under study.
Encoding methods for B1+ mapping in parallel transmit systems at ultra high field
NASA Astrophysics Data System (ADS)
Tse, Desmond H. Y.; Poole, Michael S.; Magill, Arthur W.; Felder, Jörg; Brenner, Daniel; Jon Shah, N.
2014-08-01
Parallel radiofrequency (RF) transmission, either in the form of RF shimming or pulse design, has been proposed as a solution to the B1+ inhomogeneity problem in ultra high field magnetic resonance imaging. As a prerequisite, accurate B1+ maps from each of the available transmit channels are required. In this work, four different encoding methods for B1+ mapping, namely 1-channel-on, all-channels-on-except-1, all-channels-on-1-inverted and Fourier phase encoding, were evaluated using dual refocusing acquisition mode (DREAM) at 9.4 T. Fourier phase encoding was demonstrated in both phantom and in vivo to be the least susceptible to artefacts caused by destructive RF interference at 9.4 T. Unlike the other two interferometric encoding schemes, Fourier phase encoding showed negligible dependency on the initial RF phase setting and therefore no prior B1+ knowledge is required. Fourier phase encoding also provides a flexible way to increase the number of measurements to increase SNR, and to allow further reduction of artefacts by weighted decoding. These advantages of Fourier phase encoding suggest that it is a good choice for B1+ mapping in parallel transmit systems at ultra high field.
Memory access in shared virtual memory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berrendorf, R.
1992-01-01
Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.
Memory access in shared virtual memory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berrendorf, R.
1992-09-01
Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.
Twelve Channel Optical Fiber Connector Assembly: From Commercial Off the Shelf to Space Flight Use
NASA Technical Reports Server (NTRS)
Ott, Melaine N.
1998-01-01
The commercial off the shelf (COTS) twelve channel optical fiber MTP array connector and ribbon cable assembly is being validated for space flight use, and the results of this study to date are presented here. The interconnection system implemented for the Parallel Fiber Optic Data Bus (PFODB) physical layer will include a 100/140 micron diameter optical fiber in the cable configuration, among other enhancements. As part of this investigation, the COTS 62.5/125 micron optical fiber cable assembly has been characterized for space environment performance as a baseline for improving the performance of the 100/140 micron diameter ribbon cable for the Parallel FODB application. Presented here are the testing and results of random vibration and thermal environmental characterization of this COTS MTP twelve channel ribbon cable assembly. This paper is the first in a series of papers which will characterize and document the performance of Parallel FODB's physical layer from COTS to space flight worthy.
SISYPHUS: A high performance seismic inversion factory
NASA Astrophysics Data System (ADS)
Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas
2016-04-01
In recent years, massively parallel high-performance computers have become the standard instruments for solving forward and inverse problems in seismology. The software packages for forward and inverse waveform modelling designed specifically for such computers (SPECFEM3D, SES3D) have become mature and widely available. These packages achieve significant computational performance and provide researchers with an opportunity to solve problems of bigger size at higher resolution within a shorter time. However, a typical seismic inversion process contains various activities that are beyond the common solver functionality. They include management of information on seismic events and stations, 3D models, observed and synthetic seismograms, pre-processing of the observed signals, computation of misfits and adjoint sources, minimization of misfits, and process workflow management. These activities are time consuming, seldom sufficiently automated, and therefore represent a bottleneck that can substantially offset performance benefits provided by even the most powerful modern supercomputers. Furthermore, the typical system architecture of modern supercomputing platforms is oriented towards maximum computational performance and provides limited standard facilities for automation of the supporting activities. We present a prototype solution that automates all aspects of the seismic inversion process and is tuned for modern massively parallel high-performance computing systems. We address several major aspects of the solution architecture, which include (1) design of an inversion state database for tracing all relevant aspects of the entire solution process, (2) design of an extensible workflow management framework, (3) integration with wave propagation solvers, (4) integration with optimization packages, (5) computation of misfits and adjoint sources, and (6) process monitoring.
The inversion state database represents a hierarchical structure with branches for the static process setup, inversion iterations, and solver runs, each branch specifying information at the event, station and channel levels. The workflow management framework is based on an embedded scripting engine that allows definition of various workflow scenarios using a high-level scripting language and provides access to all available inversion components represented as standard library functions. At present the SES3D wave propagation solver is integrated in the solution; the work is in progress for interfacing with SPECFEM3D. A separate framework is designed for interoperability with an optimization module; the workflow manager and optimization process run in parallel and cooperate by exchanging messages according to a specially designed protocol. A library of high-performance modules implementing signal pre-processing, misfit and adjoint computations according to established good practices is included. Monitoring is based on information stored in the inversion state database and at present implements a command line interface; design of a graphical user interface is in progress. The software design fits well into the common massively parallel system architecture featuring a large number of computational nodes running distributed applications under control of batch-oriented resource managers. The solution prototype has been implemented on the "Piz Daint" supercomputer provided by the Swiss Supercomputing Centre (CSCS).
Variants of Independence in the Perception of Facial Identity and Expression
ERIC Educational Resources Information Center
Fitousi, Daniel; Wenger, Michael J.
2013-01-01
A prominent theory in the face perception literature--the parallel-route hypothesis (Bruce & Young, 1986)--assumes a dedicated channel for the processing of identity that is separate and independent from the channel(s) in which nonidentity information is processed (e.g., expression, eye gaze). The current work subjected this assumption to…
Lee, Pil Hyong; Hwang, Sang Soon
2009-01-01
In fuel cells, flow configuration and operating conditions such as cell temperature, humidity at each electrode and stoichiometric number are crucial for improving performance. Too many flow channels could enhance the performance but result in high parasitic loss. Therefore, a trade-off between pressure drop and efficiency of a fuel cell should be considered for optimum design. This work focused on numerical simulation of the effects of operating conditions, especially cathode humidity, with simple micro parallel flow channels. The humidity at the cathode flow channel is known to be very important for enhancing the ion conductivity of the polymer membrane, because a fully humidified condition is normally set at the anode. To investigate the effect of humidity on the performance of a fuel cell, in this study humidification was set to 100% at the anode flow channel and was varied from 0 to 100% at the cathode flow channel. Results showed that the maximum power density could be obtained under the 60% humidified condition at the cathode, where oxygen concentration was moderately high while high ion conductivity was maintained in the membrane. PMID:22291556
Darcy Flow in a Wavy Channel Filled with a Porous Medium
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gray, Donald D; Ogretim, Egemen; Bromhal, Grant S
2013-05-17
Flow in channels bounded by wavy or corrugated walls is of interest in both technological and geological contexts. This paper presents an analytical solution for the steady Darcy flow of an incompressible fluid through a homogeneous, isotropic porous medium filling a channel bounded by symmetric wavy walls. This packed channel may represent an idealized packed fracture, a situation which is of interest as a potential pathway for the leakage of carbon dioxide from a geological sequestration site. The channel walls change from parallel planes, to small amplitude sine waves, to large amplitude nonsinusoidal waves as certain parameters are increased. The direction of gravity is arbitrary. A plot of piezometric head against distance in the direction of mean flow changes from a straight line for parallel planes to a series of steeply sloping sections in the reaches of small aperture alternating with nearly constant sections in the large aperture bulges. Expressions are given for the stream function, specific discharge, piezometric head, and pressure.
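The qualitative shape of the head profile described above can be illustrated with a one-dimensional sketch. This is not the paper's analytical solution: it assumes a slowly varying aperture b(x), a uniform hydraulic conductivity K, and a constant discharge Q per unit depth, all of which are symbols introduced here for illustration only.

```latex
% Assumed 1-D approximation, not the paper's two-dimensional solution.
% q: specific discharge, h: piezometric head, K: hydraulic conductivity,
% b(x): local aperture, Q: discharge per unit depth (constant by continuity).
\begin{align}
  q(x) &= -K \,\frac{dh}{dx}, \\
  Q &= q(x)\, b(x)
  \quad\Longrightarrow\quad
  \frac{dh}{dx} = -\frac{Q}{K\, b(x)} .
\end{align}
```

With parallel planes b is constant and h falls linearly with distance. Where b is small the head gradient steepens, and in the wide bulges it flattens, which matches the trend reported in the abstract.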
Drug innovation, price controls, and parallel trade.
Matteucci, Giorgio; Reverberi, Pierfrancesco
2016-12-21
We study the long-run welfare effects of parallel trade (PT) in pharmaceuticals. We develop a two-country model of PT with endogenous quality, where the pharmaceutical firm negotiates the price of the drug with the government in the foreign country. We show that, even though the foreign government does not consider global R&D costs, (the threat of) PT improves the quality of the drug as long as the foreign consumers' valuation of quality is high enough. We find that the firm's short-run profit may be higher when PT is allowed. Nonetheless, this is neither necessary nor sufficient for improving drug quality in the long run. We also show that improving drug quality is a sufficient condition for PT to increase global welfare. Finally, we show that, when PT is allowed, drug quality may be higher with than without price controls.
Product selectivity control induced by using liquid-liquid parallel laminar flow in a microreactor.
Amemiya, Fumihiro; Matsumoto, Hideyuki; Fuse, Keishi; Kashiwagi, Tsuneo; Kuroda, Chiaki; Fuchigami, Toshio; Atobe, Mahito
2011-06-07
Product selectivity control based on a liquid-liquid parallel laminar flow has been successfully demonstrated by using a microreactor. Our electrochemical microreactor system enables regioselective cross-coupling reaction of aldehyde with allylic chloride via chemoselective cathodic reduction of substrate by the combined use of suitable flow mode and corresponding cathode material. The formation of liquid-liquid parallel laminar flow in the microreactor was supported by the estimation of benzaldehyde diffusion coefficient and computational fluid dynamics simulation. The diffusion coefficient for benzaldehyde in Bu4NClO4-HMPA medium was determined to be 1.32 × 10⁻⁷ cm² s⁻¹ by electrochemical measurements, and the flow simulation using this value revealed the formation of a clear concentration gradient of benzaldehyde in the microreactor channel over a specific channel length. In addition, the necessity of the liquid-liquid parallel laminar flow was confirmed by flow mode experiments.
Learning and Parallelization Boost Constraint Search
ERIC Educational Resources Information Center
Yun, Xi
2013-01-01
Constraint satisfaction problems are a powerful way to abstract and represent academic and real-world problems from both artificial intelligence and operations research. A constraint satisfaction problem is typically addressed by a sequential constraint solver running on a single processor. Rather than construct a new, parallel solver, this work…
Lee, Si Hoon; Lindquist, Nathan C.; Wittenberg, Nathan J.; Jordan, Luke R.; Oh, Sang-Hyun
2012-01-01
With recent advances in high-throughput proteomics and systems biology, there is a growing demand for new instruments that can precisely quantify a wide range of receptor-ligand binding kinetics in a high-throughput fashion. Here we demonstrate a surface plasmon resonance (SPR) imaging spectroscopy instrument capable of extracting binding kinetics and affinities from 50 parallel microfluidic channels simultaneously. The instrument utilizes large-area (~cm²) metallic nanohole arrays as SPR sensing substrates and combines a broadband light source, a high-resolution imaging spectrometer and a low-noise CCD camera to extract spectral information from every channel in real time with a refractive index resolution of 7.7 × 10⁻⁶. To demonstrate the utility of our instrument for quantifying a wide range of biomolecular interactions, each parallel microfluidic channel is coated with a biomimetic supported lipid membrane containing ganglioside (GM1) receptors. The binding kinetics of cholera toxin b (CTX-b) to GM1 are then measured in a single experiment from 50 channels. By combining the highly parallel microfluidic device with large-area periodic nanohole array chips, our SPR imaging spectrometer system enables high-throughput, label-free, real-time SPR biosensing, and its full-spectral imaging capability combined with nanohole arrays could enable integration of SPR imaging with concurrent surface-enhanced Raman spectroscopy. PMID:22895607
The Automated Instrumentation and Monitoring System (AIMS) reference manual
NASA Technical Reports Server (NTRS)
Yan, Jerry; Hontalas, Philip; Listgarten, Sherry
1993-01-01
Whether a researcher is designing the 'next parallel programming paradigm,' another 'scalable multiprocessor' or investigating resource allocation algorithms for multiprocessors, a facility that enables parallel program execution to be captured and displayed is invaluable. Careful analysis of execution traces can help computer designers and software architects to uncover system behavior and to take advantage of specific application characteristics and hardware features. A software tool kit that facilitates performance evaluation of parallel applications on multiprocessors is described. The Automated Instrumentation and Monitoring System (AIMS) has four major software components: a source code instrumentor, which automatically inserts active event recorders into the program's source code before compilation; a run-time performance-monitoring library, which collects performance data; a trace file animation and analysis tool kit, which reconstructs program execution from the trace file; and a trace post-processor, which compensates for data collection overhead. Besides being used as a prototype for developing new techniques for instrumenting, monitoring, and visualizing parallel program execution, AIMS is also being incorporated into the run-time environments of various hardware test beds to evaluate their impact on user productivity. Currently, AIMS instrumentors accept FORTRAN and C parallel programs written for Intel's NX operating system on the iPSC family of multicomputers. A run-time performance-monitoring library for the iPSC/860 is included in this release. We plan to release monitors for other platforms (such as PVM and TMC's CM-5) in the near future. Performance data collected can be graphically displayed on workstations (e.g. Sun Sparc and SGI) supporting X-Windows (in particular, X11R5, Motif 1.1.3).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dritz, K.W.; Boyle, J.M.
This paper addresses the problem of measuring and analyzing the performance of fine-grained parallel programs running on shared-memory multiprocessors. Such processors use locking (either directly in the application program, or indirectly in a subroutine library or the operating system) to serialize accesses to global variables. Given sufficiently high rates of locking, the chief factor preventing linear speedup (besides lack of adequate inherent parallelism in the application) is lock contention: the blocking of processes that are trying to acquire a lock currently held by another process. We show how a high-resolution, low-overhead clock may be used to measure both lock contention and lack of parallel work. Several ways of presenting the results are covered, culminating in a method for calculating, in a single multiprocessing run, both the speedup actually achieved and the speedup lost to contention for each lock and to lack of parallel work. The speedup losses are reported in the same units, "processor-equivalents," as the speedup achieved. Both are obtained without having to perform the usual one-process comparison run. We also chronicle a variety of experiments motivated by actual results obtained with our measurement method. The insights into program performance that we gained from these experiments helped us to refine the parts of our programs concerned with communication and synchronization. Ultimately these improvements reduced lock contention to a negligible amount and yielded nearly linear speedup in applications not limited by lack of parallel work. We describe two generally applicable strategies ("code motion out of critical regions" and "critical-region fissioning") for reducing lock contention and one ("lock/variable fusion") applicable only on certain architectures.
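One way to read the "processor-equivalents" accounting is sketched below. It is an illustration under stated assumptions, not the authors' method: it supposes that each process records how long it was blocked on each lock and how long it sat idle for lack of work, and that all remaining processor time is useful work. The function name, argument names, and lock names are hypothetical.

```python
def processor_equivalents(wall_time, n_procs, lock_wait, idle_wait):
    """Split n_procs into speedup achieved and speedup lost (assumed accounting).

    wall_time : elapsed seconds of the single multiprocessing run
    n_procs   : number of processes in the run
    lock_wait : dict of lock name -> total seconds all processes spent blocked on it
    idle_wait : total seconds all processes spent with no parallel work available
    """
    # Time lost on each lock, expressed as a number of processors' worth.
    lost_to_locks = {name: t / wall_time for name, t in lock_wait.items()}
    # Time lost to lack of parallel work, in the same units.
    lost_to_idle = idle_wait / wall_time
    # Whatever is left of the n_procs processors did useful work.
    achieved = n_procs - sum(lost_to_locks.values()) - lost_to_idle
    return achieved, lost_to_locks, lost_to_idle


# Hypothetical measurements from one 8-process run lasting 10 seconds.
achieved, lost_to_locks, lost_to_idle = processor_equivalents(
    wall_time=10.0,
    n_procs=8,
    lock_wait={"work_queue": 15.0, "heap": 5.0},
    idle_wait=10.0,
)
```

In this accounting the achieved speedup and the losses always sum to the number of processes, which is what allows them to be reported in the same units from a single run.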
Visualization of Octree Adaptive Mesh Refinement (AMR) in Astrophysical Simulations
NASA Astrophysics Data System (ADS)
Labadens, M.; Chapon, D.; Pomaréde, D.; Teyssier, R.
2012-09-01
Computer simulations are important in current cosmological research. These simulations run in parallel on thousands of processors and produce huge amounts of data. Adaptive mesh refinement is used to reduce the computing cost while keeping good numerical accuracy in regions of interest. RAMSES is a cosmological code developed by the Commissariat à l'énergie atomique et aux énergies alternatives (English: Atomic Energy and Alternative Energies Commission) which uses Octree adaptive mesh refinement. Compared to grid-based AMR, the Octree AMR has the advantage of fitting the adaptive resolution of the grid very precisely to the local problem complexity. However, this specific octree data type needs specific software to be visualized, as generic visualization tools work on Cartesian grid data types. This is why the PYMSES software has also been developed by our team. It relies on the Python scripting language to ensure modular and easy access for exploring these specific data. In order to take advantage of the High Performance Computer which runs the RAMSES simulation, it also uses MPI and multiprocessing to run some parallel code. We present our PYMSES software in more detail, with some performance benchmarks. PYMSES currently has two visualization techniques which work directly on the AMR. The first one is a splatting technique, and the second one is a custom ray tracing technique. Both have their own advantages and drawbacks. We have also compared two parallel programming techniques: the Python multiprocessing library and MPI. The load balancing strategy has to be smartly defined in order to achieve a good speed up in our computation. Results obtained with this software are illustrated in the context of a massive, 9000-processor parallel simulation of a Milky Way-like galaxy.
Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael
2012-06-01
We present l₁-SPIRiT, a simple algorithm for auto calibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative self-consistent parallel imaging (SPIRiT). Like many iterative magnetic resonance imaging reconstructions, l₁-SPIRiT's image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing l₁-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of l₁-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT spoiled gradient echo (SPGR) sequence with up to 8× acceleration via Poisson-disc undersampling in the two phase-encoded directions.
Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael
2012-01-01
We present ℓ1-SPIRiT, a simple algorithm for auto calibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the Wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative Self-Consistent Parallel Imaging (SPIRiT). Like many iterative MRI reconstructions, ℓ1-SPIRiT’s image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing ℓ1-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of ℓ1-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT Spoiled Gradient Echo (SPGR) sequence with up to 8× acceleration via poisson-disc undersampling in the two phase-encoded directions. PMID:22345529
Can parallel use of different running shoes decrease running-related injury risk?
Malisoux, L; Ramesh, J; Mann, R; Seil, R; Urhausen, A; Theisen, D
2015-02-01
The aim of this study was to determine if runners who use concomitantly different pairs of running shoes are at a lower risk of running-related injury (RRI). Recreational runners (n = 264) participated in this 22-week prospective follow-up and reported all information about their running session characteristics, other sport participation and injuries on a dedicated Internet platform. A RRI was defined as a physical pain or complaint located at the lower limbs or lower back region, sustained during or as a result of running practice and impeding planned running activity for at least 1 day. One-third of the participants (n = 87) experienced at least one RRI during the observation period. The adjusted Cox regression analysis revealed that the parallel use of more than one pair of running shoes was a protective factor [hazard ratio (HR) = 0.614; 95% confidence interval (CI) = 0.389-0.969], while previous injury was a risk factor (HR = 1.722; 95%CI = 1.114-2.661). Additionally, increased mean session distance (km; HR = 0.795; 95%CI = 0.725-0.872) and increased weekly volume of other sports (h/week; HR = 0.848; 95%CI = 0.732-0.982) were associated with lower RRI risk. Multiple shoe use and participation in other sports are strategies potentially leading to a variation of the load applied to the musculoskeletal system. They could be advised to recreational runners to prevent RRI. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Cao, Jianfang; Chen, Lichao; Wang, Min; Tian, Yun
2018-01-01
The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve the runtime and edge detection performance of the Canny operator, in this paper, we propose a parallel design and implementation for an Otsu-optimized Canny operator using a MapReduce parallel programming model that runs on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator's dual threshold and improve the edge detection performance, while the MapReduce parallel programming model facilitates parallel processing for the Canny operator to solve the processing speed and communication cost problems that occur when the Canny edge detection algorithm is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images. Overall, our approach speeds up processing by approximately 3.4 times on large-scale datasets, which demonstrates the superiority of our method. The proposed algorithm demonstrates both better edge detection performance and improved time performance.
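The role of the Otsu step can be illustrated with a small serial sketch in Python. It computes Otsu's threshold from a grayscale histogram by maximizing the between-class variance, then derives a pair of Canny thresholds from it. The 0.5 ratio between the low and high thresholds and the function names are assumptions made for illustration; the abstract does not say how the dual threshold is derived, and the MapReduce distribution is not shown.

```python
def otsu_threshold(hist):
    """Return the gray level t that maximizes between-class variance.

    hist[i] is the number of pixels with gray level i; pixels with
    level <= t form the background class.
    """
    total = sum(hist)
    weighted_sum_all = sum(level * count for level, count in enumerate(hist))
    weight_bg = 0
    weighted_sum_bg = 0.0
    best_t, best_score = 0, -1.0
    for t, count in enumerate(hist):
        weight_bg += count
        weighted_sum_bg += t * count
        if weight_bg == 0:
            continue            # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break               # no foreground pixels left
        mean_bg = weighted_sum_bg / weight_bg
        mean_fg = (weighted_sum_all - weighted_sum_bg) / weight_fg
        # Proportional to the between-class variance.
        score = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def canny_thresholds(hist, low_ratio=0.5):
    """Assumed rule: high = Otsu threshold, low = low_ratio * high."""
    high = otsu_threshold(hist)
    return low_ratio * high, high


if __name__ == "__main__":
    hist = [0] * 256
    hist[50], hist[200] = 100, 100   # two well-separated gray levels
    print(canny_thresholds(hist))
```

In a MapReduce setting a computation like this would run once per image inside a map task, which is why it parallelizes without communication between images.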
Architectures for reasoning in parallel
NASA Technical Reports Server (NTRS)
Hall, Lawrence O.
1989-01-01
The research conducted has dealt with rule-based expert systems. The algorithms that may lead to effective parallelization of them were investigated. Both the forward and backward chained control paradigms were investigated in the course of this work. The best computer architecture for the developed and investigated algorithms has been researched. Two experimental vehicles were developed to facilitate this research. They are Backpac, a parallel backward chained rule-based reasoning system, and Datapac, a parallel forward chained rule-based reasoning system. Both systems have been written in Multilisp, a version of Lisp which contains the parallel construct, future. Applying the future function to a function causes the function to become a task parallel to the spawning task. Additionally, Backpac and Datapac have been run on several disparate parallel processors. The machines are an Encore Multimax with 10 processors, the Concert Multiprocessor with 64 processors, and a 32 processor BBN GP1000. Both the Concert and the GP1000 are switch-based machines. The Multimax has all its processors hung off a common bus. All are shared memory machines, but have different schemes for sharing the memory and different locales for the shared memory. The main results of the investigations come from experiments on the 10 processor Encore and the Concert with partitions of 32 or fewer processors. Additionally, experiments have been run with a stripped down version of EMYCIN.
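Multilisp's future has a loose analogue in Python's concurrent.futures module, which can be used to sketch how a backward chained goal might spawn its subgoals as parallel tasks. This is an analogy only, not a rendering of Backpac: the rule base, the facts, and the function names are hypothetical, and only the top-level subgoals are spawned so that a bounded thread pool cannot deadlock on nested waits.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rule base: a goal holds if all of its subgoals hold.
RULES = {
    "can_fly": ["has_wings", "is_light"],
    "is_light": ["hollow_bones"],
}
FACTS = {"has_wings", "hollow_bones"}


def prove(goal):
    """Sequential backward chaining over RULES and FACTS."""
    if goal in FACTS:
        return True
    subgoals = RULES.get(goal)
    if subgoals is None:
        return False
    return all(prove(g) for g in subgoals)


def prove_parallel(goal, max_workers=4):
    """Prove the subgoals of one goal as parallel tasks.

    Each pool.submit call plays the role of Multilisp's future: it
    returns a placeholder at once, and result() blocks until the task
    has finished.
    """
    if goal in FACTS:
        return True
    subgoals = RULES.get(goal)
    if subgoals is None:
        return False
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(prove, g) for g in subgoals]
        return all(f.result() for f in futures)


if __name__ == "__main__":
    print(prove_parallel("can_fly"), prove_parallel("can_swim"))
```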
The Tera Multithreaded Architecture and Unstructured Meshes
NASA Technical Reports Server (NTRS)
Bokhari, Shahid H.; Mavriplis, Dimitri J.
1998-01-01
The Tera Multithreaded Architecture (MTA) is a new parallel supercomputer currently being installed at San Diego Supercomputing Center (SDSC). This machine has an architecture quite different from contemporary parallel machines. The computational processor is a custom design and the machine uses hardware to support very fine grained multithreading. The main memory is shared, hardware randomized and flat. These features make the machine highly suited to the execution of unstructured mesh problems, which are difficult to parallelize on other architectures. We report the results of a study carried out during July-August 1998 to evaluate the execution of EUL3D, a code that solves the Euler equations on an unstructured mesh, on the 2 processor Tera MTA at SDSC. Our investigation shows that parallelization of an unstructured code is extremely easy on the Tera. We were able to get an existing parallel code (designed for a shared memory machine), running on the Tera by changing only the compiler directives. Furthermore, a serial version of this code was compiled to run in parallel on the Tera by judicious use of directives to invoke the "full/empty" tag bits of the machine to obtain synchronization. This version achieves 212 and 406 Mflop/s on one and two processors respectively, and requires no attention to partitioning or placement of data issues that would be of paramount importance in other parallel architectures.
Parallel Ray Tracing Using the Message Passing Interface
2007-09-01
software is available for lens design and for general optical systems modeling. It tends to be designed to run on a single processor and can be very...Cameron, Senior Member, IEEE Abstract—Ray-tracing software is available for lens design and for general optical systems modeling. It tends to be designed to...National Aeronautics and Space Administration (NASA), optical ray tracing, parallel computing, parallel pro- cessing, prime numbers, ray tracing
PISCES: An environment for parallel scientific computation
NASA Technical Reports Server (NTRS)
Pratt, T. W.
1985-01-01
The parallel implementation of scientific computing environment (PISCES) is a project to provide high-level programming environments for parallel MIMD computers. Pisces 1, the first of these environments, is a FORTRAN 77 based environment which runs under the UNIX operating system. The Pisces 1 user programs in Pisces FORTRAN, an extension of FORTRAN 77 for parallel processing. The major emphasis in the Pisces 1 design is in providing a carefully specified virtual machine that defines the run-time environment within which Pisces FORTRAN programs are executed. Each implementation then provides the same virtual machine, regardless of differences in the underlying architecture. The design is intended to be portable to a variety of architectures. Currently Pisces 1 is implemented on a network of Apollo workstations and on a DEC VAX uniprocessor via simulation of the task level parallelism. An implementation for the Flexible Computing Corp. FLEX/32 is under construction. An introduction to the Pisces 1 virtual computer and the FORTRAN 77 extensions is presented. An example of an algorithm for the iterative solution of a system of equations is given. The most notable features of the design are the provision for several granularities of parallelism in programs and the provision of a window mechanism for distributed access to large arrays of data.
Framework for Parallel Preprocessing of Microarray Data Using Hadoop
2018-01-01
Nowadays, microarray technology has become one of the popular ways to study gene expression and diagnosis of disease. National Center for Biology Information (NCBI) hosts public databases containing large volumes of biological data required to be preprocessed, since they carry high levels of noise and bias. Robust Multiarray Average (RMA) is one of the standard and popular methods that is utilized to preprocess the data and remove the noises. Most of the preprocessing algorithms are time-consuming and not able to handle a large number of datasets with thousands of experiments. Parallel processing can be used to address the above-mentioned issues. Hadoop is a well-known and ideal distributed file system framework that provides a parallel environment to run the experiment. In this research, for the first time, the capability of Hadoop and statistical power of R have been leveraged to parallelize the available preprocessing algorithm called RMA to efficiently process microarray data. The experiment has been run on cluster containing 5 nodes, while each node has 16 cores and 16 GB memory. It compares efficiency and the performance of parallelized RMA using Hadoop with parallelized RMA using affyPara package as well as sequential RMA. The result shows the speed-up rate of the proposed approach outperforms the sequential approach and affyPara approach. PMID:29796018
A real-time MPEG software decoder using a portable message-passing library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kwong, Man Kam; Tang, P.T. Peter; Lin, Biquan
1995-12-31
We present a real-time MPEG software decoder that uses message-passing libraries such as MPL, p4 and MPI. The parallel MPEG decoder currently runs on the IBM SP system but can be easily ported to other parallel machines. This paper discusses our parallel MPEG decoding algorithm as well as the parallel programming environment in which it runs. Several technical issues are discussed, including balancing of decoding speed, memory limitation, I/O capacities, and optimization of MPEG decoding components. This project shows that a real-time portable software MPEG decoder is feasible on a general-purpose parallel machine.
5. Aerial view of turnpike path running through center of ...
5. Aerial view of turnpike path running through center of photograph along row of trees. 1917 realignment visible along left edge of photograph along edge of forest. Modernized alignment resumes at top right of photograph. View looking north. - Orange Turnpike, Parallel to new Orange Turnpike, Monroe, Orange County, NY
NASA Astrophysics Data System (ADS)
Dinesh, K. K.; Jayaraj, S.
2008-10-01
The present paper deals with the temperature-driven mass deposition rate of particles, known as thermophoretic wall flux, when a hot flue gas in natural convection flows through a cooled isothermal vertical parallel plate channel. The study finds application in particle filters used to trap soot particles from post-combustion gases issuing from small furnaces with low technical implications. The governing equations are solved using a finite difference marching technique with channel inlet values as initial values. Channel heights required to regain hydrostatic pressure at the exit are estimated for various entry velocities. The effect of the temperature ratio between wall and gas on thermophoretic wall flux is analysed, and the wall flux is found to increase as the temperature ratio decreases. Results are compared with published works wherever possible and can be used to predict the particle deposition rate as well as the conditions favourable for maximum particle deposition rate.
Seismic signals of snow-slurry lahars in motion: 25 September 2007, Mt Ruapehu, New Zealand
NASA Astrophysics Data System (ADS)
Cole, S. E.; Cronin, S. J.; Sherburn, S.; Manville, V.
2009-05-01
Detection of ground shaking forms the basis of many lahar-warning systems. Seismic records of two lahar types at Ruapehu, New Zealand, in 2007 are used to examine their nature and internal dynamics. Upstream detection of a flow depends upon flow type and coupling with the ground. 3-D characteristics of seismic signals can be used to distinguish the dominant rheology and gross physical composition. Water-rich hyperconcentrated flows are turbulent; common inter-particle and particle-substrate collisions engender higher energy in cross-channel vibrations relative to channel-parallel. Plug-like snow-slurry lahars show greater energy in channel-parallel signals, due to lateral deposition insulating channel margins, and low turbulence. Direct comparison of flow size must account for flow rheology; a water-rich lahar will generate signals of greater amplitude than a similar-sized snow-slurry flow.
Tonomura, W; Moriguchi, H; Jimbo, Y; Konishi, S
2008-01-01
This paper describes an advanced Micro Channel Array (MCA) for recording from a neuronal network at multiple points simultaneously. The developed MCA is designed for neuronal network analysis, which has been studied by the co-authors using the MEA (Micro Electrode Arrays) system. The MCA employs the principle of extracellular recording. The presented MCA has the following advantages. First of all, the electrodes integrated around individual micro channels are electrically isolated for parallel multipoint recording. Sucking and clamping of cells through micro channels is expected to improve the cellular selectivity and S/N ratio. In this study, hippocampal neurons were cultured on the developed MCA. As a result, the spontaneous and evoked spike potentials could be recorded by sucking and clamping the cells at multiple points. Herein, we describe the successful experimental results together with the design and fabrication of the advanced MCA toward on-chip analysis of neuronal networks.
Monson, H.O.
1961-01-24
A radiator-type fuel block assembly is described. It has a hexagonal body of neutron fissionable material having a plurality of longitudinal equally spaced coolant channels therein aligned in rows parallel to each face of the hexagonal body. Each of these coolant channels is hexagonally shaped with the corners rounded and enlarged and the assembly has a maximum temperature isothermal line around each channel which is approximately straight and equidistant between adjacent channels.
Polarization division multiplexing for optical data communications
NASA Astrophysics Data System (ADS)
Ivanovich, Darko; Powell, Samuel B.; Gruev, Viktor; Chamberlain, Roger D.
2018-02-01
Multiple parallel channels are ubiquitous in optical communications, with spatial division multiplexing (separate physical paths) and wavelength division multiplexing (separate optical wavelengths) being the most common forms. Here, we investigate the viability of polarization division multiplexing, the separation of distinct parallel optical communication channels through the polarization properties of light. Two or more linearly polarized optical signals (at different polarization angles) are transmitted through a common medium, filtered using aluminum nanowire optical filters fabricated on-chip, and received using individual silicon photodetectors (one per channel). The entire receiver (including optics) is compatible with standard CMOS fabrication processes. The filter model is based upon an input optical signal formed as the sum of the Stokes vectors for each individual channel, transformed by the Mueller matrix that models the filter proper, resulting in an output optical signal that impinges on each photodiode. The results show that two- and three-channel systems can operate with a fixed-threshold comparator in the receiver circuit, but four-channel systems (and larger) will require channel coding of some form. For example, in the four-channel system, 10 of 16 distinct bit patterns are separable by the receiver. The model supports investigation of the range of variability tolerable in the fabrication of the on-chip polarization filters.
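The filter model described above (a sum of Stokes vectors transformed by a Mueller matrix) can be sketched in Python for the idealized case. The assumptions are loud ones: the channels are mutually incoherent, each filter is a perfect linear polarizer rather than a fabricated nanowire filter with finite extinction, and every function name is hypothetical. The sketch does not attempt to reproduce the paper's separability results.

```python
import math


def stokes_linear(intensity, angle):
    """Stokes vector of fully linearly polarized light at `angle` (radians)."""
    return [intensity,
            intensity * math.cos(2 * angle),
            intensity * math.sin(2 * angle),
            0.0]


def mueller_linear_polarizer(angle):
    """Mueller matrix of an ideal linear polarizer with its axis at `angle`."""
    c, s = math.cos(2 * angle), math.sin(2 * angle)
    return [[0.5,     0.5 * c,     0.5 * s,     0.0],
            [0.5 * c, 0.5 * c * c, 0.5 * c * s, 0.0],
            [0.5 * s, 0.5 * c * s, 0.5 * s * s, 0.0],
            [0.0,     0.0,         0.0,         0.0]]


def received_intensities(bits, tx_angles, rx_angles):
    """Intensity at each photodetector for one transmitted bit pattern."""
    # Incoherent channels combine by adding their Stokes vectors.
    combined = [0.0] * 4
    for bit, angle in zip(bits, tx_angles):
        if bit:
            channel = stokes_linear(1.0, angle)
            combined = [a + b for a, b in zip(combined, channel)]
    intensities = []
    for angle in rx_angles:
        m = mueller_linear_polarizer(angle)
        out = [sum(m[r][k] * combined[k] for k in range(4)) for r in range(4)]
        intensities.append(out[0])   # a photodiode measures S0 only
    return intensities


if __name__ == "__main__":
    angles = [0.0, math.pi / 2]
    for bits in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(bits, received_intensities(bits, angles, angles))
```

With two orthogonal angles each detector sees only its own channel. With three or more angles the channels can no longer all be orthogonal, so each detector also receives a cos² share of its neighbours, which is consistent with the abstract's remark that larger systems need more than a fixed threshold.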
Complementary spin transistor using a quantum well channel.
Park, Youn Ho; Choi, Jun Woo; Kim, Hyung-Jun; Chang, Joonyeon; Han, Suk Hee; Choi, Heon-Jin; Koo, Hyun Cheol
2017-04-20
In order to utilize the spin field effect transistor in logic applications, the development of two types of complementary transistors, which play roles of the n- and p-type conventional charge transistors, is an essential prerequisite. In this research, we demonstrate complementary spin transistors consisting of two types of devices, namely parallel and antiparallel spin transistors using InAs based quantum well channels and exchange-biased ferromagnetic electrodes. In these spin transistors, the magnetization directions of the source and drain electrodes are parallel or antiparallel, respectively, depending on the exchange bias field direction. Using this scheme, we also realize a complementary logic operation purely with spin transistors controlled by the gate voltage, without any additional n- or p-channel transistor.
Komarov, Ivan; D'Souza, Roshan M
2012-01-01
The Gillespie Stochastic Simulation Algorithm (GSSA) and its variants are cornerstone techniques to simulate reaction kinetics in situations where the concentration of the reactant is too low to allow deterministic techniques such as differential equations. The inherent limitations of the GSSA include the time required for executing a single run and the need for multiple runs for parameter sweep exercises due to the stochastic nature of the simulation. Even very efficient variants of GSSA are prohibitively expensive to compute and perform parameter sweeps. Here we present a novel variant of the exact GSSA that is amenable to acceleration by using graphics processing units (GPUs). We parallelize the execution of a single realization across threads in a warp (fine-grained parallelism). A warp is a collection of threads that are executed synchronously on a single multi-processor. Warps executing in parallel on different multi-processors (coarse-grained parallelism) simultaneously generate multiple trajectories. Novel data-structures and algorithms reduce memory traffic, which is the bottleneck in computing the GSSA. Our benchmarks show an 8×-120× performance gain over various state-of-the-art serial algorithms when simulating different types of models.
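For orientation, the baseline that such work accelerates can be sketched as a serial Gillespie direct method in Python. This is the textbook algorithm, not the GPU variant presented in the paper, and the function name and the (propensity function, state change) representation of reactions are assumptions made for illustration.

```python
import math
import random


def gillespie_direct(initial_state, reactions, t_end, rng):
    """Simulate one trajectory with the Gillespie direct method.

    reactions: list of (propensity_fn, change) pairs, where
    propensity_fn(state) returns a rate and change maps a species
    name to its integer increment when the reaction fires.
    """
    state = dict(initial_state)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        propensities = [fn(state) for fn, _ in reactions]
        total = sum(propensities)
        if total <= 0.0:
            break                       # nothing can fire any more
        # Exponentially distributed waiting time with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Pick a reaction with probability proportional to its propensity.
        r = rng.random() * total
        cumulative = 0.0
        for p, (_, change) in zip(propensities, reactions):
            cumulative += p
            if r < cumulative:
                break
        for species, delta in change.items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


if __name__ == "__main__":
    # A -> B with rate constant 1.0 per molecule.
    reactions = [(lambda s: 1.0 * s["A"], {"A": -1, "B": +1})]
    traj = gillespie_direct({"A": 50, "B": 0}, reactions, 1000.0, random.Random(1))
    print(len(traj), traj[-1])
```

Each step needs the full propensity list and a running sum over it, which gives a feel for why memory traffic, rather than arithmetic, is described as the bottleneck.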
Parallel distributed, reciprocal Monte Carlo radiation in coupled, large eddy combustion simulations
NASA Astrophysics Data System (ADS)
Hunsaker, Isaac L.
Radiation is the dominant mode of heat transfer in high temperature combustion environments. Radiative heat transfer affects the gas and particle phases, including all the associated combustion chemistry. The radiative properties are in turn affected by the turbulent flow field. This bi-directional coupling of radiation-turbulence interactions poses a major challenge in creating parallel-capable, high-fidelity combustion simulations. In this work, a new model was developed in which reciprocal Monte Carlo radiation was coupled with a turbulent, large-eddy simulation combustion model. A technique wherein domain patches are stitched together was implemented to allow for scalable parallelism. The combustion model runs in parallel on a decomposed domain. The radiation model runs in parallel on a recomposed domain. The recomposed domain is stored on each processor after information sharing of the decomposed domain is handled via the message passing interface. Verification and validation testing of the new radiation model were favorable. Strong scaling analyses were performed on the Ember cluster and the Titan cluster for the CPU-radiation model and GPU-radiation model, respectively. The model demonstrated strong scaling to over 1,700 and 16,000 processing cores on Ember and Titan, respectively.
Efficient Helicopter Aerodynamic and Aeroacoustic Predictions on Parallel Computers
NASA Technical Reports Server (NTRS)
Wissink, Andrew M.; Lyrintzis, Anastasios S.; Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak
1996-01-01
This paper presents parallel implementations of two codes used in a combined CFD/Kirchhoff methodology to predict the aerodynamics and aeroacoustics properties of helicopters. The rotorcraft Navier-Stokes code, TURNS, computes the aerodynamic flowfield near the helicopter blades and the Kirchhoff acoustics code computes the noise in the far field, using the TURNS solution as input. The overall parallel strategy adds MPI message passing calls to the existing serial codes to allow for communication between processors. As a result, the total code modifications required for parallel execution are relatively small. The biggest bottleneck in running the TURNS code in parallel comes from the LU-SGS algorithm that solves the implicit system of equations. We use a new hybrid domain decomposition implementation of LU-SGS to obtain good parallel performance on the SP-2. TURNS demonstrates excellent parallel speedups for quasi-steady and unsteady three-dimensional calculations of a helicopter blade in forward flight. The execution rate attained by the code on 114 processors is six times faster than the same cases run on one processor of the Cray C-90. The parallel Kirchhoff code also shows excellent parallel speedups and fast execution rates. As a performance demonstration, unsteady acoustic pressures are computed at 1886 far-field observer locations for a sample acoustics problem. The calculation requires over two hundred hours of CPU time on one C-90 processor but takes only a few hours on 80 processors of the SP2. The resultant far-field acoustic field is analyzed with state of-the-art audio and video rendering of the propagating acoustic signals.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Volkov, M V; Garanin, S G; Dolgopolov, Yu V
2014-11-30
A seven-channel fibre laser system operated by the master oscillator – multichannel power amplifier scheme is phase locked using a stochastic parallel gradient algorithm. The phase modulators on lithium niobate crystals are controlled by a multichannel electronic unit with a microcontroller processing signals in real time. Dynamic phase locking of the laser system with a bandwidth of 14 kHz is demonstrated; the phasing time is 3 – 4 ms. (fibre and integrated-optical structures)
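The idea behind a stochastic parallel gradient algorithm can be sketched in Python on a toy model. Everything here is an assumption for illustration: the combining metric (normalized on-axis intensity of unit-amplitude beams), the static phase errors, the two-sided perturbation, and the gain and perturbation size are not taken from the paper, and no real-time hardware behaviour is modelled.

```python
import cmath
import math
import random


def combining_metric(phase_errors, controls):
    """Normalized combined intensity; equals 1.0 when all phases agree."""
    n = len(phase_errors)
    field = sum(cmath.exp(1j * (p + u)) for p, u in zip(phase_errors, controls))
    return abs(field) ** 2 / n ** 2


def spgd(phase_errors, iterations, gain, delta, rng):
    """Stochastic parallel gradient ascent on the combining metric."""
    controls = [0.0] * len(phase_errors)
    for _ in range(iterations):
        # Perturb all channels at once with random signs.
        perturb = [delta if rng.random() < 0.5 else -delta for _ in controls]
        plus = combining_metric(
            phase_errors, [u + d for u, d in zip(controls, perturb)])
        minus = combining_metric(
            phase_errors, [u - d for u, d in zip(controls, perturb)])
        # The metric difference times the perturbation estimates the gradient.
        change = plus - minus
        controls = [u + gain * change * d for u, d in zip(controls, perturb)]
    return controls


if __name__ == "__main__":
    rng = random.Random(0)
    errors = [rng.uniform(-math.pi, math.pi) for _ in range(7)]
    before = combining_metric(errors, [0.0] * 7)
    controls = spgd(errors, iterations=2000, gain=10.0, delta=0.1, rng=rng)
    print(before, combining_metric(errors, controls))
```

The appeal of the scheme is that all channels are updated from a single scalar measurement, so the controller needs one detector rather than one phase sensor per channel.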
NASA Astrophysics Data System (ADS)
Dietterich, H. R.; Cashman, K. V.
2011-12-01
Hawaiian lava channels are characterized by numerous bifurcations and confluences that have important implications for flow behavior. The ubiquity of anastomosing flows, and their detailed observation over time, makes Hawai`i an ideal place to investigate the formation of these features and their effect on simple models of lava flow emplacement. Using a combination of high-resolution LiDAR data from the Kilauea December 1974 and Mauna Loa 1984 flows, orthoimagery of the Mauna Loa 1859 flow, and historical and InSAR mapping of the current eruption of Kilauea (1983-present), we quantify the geometry of distributary, anastomosing, and simple channel networks and compare these to flow advance rates and lengths. We use a pre-eruptive DEM of the Mauna Loa 1984 flow created from aerial photographs to investigate the relationship between underlying topography and channel morphology. In the Mauna Loa 1984 flow, the slope of the pre-eruptive surface correlates with the number of parallel channels. Slopes >4° generate up to thirteen parallel channels in contrast to slopes of <4° that produce fewer than eight parallel channels. In the 1983-1986 lava flows erupted from Pu`u `O`o, average effusion rate correlates with the number of bifurcations, each producing a new parallel channel. Flows with a volume flux <60 m³/s only have one bifurcation at most in the entire flow, while flows with a volume flux >60 m³/s contain up to four bifurcations. These data show that the splitting and merging of individual flows is a product of both the underlying ground surface and eruption rate. Important properties of the pre-eruptive topography include both the slope and the scale of surface roughness. We suggest that a crucial control is the height of the flow front in comparison to the scale of local topography and roughness. Greater slopes may create more active channels because the reduced flow thickness allows interaction with local obstacles of a greater size range.
Conversely, higher viscosities could reduce the number of active channels by increasing the flow thickness. The effusion rate also influences the degree of flow branching, possibly by generating overflows and widening the flow. Branched channels can also rejoin at confluences, which occur on the leeward sides of obstacles and where the flow is confined against large-scale features, including fault scarps and older flow margins. We expect the maintenance of parallel channels past an obstacle that splits the flow to be a function of the slope and flux, which drives the flow downhill and governs the formation of levees. Our data reveal that by controlling the effective lava flux, bifurcations slow flow advance and restrict flow length. We postulate that flow branching may therefore restrict most Mauna Loa flow lengths to ~25 km, despite a wide range of effusion rates. In contrast, both confluences and the shut off of an active branch accelerate the flow. The complexity of Hawaiian flows has largely been ignored in predictive models of flow emplacement in Hawaii, but the flow geometries must be incorporated to improve syn-eruptive prediction of lava flow behavior.
Population annealing with weighted averages: A Monte Carlo method for rough free-energy landscapes
NASA Astrophysics Data System (ADS)
Machta, J.
2010-08-01
The population annealing algorithm introduced by Hukushima and Iba is described. Population annealing combines simulated annealing and Boltzmann weighted differential reproduction within a population of replicas to sample equilibrium states. Population annealing gives direct access to the free energy. It is shown that unbiased measurements of observables can be obtained by weighted averages over many runs with weight factors related to the free-energy estimate from the run. Population annealing is well suited to parallelization and may be a useful alternative to parallel tempering for systems with rough free-energy landscapes such as spin glasses. The method is demonstrated for spin glasses.
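The structure of population annealing can be sketched in Python on a toy system with a handful of discrete states. This is a simplified illustration under stated assumptions, not the algorithm as specified in the paper: the population is resampled to a fixed size with multinomial resampling, the equilibrating move is a uniform-proposal Metropolis step, and all names and the example energies are hypothetical.

```python
import math
import random


def population_annealing(energies, betas, pop_size, rng, sweeps=5):
    """Anneal a population of replicas and estimate ln Z at the last beta.

    energies[s] is the energy of discrete state s; betas is an
    increasing list of inverse temperatures starting above zero.
    """
    n_states = len(energies)
    # At beta = 0 every state is equally likely and Z = n_states.
    population = [rng.randrange(n_states) for _ in range(pop_size)]
    log_z = math.log(n_states)
    beta = 0.0
    for new_beta in betas:
        step = new_beta - beta
        # Boltzmann reweighting factors for the temperature step.
        weights = [math.exp(-step * energies[s]) for s in population]
        # The mean weight estimates Z(new_beta) / Z(beta).
        log_z += math.log(sum(weights) / len(weights))
        # Differential reproduction: copy replicas in proportion to weight.
        population = rng.choices(population, weights=weights, k=pop_size)
        beta = new_beta
        # Metropolis updates at the new temperature.
        for i in range(pop_size):
            s = population[i]
            for _ in range(sweeps):
                proposal = rng.randrange(n_states)
                d_e = energies[proposal] - energies[s]
                if d_e <= 0 or rng.random() < math.exp(-beta * d_e):
                    s = proposal
            population[i] = s
    return log_z, population


if __name__ == "__main__":
    energies = [0.0, 1.5, 0.3, 2.0, 0.8, 2.5]
    betas = [k / 10 for k in range(1, 21)]
    log_z, _ = population_annealing(energies, betas, 4000, random.Random(0))
    exact = math.log(sum(math.exp(-betas[-1] * e) for e in energies))
    print(log_z, exact)
```

The running sum of log mean weights is what gives the method its direct access to the free energy, since the free energy follows from ln Z divided by the inverse temperature with a sign change. Each replica is updated independently between resampling steps, which is the source of the easy parallelism.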
Local rollback for fault-tolerance in parallel computing systems
Blumrich, Matthias A [Yorktown Heights, NY; Chen, Dong [Yorktown Heights, NY; Gara, Alan [Yorktown Heights, NY; Giampapa, Mark E [Yorktown Heights, NY; Heidelberger, Philip [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Steinmacher-Burow, Burkhard [Boeblingen, DE; Sugavanam, Krishnan [Yorktown Heights, NY
2012-01-24
A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.
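The decision logic in this abstract can be illustrated with a short software sketch. It is only a schematic reading of the text: the patent describes a hardware control logic device, and the callback interface, the `max_restarts` limit and the "escalate" outcome below are assumptions introduced for the example.

```python
def run_with_local_rollback(run_interval, max_restarts=3):
    """Run one rollback interval, restarting it on recoverable errors.

    `run_interval` is a hypothetical callback that executes the instructions
    of the interval and returns a pair (error, unrecoverable).
    Returns (outcome, number_of_restarts).
    """
    restarts = 0
    while True:
        error, unrecoverable = run_interval()
        if not error:
            # No error detected: the interval's results can be kept.
            return "committed", restarts
        if unrecoverable or restarts >= max_restarts:
            # Local rollback cannot help; hand over to some other
            # recovery mechanism (not specified in the abstract).
            return "escalate", restarts
        # Error without an unrecoverable condition: restart the interval.
        restarts += 1
```

A caller would wrap each interval of work in such a function and fall back to a coarser recovery scheme only when the outcome is "escalate".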
The seasonal-cycle climate model
NASA Technical Reports Server (NTRS)
Marx, L.; Randall, D. A.
1981-01-01
The seasonal cycle run which will become the control run for the comparison with runs utilizing codes and parameterizations developed by outside investigators is discussed. The climate model currently exists in two parallel versions: one running on the Amdahl and the other running on the CYBER 203. These two versions are as nearly identical as machine capability and the requirement for high speed performance will allow. Developmental changes are made on the Amdahl/CMS version for ease of testing and rapidity of turnaround. The changes are subsequently incorporated into the CYBER 203 version using vectorization techniques where speed improvement can be realized. The 400 day seasonal cycle run serves as a control run for both medium and long range climate forecasts and sensitivity studies.
Performance of the NOνA Data Acquisition and Trigger Systems for the full 14 kT Far Detector
NASA Astrophysics Data System (ADS)
Norman, A.; Davies, G. S.; Ding, P. F.; Dukes, E. C.; Duyan, H.; Frank, M. J.; R. C. Group; Habig, A.; Henderson, W.; Niner, E.; Mina, R.; Moren, A.; Mualem, L.; Oksuzian, Y.; Rebel, B.; Shanahan, P.; Sheshukov, A.; Tamsett, M.; Tomsen, K.; Vinton, L.; Wang, Z.; Zamorano, B.; Zirnstien, J.
2015-12-01
The NOvA experiment uses a continuous, free-running, dead-timeless data acquisition system to collect data from the 14 kT far detector. The DAQ system reads out the more than 344,000 detector channels and assembles the information into a raw, unfiltered, high-bandwidth data stream. The NOvA trigger systems operate in parallel to the readout and asynchronously to the primary DAQ readout/event building chain. The data-driven triggering systems for NOvA are unique in that they examine long contiguous time windows of the high-resolution readout data and enable the detector to be sensitive to a wide range of physics interactions, from those with fast, nanosecond-scale signals up to processes with long delayed coincidences between hits, which occur at the tens-of-milliseconds time scale. The trigger system is able to achieve a true 100% live time for the detector, making it sensitive to both beam spill related and off-spill physics.
The solid solution K3.84Ni0.78Fe3.19(PO4)5
Strutynska, Nataliia Yu.; Ogorodnyk, Ivan V.; Livitska, Oksana V.; Baumer, Vyacheslav N.; Slobodyanik, Nikolay S.
2014-01-01
The title compound, tetrapotassium tetra[nickel(II)/iron(III)] pentakis(orthophosphate), K3.84Ni0.78Fe3.19(PO4)5, has been obtained from a flux. The structure is isotypic with that of K4MgFe3(PO4)5. The three-dimensional framework is built up from (Ni/Fe)O5 trigonal bipyramids with a mixed Fe:Ni occupancy of 0.799 (8):0.196 (10) and isolated PO4 tetrahedra, one of which is on a general position and one of which has -4.. site symmetry. Two K+ cations are statistically occupied and are distributed over two positions in hexagonally shaped channels that run parallel to [001]. One K+ cation [occupancy 0.73 (3)] is surrounded by nine O atoms, while the other K+ cation [occupancy 0.23 (3)] is surrounded by eight O atoms. PMID:25161510
NASA Astrophysics Data System (ADS)
Zhang, Xuan; Jia, Li; Dang, Chao; Peng, Qi
2018-02-01
A simultaneous visualization and measurement experiment was carried out to investigate condensation flow patterns and condensing heat transfer characteristics of refrigerant R141b in parallel horizontal multi-channels with a liquid-vapor separator. The hydraulic diameter of each channel was 1.5 mm and the channel length was 100 mm. The refrigerant vapor flowing in the small channels was cooled by cooling water. The parallel horizontal multi-channels were covered with transparent silica glass for visualization of the flow patterns. Experiments were performed at different inlet superheat temperatures (ranging from 3°C to 7°C). Mass velocity was in the range of 35.56 kg m^-2 s^-1 to 82.37 kg m^-2 s^-1. Three different flow patterns were observed through the multi-channels as the mass velocity increased. The flow patterns in each channel pass tended to be almost the same, and all of them were annular flows. The efficiency of the U-type liquid-vapor separator was related to the vapor mass velocity and the pressure in the small channels. It was also found that the heat transfer coefficient increased with the mass velocity while the cooling water mass flow rate increased; it rose to a peak and then decreased. It also increased with superheat in the low superheat temperature region.
On Channel-Discontinuity-Constraint Routing in Wireless Networks
Sankararaman, Swaminathan; Efrat, Alon; Ramasubramanian, Srinivasan; Agarwal, Pankaj K.
2011-01-01
Multi-channel wireless networks are increasingly deployed as infrastructure networks, e.g. in metro areas. Network nodes frequently employ directional antennas to improve spatial throughput. In such networks, between two nodes, it is of interest to compute a path with a channel assignment for the links such that the path and link bandwidths are the same. This is achieved when any two consecutive links are assigned different channels, termed the "Channel-Discontinuity-Constraint" (CDC). CDC-paths are also useful in TDMA systems, where, preferably, consecutive links are assigned different time-slots. In the first part of this paper, we develop a t-spanner for CDC-paths using spatial properties; a sub-network containing O(n/θ) links, for any θ > 0, such that CDC-paths increase in cost by at most a factor t = (1 − 2 sin(θ/2))^(−2). We propose a novel distributed algorithm to compute the spanner using an expected number of O(n log n) fixed-size messages. In the second part, we present a distributed algorithm to find minimum-cost CDC-paths between two nodes using O(n^2) fixed-size messages, by developing an extension of Edmonds' algorithm for minimum-cost perfect matching. In a centralized implementation, our algorithm runs in O(n^2) time, improving on the previous best algorithm, which requires O(n^3) running time. Moreover, this running time improves to O(n/θ) when used in conjunction with the spanner developed. PMID:24443646
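To make the constraint concrete, the sketch below searches for a cheapest route in which no two consecutive links share a channel, by running Dijkstra's algorithm over (node, last channel) states. This is not the matching-based algorithm of the paper: it is a simpler, swapped-in technique that returns the cost of the cheapest walk, which may revisit a node. The link format (undirected tuples of node, node, channel, cost) and the function names are assumptions made for the example. The second function simply evaluates the stretch factor quoted in the abstract.

```python
import heapq
import math


def min_cost_cdc_walk(links, source, target):
    """Cheapest source-to-target cost with consecutive links on different channels.

    `links` is an iterable of (u, v, channel, cost) tuples, treated as undirected.
    Returns the cost, or None if no such route exists.
    """
    adj = {}
    for u, v, channel, cost in links:
        adj.setdefault(u, []).append((v, channel, cost))
        adj.setdefault(v, []).append((u, channel, cost))

    best = {(source, None): 0.0}
    heap = [(0.0, 0, source, None)]  # (cost, tie-breaker, node, last channel)
    counter = 1
    while heap:
        cost, _, node, last = heapq.heappop(heap)
        if node == target:
            return cost
        if cost > best.get((node, last), math.inf):
            continue  # stale heap entry
        for nxt, channel, link_cost in adj.get(node, []):
            if channel == last:
                continue  # would violate the channel-discontinuity constraint
            new_cost = cost + link_cost
            if new_cost < best.get((nxt, channel), math.inf):
                best[(nxt, channel)] = new_cost
                heapq.heappush(heap, (new_cost, counter, nxt, channel))
                counter += 1
    return None


def stretch_factor(theta):
    # t = (1 - 2 sin(theta/2))^(-2); only meaningful while 2 sin(theta/2) < 1.
    base = 1.0 - 2.0 * math.sin(theta / 2.0)
    if base <= 0.0:
        raise ValueError("theta too large for the bound to apply")
    return base ** -2
```

For example, with links a-b on channel 1 (cost 1) and b-c on channels 1 (cost 1) and 2 (cost 5), the route from a to c must take the channel-2 link, giving a total cost of 6 even though a cheaper link exists.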
Parallel Event Analysis Under Unix
NASA Astrophysics Data System (ADS)
Looney, S.; Nilsson, B. S.; Oest, T.; Pettersson, T.; Ranjard, F.; Thibonnier, J.-P.
The ALEPH experiment at LEP, the CERN CN division and Digital Equipment Corp. have, in a joint project, developed a parallel event analysis system. The parallel physics code is identical to ALEPH's standard analysis code, ALPHA; only the organisation of input/output is changed. The user may switch between sequential and parallel processing by simply changing one input "card". The initial implementation runs on an 8-node DEC 3000/400 farm, using the PVM software, and exhibits a near-perfect speed-up linearity, reducing the turn-around time by a factor of 8.
RAMA: A file system for massively parallel computers
NASA Technical Reports Server (NTRS)
Miller, Ethan L.; Katz, Randy H.
1993-01-01
This paper describes a file system design for massively parallel computers which makes very efficient use of a few disks per processor. This overcomes the traditional I/O bottleneck of massively parallel machines by storing the data on disks within the high-speed interconnection network. In addition, the file system, called RAMA, requires little inter-node synchronization, removing another common bottleneck in parallel processor file systems. Support for a large tertiary storage system can easily be integrated into the file system; in fact, RAMA runs most efficiently when tertiary storage is used.
Extraordinary flood response of a small urban watershed to short-duration convective rainfall
Smith, J.A.; Miller, A.J.; Baeck, M.L.; Nelson, P.A.; Fisher, G.T.; Meierdiercks, K.L.
2005-01-01
The 9.1 km^2 Moores Run watershed in Baltimore, Maryland, experiences floods with unit discharge peaks exceeding 1 m^3 s^-1 km^-2 12 times yr^-1, on average. Few, if any, drainage basins in the continental United States have a higher frequency. A thunderstorm system on 13 June 2003 produced the record flood peak (13.2 m^3 s^-1 km^-2) during the 6-yr stream gauging record of Moores Run. In this paper, the hydrometeorology, hydrology, and hydraulics of extreme floods in Moores Run are examined through analyses of the 13 June 2003 storm and flood, as well as other major storm and flood events during the 2000-03 time period. The 13 June 2003 flood, like most floods in Moores Run, was produced by an organized system of thunderstorms. Analyses of the 13 June 2003 storm, which are based on volume scan reflectivity observations from the Sterling, Virginia, WSR-88D radar, are used to characterize the spatial and temporal variability of flash flood producing rainfall. Hydrology of flood response in Moores Run is characterized by highly efficient concentration of runoff through the storm drain network and relatively low runoff ratios. A detailed survey of high-water marks for the 13 June 2003 flood is used, in combination with analyses based on a 2D, depth-averaged open channel flow model (TELEMAC 2D) to examine hydraulics of the 13 June 2003 flood. Hydraulic analyses are used to examine peak discharge estimates for the 13 June flood peak, propagation of flood waves in the Moores Run channel, and 2D flow features associated with channel and floodplain geometry. © 2005 American Meteorological Society.
NASA Astrophysics Data System (ADS)
Azadegan, B.
2013-03-01
The presented Mathematica code is an efficient tool for simulation of planar channeling radiation spectra of relativistic electrons channeled along major crystallographic planes of a diamond-structure single crystal. The program is based on the quantum theory of channeling radiation which has been successfully applied to study planar channeling at electron energies between 10 and 100 MeV. Continuum potentials for different planes of diamond, silicon and germanium single crystals are calculated using the Doyle-Turner approximation to the atomic scattering factor and taking thermal vibrations of the crystal atoms into account. Numerical methods are applied to solve the one-dimensional Schrödinger equation. The code is designed to calculate the electron wave functions, transverse electron states in the planar continuum potential, transition energies, line widths of channeling radiation and depth dependencies of the population of quantum states. Finally the spectral distribution of spontaneously emitted channeling radiation is obtained. The simulation of radiation spectra considerably facilitates the interpretation of experimental data. Catalog identifier: AEOH_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEOH_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 446 No. of bytes in distributed program, including test data, etc.: 209805 Distribution format: tar.gz Programming language: Mathematica. Computer: Platforms on which Mathematica is available. Operating system: Operating systems on which Mathematica is available. RAM: 1 MB Classification: 7.10. Nature of problem: Planar channeling radiation is emitted by relativistic charged particles while traversing a single crystal in a direction parallel to a crystallographic plane.
Channeling is modeled as the motion of charged particles in a continuous planar potential which is formed by the spatially and thermally averaged action of the individual electrostatic potentials of the crystal atoms of the corresponding plane. Classically, the motion of channeled particles through the crystal resembles transverse oscillations, which are the source of radiation emission. For electrons of energy less than 100 MeV considered here, planar channeling has to be treated quantum mechanically by a one-dimensional Schrödinger equation for the transverse motion. Hence, this motion of the channeled electrons is restricted to a number of discrete (bound) channeling states in the planar continuum potential, and the emission of channeling radiation is caused by spontaneous electron transitions between these eigenstates. Due to relativistic and Doppler effects, the energy of the emitted photons directed into a narrow forward cone is typically shifted up by about three to five orders of magnitude. Consequently, the observed energy spectrum of channeling radiation is characterized by a number of radiation lines in the energy domain of hard X-rays. Channeling radiation may, therefore, be applied as an intense, tunable, quasi-monochromatic X-ray source. Solution method: The problem consists in finding the electron wave function for the planar continuum potential. Both the wave functions and corresponding energies of channeling states solve the Schrödinger equation of transverse electron motion. In the framework of the so-called many-beam formalism, solving the Schrödinger equation reduces to an eigenvector-eigenvalue problem of a Hermitian matrix. For this, the program employs the mathematical tools available in the commercial computation software Mathematica. The electric field of the atomic planes in the crystal forces dipole oscillations of the channeled charged particles.
In the quantum mechanical approach, the dipole approximation is also valid for spontaneous transitions between bound states. The transition strength for dedicated states depends on the magnitude of the corresponding dipole matrix element. The photon energy correlates with the particle energy, and the spectral width of radiation lines is a function of the lifetimes of the channeling states. Running time: The program has been tested on a PC with an AMD Athlon X2 245 processor (2.9 GHz) and 2 GB RAM. Depending on electron energy and crystal thickness, the running time of the program amounts to 5-10 min.
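The reduction to a Hermitian eigenvalue problem mentioned in the solution method can be written out as follows. This is the textbook form of the many-beam expansion, given here as a sketch; the symbols (interplanar spacing d_p, Fourier coefficients v_n, Lorentz factor γ, observation angle θ) are notation chosen for this illustration and need not match the program's internal conventions.

```latex
% Transverse Schroedinger equation with relativistic mass m\gamma
-\frac{\hbar^{2}}{2m\gamma}\,\frac{d^{2}\psi}{dx^{2}} + V(x)\,\psi(x) = E\,\psi(x)

% Periodic planar potential and Bloch-type wave function, g = 2\pi/d_p
V(x) = \sum_{n} v_{n}\, e^{i n g x}, \qquad
\psi(x) = e^{i k x} \sum_{n} c_{n}\, e^{i n g x}

% Inserting the expansions and comparing coefficients of e^{i(k+ng)x}
\sum_{m} \left[ \frac{\hbar^{2}(k + n g)^{2}}{2m\gamma}\,\delta_{nm} + v_{n-m} \right] c_{m} = E\, c_{n}

% The matrix is Hermitian because V(x) is real: v_{-n} = v_{n}^{*}.
% Photon energy for a transition i -> f observed at angle \theta to the beam:
\hbar\omega \approx \frac{2\gamma^{2}\,(E_{i} - E_{f})}{1 + \gamma^{2}\theta^{2}}
```

Truncating the sums at a finite number of beams gives a finite Hermitian matrix whose eigenvalues are the transverse energies and whose eigenvectors are the coefficients c_n. The factor 2γ² in the last line is the origin of the upward shift of the photon energy described in the abstract.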
Three dimensional simulations of viscous folding in diverging microchannels
NASA Astrophysics Data System (ADS)
Xu, Bingrui; Chergui, Jalel; Shin, Seungwon; Juric, Damir
2016-11-01
Three dimensional simulations on the viscous folding in diverging microchannels reported by Cubaud and Mason are performed using the parallel code BLUE for multi-phase flows. The more viscous liquid L1 is injected into the channel from the center inlet, and the less viscous liquid L2 from two side inlets. Liquid L1 takes the form of a thin filament due to hydrodynamic focusing in the long channel that leads to the diverging region. The thread then becomes unstable to a folding instability, due to the longitudinal compressive stress applied to it by the diverging flow of liquid L2. We performed a parameter study in which the flow rate ratio, the viscosity ratio, the Reynolds number, and the shape of the channel were varied relative to a reference model. In our simulations, the cross section of the thread produced by focusing is elliptical rather than circular. The initial folding axis can be either parallel or perpendicular to the narrow dimension of the chamber. In the former case, the folding slowly transforms via twisting to perpendicular folding, or it may remain parallel. The direction of folding onset is determined by the velocity profile and the elliptical shape of the thread cross section in the channel that feeds the diverging part of the cell.
Response Errors Explain the Failure of Independent-Channels Models of Perception of Temporal Order
García-Pérez, Miguel A.; Alcalá-Quintana, Rocío
2012-01-01
Independent-channels models of perception of temporal order (also referred to as threshold models or perceptual latency models) have been ruled out because two formal properties of these models (monotonicity and parallelism) are not borne out by data from ternary tasks in which observers must judge whether stimulus A was presented before, after, or simultaneously with stimulus B. These models generally assume that observed responses are authentic indicators of unobservable judgments, but blinks, lapses of attention, or errors in pressing the response keys (maybe, but not only, motivated by time pressure when reaction times are being recorded) may make observers misreport their judgments or simply guess a response. We present an extension of independent-channels models that considers response errors and we show that the model produces psychometric functions that do not satisfy monotonicity and parallelism. The model is illustrated by fitting it to data from a published study in which the ternary task was used. The fitted functions describe very accurately the absence of monotonicity and parallelism shown by the data. These characteristics of empirical data are thus consistent with independent-channels models when response errors are taken into consideration. The implications of these results for the analysis and interpretation of temporal order judgment data are discussed. PMID:22493586
Aberration compensation of an ultrasound imaging instrument with a reduced number of channels.
Jiang, Wei; Astheimer, Jeffrey P; Waag, Robert C
2012-10-01
Focusing and imaging qualities of an ultrasound imaging system that uses aberration correction were experimentally investigated as functions of the number of parallel channels. Front-end electronics that consolidate signals from multiple physical elements can be used to lower hardware and computational costs by reducing the number of parallel channels. However, the signals from sparse arrays of synthetic elements yield poorer aberration estimates. In this study, aberration estimates derived from synthetic arrays of varying element sizes are evaluated by comparing compensated receive focuses, compensated transmit focuses, and compensated B-scan images of a point target and a cyst phantom. An array of 80 x 80 physical elements with a pitch of 0.6 x 0.6 mm was used for all of the experiments and the aberration was produced by a phantom selected to mimic propagation through the abdominal wall. The results show that aberration correction derived from synthetic arrays with pitches that have a diagonal length smaller than 70% of the correlation length of the aberration yields focuses and images of approximately the same quality. This connection between correlation length of the aberration and synthetic element size provides a guideline for determining the number of parallel channels that are required when designing imaging systems that employ aberration correction.
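The guideline in the last two sentences can be turned into a small calculation. The sketch below assumes that synthetic elements are formed by consolidating square k x k blocks of physical elements, which is an assumption made for illustration; the correlation length used in the usage note is a hypothetical value, not one reported in the study.

```python
import math


def channels_for_correlation_length(corr_length_mm, n_side=80, pitch_mm=0.6,
                                    fraction=0.7):
    """Largest square consolidation factor k whose synthetic-element diagonal
    stays below `fraction` of the aberration correlation length.

    Returns (k, number_of_parallel_channels).
    """
    limit = fraction * corr_length_mm
    k = 1
    # Diagonal of a k x k block of square elements is k * pitch * sqrt(2).
    while (k + 1) * pitch_mm * math.sqrt(2.0) < limit and k + 1 <= n_side:
        k += 1
    channels = math.ceil(n_side / k) ** 2
    return k, channels
```

With a hypothetical correlation length of 6 mm, the diagonal limit is 4.2 mm, so blocks of 4 x 4 elements (diagonal about 3.4 mm) qualify while 5 x 5 blocks (about 4.24 mm) do not, and the 6400 physical elements reduce to 400 parallel channels.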
Results of closed cycle MHD power generation tests with a helium-cesium working fluid
NASA Technical Reports Server (NTRS)
Sovie, R. J.
1977-01-01
The cross-sectional dimensions of the MHD channel in the NASA Lewis closed loop facility have been reduced to 3.8 x 11.4 cm. Tests were run in this channel using a helium-cesium working fluid at stagnation pressures of 1.6 x 10 to the 5th N/sq m, stagnation temperatures of 2000-2060 K and an entrance Mach number of 0.36. In these tests Faraday open circuit voltages of 200 V were measured which correspond to a Faraday field of 1750 V/m. Power generation tests were run for different groups of electrode configurations and channel lengths. Hall fields up to 1450 V/m were generated. Power extraction per electrode of 183 W and power densities of 1.7 MW/cu m have been obtained. A total power output of 2 kW was generated for tests with 14 electrodes. The power densities obtained in this channel represent a factor of 3 improvement over those reported for the m = 0.2 channel at the last EAM Symposium.
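The quoted voltage, field and channel dimension can be checked against one another. The short calculation below assumes that the Faraday voltage is measured across the 11.4 cm dimension of the channel, which the abstract does not state explicitly.

```latex
% Faraday field from the open-circuit voltage and the assumed electrode gap h
E_{F} \approx \frac{V_{oc}}{h} = \frac{200\ \mathrm{V}}{0.114\ \mathrm{m}}
      \approx 1.75 \times 10^{3}\ \mathrm{V/m}

% Average power per electrode in the 14-electrode tests
\bar{P} = \frac{2\ \mathrm{kW}}{14} \approx 143\ \mathrm{W}
```

Under that assumption the result agrees with the reported 1750 V/m, and the average of about 143 W per electrode lies below the quoted per-electrode value of 183 W, as expected if the latter is the best single-electrode result.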
Numerical investigation of heat transfer in parallel channels with water at supercritical pressure.
Shitsi, Edward; Kofi Debrah, Seth; Yao Agbodemegbe, Vincent; Ampomah-Amoako, Emmanuel
2017-11-01
Thermal phenomena such as heat transfer enhancement, heat transfer deterioration, and flow instability observed at supercritical pressures as a result of fluid property variations have the potential to affect the safety of design and operation of the Supercritical Water-cooled Reactor (SCWR), and also challenge the capabilities of both heat transfer correlations and Computational Fluid Dynamics (CFD) physical models. These phenomena observed at supercritical pressures need to be thoroughly investigated. An experimental study was carried out by Xi to investigate flow instability in parallel channels at supercritical pressures under different mass flow rates, pressures, and axial power shapes. Experimental data on flow instability at the inlet of the heated channels were obtained but no heat transfer data along the axial length were obtained. This numerical study used the 3D numerical tool STAR-CCM+ to investigate heat transfer at supercritical pressures along the axial lengths of the parallel channels with water ahead of experimental data. A homogeneous axial power shape (HAPS) was adopted and the heating powers adopted in this work were below the experimental threshold heating powers obtained for HAPS by Xi. The results show that the Fluid Centre-line Temperature (FCLT) increased linearly below and above the pseudo-critical temperature (PCT) region, but flattened at the PCT region for all the system parameters considered. The inlet temperature, heating power, pressure, gravity and mass flow rate have effects on WT (wall temperature) values in the NHT (normal heat transfer), EHT (enhanced heat transfer), DHT (deteriorated heat transfer) and recovery from DHT regions. While variation of all other system parameters in the EHT and PCT regions showed no significant difference in the WT and FCLT values respectively, the WT and FCLT values respectively increased with pressure in these regions. For most of the system parameters considered, the FCLT and WT values obtained in the two channels were nearly the same.
The numerical study was not quantitatively compared with experimental data along the axial lengths of the parallel channels, but it was observed that the numerical tool STAR-CCM+ was able to capture the trends for the NHT, EHT, DHT and recovery from DHT regions. The heating powers used for the various simulations were below the experimentally observed threshold heating powers, but heat transfer deterioration (HTD) was observed, confirming the previous finding that HTD could occur before the occurrence of unstable behavior at supercritical pressures. For purposes of comparing the results of numerical simulations with experimental data, the heat transfer data on temperature oscillations obtained at the outlet of the heated channels and instability boundary results obtained at the inlet of the heated channels were compared. The numerical results agree quite well with the experimental data. This work calls for provision of experimental data on heat transfer in parallel channels at supercritical pressures for validation of similar numerical studies.
Proposal for massively parallel data storage system
NASA Technical Reports Server (NTRS)
Mansuripur, M.
1992-01-01
An architecture for integrating large numbers of data storage units (drives) to form a distributed mass storage system is proposed. The network of interconnected units consists of nodes and links. At each node there resides a controller board, a data storage unit and, possibly, a local/remote user-terminal. The links (twisted-pair wires, coax cables, or fiber-optic channels) provide the communications backbone of the network. There is no central controller for the system as a whole; all decisions regarding allocation of resources, routing of messages and data-blocks, creation and distribution of redundant data-blocks throughout the system (for protection against possible failures), frequency of backup operations, etc., are made locally at individual nodes. The system can handle as many user-terminals as there are nodes in the network. Various users compete for resources by sending their requests to the local controller-board and receiving allocations of time and storage space. In principle, each user can have access to the entire system, and all drives can be running in parallel to service the requests for one or more users. The system is expandable up to a maximum number of nodes, determined by the number of routing-buffers built into the controller boards. Additional drives, controller-boards, user-terminals, and links can be simply plugged into an existing system in order to expand its capacity.
Generating unstructured nuclear reactor core meshes in parallel
Jain, Rajeev; Tautges, Timothy J.
2014-10-24
Recent advances in supercomputers and parallel solver techniques have enabled users to run large simulation problems using millions of processors. Techniques for multiphysics nuclear reactor core simulations are under active development in several countries. Most of these techniques require large unstructured meshes that can be hard to generate on a standalone desktop computer because of high memory requirements, limited processing power, and other complexities. We have previously reported on a hierarchical lattice-based approach for generating reactor core meshes. Here, we describe efforts to exploit coarse-grained parallelism during reactor assembly and reactor core mesh generation processes. We highlight several reactor core examples including a very high temperature reactor, a full-core model of the Korean MONJU reactor, a ¼ pressurized water reactor core, the fast reactor Experimental Breeder Reactor-II core with a XX09 assembly, and an advanced breeder test reactor core. The times required to generate large mesh models, along with speedups obtained from running these problems in parallel, are reported. A graphical user interface to the tools described here has also been developed.
Roofline model toolkit: A practical tool for architectural and program analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lo, Yu Jung; Williams, Samuel; Van Straalen, Brian
We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated architectures. This paper focuses on the processor architecture characterization engine, a collection of portable instrumented microbenchmarks implemented with Message Passing Interface (MPI), and OpenMP used to express thread-level parallelism. These benchmarks are specialized to quantify the behavior of different architectural features. Compared to previous work on performance characterization, these microbenchmarks focus on capturing the performance of each level of the memory hierarchy, along with thread-level parallelism, instruction-level parallelism and explicit SIMD parallelism, measured in the context of the compilers and run-time environments. We also measure sustained PCIe throughput with four GPU memory managed mechanisms. By combining results from the architecture characterization with the Roofline model based solely on architectural specifications, this work offers insights for performance prediction of current and future architectures and their software systems. To that end, we instrument three applications and plot their resultant performance on the corresponding Roofline model when run on a Blue Gene/Q architecture.
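For readers unfamiliar with the Roofline model itself, its core bound is that attainable performance is the smaller of the compute peak and the product of memory bandwidth and arithmetic intensity. The sketch below illustrates that bound only; it is not part of the Roofline Toolkit, and the function names and all numbers in the usage note are hypothetical rather than measured values for any machine.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Roofline bound: min(compute peak, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (flops/byte) above which a kernel is compute-bound."""
    return peak_gflops / bandwidth_gbs


def rooflines(peak_gflops, bandwidths_gbs, intensity_flops_per_byte):
    """One bound per memory level, e.g. {"L1": ..., "DRAM": ...} in GB/s."""
    return {level: attainable_gflops(peak_gflops, bw, intensity_flops_per_byte)
            for level, bw in bandwidths_gbs.items()}
```

With a hypothetical peak of 200 GFLOP/s and 30 GB/s of DRAM bandwidth, a kernel performing 2 flops per byte is limited to 60 GFLOP/s, while one performing 10 flops per byte reaches the compute ceiling. Measuring each level of the memory hierarchy, as the microbenchmarks do, yields one such ceiling per level.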
Parallelization of a hydrological model using the message passing interface
Wu, Yiping; Li, Tiejian; Sun, Liqun; Chen, Ji
2013-01-01
With increasing knowledge about natural processes, hydrological models such as the Soil and Water Assessment Tool (SWAT) are becoming larger and more complex, with increasing computation time. Additionally, other procedures such as model calibration, which may require thousands of model iterations, can increase running time and thus further hinder rapid modeling and analysis. Using the widely-applied SWAT as an example, this study demonstrates how to parallelize a serial hydrological model in a Windows® environment using a parallel programming technology, the Message Passing Interface (MPI). With a case study, we derived the optimal values for the two parameters (the number of processes and the corresponding percentage of work to be distributed to the master process) of the parallel SWAT (P-SWAT) on an ordinary personal computer and a work station. Our study indicates that model execution time can be reduced by 42%–70% (or a speedup of 1.74–3.36) using multiple processes (two to five) with a proper task-distribution scheme (between the master and slave processes). Although the computation time becomes lower with an increasing number of processes (from two to five), this gain diminishes because of the accompanying increase in demand for message passing between the master and all slave processes. Our case study demonstrates that the P-SWAT with a five-process run may reach the maximum speedup, and the performance can be quite stable (fairly independent of project size). Overall, the P-SWAT can help reduce the computation time substantially for an individual model run, manual and automatic calibration procedures, and optimization of best management practices. In particular, the parallelization method we used and the scheme for deriving the optimal parameters in this study can be valuable and easily applied to other hydrological or environmental models.
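The two ways of reporting the gain, a percentage reduction in execution time and a speedup factor, are related by reduction = 1 - 1/speedup. The short sketch below checks the quoted figures against that relation; the pairing of 1.74 with two processes and 3.36 with five processes in the usage note is an assumption read from the abstract, not a statement from the paper.

```python
def time_reduction(speedup):
    """Fraction of serial run time saved for a given speedup."""
    return 1.0 - 1.0 / speedup


def parallel_efficiency(speedup, processes):
    """Speedup per process; 1.0 would be ideal linear scaling."""
    return speedup / processes
```

A speedup of 1.74 corresponds to a reduction of about 42.5% and a speedup of 3.36 to about 70.2%, matching the reported range. If those speedups belong to the two-process and five-process runs respectively, the efficiency falls from roughly 0.87 to 0.67, which is consistent with the growing message-passing overhead described above.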
33 CFR 110.78 - Sturgeon Bay, Sturgeon Bay, Wis.
Code of Federal Regulations, 2010 CFR
2010-07-01
... channel edge; thence 222°, 500 feet; thence 300°, 1,200 feet; thence 042°, 500 feet to the point of... extended; thence south 530 feet to a point 100 feet from the northern edge of the channel; thence southeasterly 2,350 feet along a line parallel to the northern edge of the channel to a point on the east line...
The enigmatic ultra-long run-out of seafloor density driven flows
NASA Astrophysics Data System (ADS)
Dorrell, R. M.
2017-12-01
Dilute, particulate-laden, density-driven flows - turbidity currents - are a predominant mechanism for transporting sediment from source to sink in deep marine environments. These flows sculpt channels on the seafloor and, as evidenced by a wealth of bathymetric data, can travel for >1000 km, forming some of the largest sedimentary landforms on the planet. For turbidity currents to travel such large distances, sediment must be self-maintained in suspension, i.e., be in a state of autosuspension. It has been shown that such self-maintained sediment suspensions can only occur whilst inertial forces are greater than gravitational forces, entailing supercritical flow. This conclusion is paradoxical, as inertia-dominated flows rapidly entrain fluid, thereby thickening and slowing to become subcritical. However, current theory can only truly be applied to the proximal upper slope regions of seafloor channels where incised flows are fully confined. This contrasts with the distal reaches of long run-out turbidity current systems, where the flow is only partially confined through self-channelization. Here it is shown that overspill of partially confined flow has a significant effect on the hydro- and morphodynamics of turbidity current systems. A new model is derived that shows that channel overspill acts to negate the effects of ambient fluid entrainment: a dynamic balance that limits increases in flow depth and maintains supercritical flow throughout the channel. In the new model mass, momentum and energy conservation is modulated by flow overspill onto channel banks, necessarily requiring description of the vertical structure of the flow. Analysis of continuously stratified steady state flow dynamics shows that the integration of overspill and stratification is necessary to enable maintained autosuspension and thus predict the ultra-long run-out of turbidity currents.
Stability of parallel electroosmotic flow subject to an axial modulated electric field
NASA Astrophysics Data System (ADS)
Suresh, Vinod; Homsy, George
2001-11-01
The stability of parallel electroosmotic flow in a micro-channel subjected to an AC electric field is studied. A spatially uniform time harmonic electric field is applied along the length of a two-dimensional micro-channel containing a dilute electrolytic solution, resulting in a time periodic parallel flow. The top and bottom walls of the channel are maintained at constant potential. The base state ion concentrations and double layer potential are determined using the Poisson-Boltzmann equation in the Debye-Hückel approximation. Experiments by other workers (Santiago et al., unpublished) have shown that such a system can exhibit instabilities that take the form of mixing motion occurring in the bulk flow outside the double layer. It is shown that such instabilities can potentially result from the coupling of disturbances in the ion concentrations or electric potential to the base state velocity or ion concentrations, respectively. The stability boundary of the system is determined using Floquet theory and its dependence on the modulation frequency and amplitude of the axial electric field is studied.
NASA Astrophysics Data System (ADS)
Fradeneck, Austen; Kimber, Mark
2017-11-01
The present study evaluates the effectiveness of current RANS and LES models in simulating natural convection in high-aspect-ratio parallel plate channels. The geometry under consideration is based on a simplification of the coolant and bypass channels in the very high-temperature gas reactor (VHTR). Two thermal conditions are considered, asymmetric and symmetric wall heating with an applied heat flux to match Rayleigh numbers experienced in the VHTR during a loss of flow accident (LOFA). RANS models are compared to analogous high-fidelity LES simulations. Preliminary results demonstrate the efficacy of the low-Reynolds number k-ε formulations and their enhancement to the standard form and Reynolds stress transport model in terms of calculating the turbulence production due to buoyancy and overall mean flow variables.
Wang, Min; Tian, Yun
2018-01-01
The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve the runtime and edge detection performance of the Canny operator, in this paper, we propose a parallel design and implementation for an Otsu-optimized Canny operator using a MapReduce parallel programming model that runs on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator's dual threshold and improve the edge detection performance, while the MapReduce parallel programming model facilitates parallel processing for the Canny operator to solve the processing speed and communication cost problems that occur when the Canny edge detection algorithm is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images. Overall, our approach speeds up processing by approximately 3.4 times when handling large-scale datasets, which demonstrates the obvious superiority of our method. The algorithm proposed in this study demonstrates both better edge detection performance and improved time performance. PMID:29861711
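The idea of using Otsu's method to choose Canny's dual threshold can be sketched as follows. This is only an illustration, not the authors' implementation: the function names are hypothetical, and setting the low threshold to half of the Otsu value (`low_ratio=0.5`) is a common heuristic assumed here, not a rule stated in the abstract. The MapReduce distribution step is not shown.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance
    of the two classes {value <= t} and {value > t}."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue        # lower class still empty
        if w1 == 0:
            break           # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def canny_thresholds(pixels, low_ratio=0.5):
    """Derive (low, high) hysteresis thresholds from the Otsu level.
    low_ratio is an assumed heuristic, not taken from the paper."""
    high = otsu_threshold(pixels)
    return low_ratio * high, high
```

In practice the input would be the gradient-magnitude image (quantized to integer levels) rather than raw intensities, and each mapper in a MapReduce job could apply these two functions to its own image independently.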
Digital tomosynthesis mammography using a parallel maximum-likelihood reconstruction method
NASA Astrophysics Data System (ADS)
Wu, Tao; Zhang, Juemin; Moore, Richard; Rafferty, Elizabeth; Kopans, Daniel; Meleis, Waleed; Kaeli, David
2004-05-01
A parallel reconstruction method, based on an iterative maximum likelihood (ML) algorithm, is developed to provide fast reconstruction for digital tomosynthesis mammography. Tomosynthesis mammography acquires 11 low-dose projections of a breast by moving an x-ray tube over a 50° angular range. In parallel reconstruction, each projection is divided into multiple segments along the chest-to-nipple direction. Using the 11 projections, segments located at the same distance from the chest wall are combined to compute a partial reconstruction of the total breast volume. The shape of the partial reconstruction forms a thin slab, angled toward the x-ray source at a projection angle 0°. The reconstruction of the total breast volume is obtained by merging the partial reconstructions. The overlap region between neighboring partial reconstructions and neighboring projection segments is utilized to compensate for the incomplete data at the boundary locations present in the partial reconstructions. A serial execution of the reconstruction is compared to a parallel implementation, using clinical data. The serial code was run on a PC with a single PentiumIV 2.2GHz CPU. The parallel implementation was developed using MPI and run on a 64-node Linux cluster using 800MHz Itanium CPUs. The serial reconstruction for a medium-sized breast (5cm thickness, 11cm chest-to-nipple distance) takes 115 minutes, while a parallel implementation takes only 3.5 minutes. The reconstruction time for a larger breast using a serial implementation takes 187 minutes, while a parallel implementation takes 6.5 minutes. No significant differences were observed between the reconstructions produced by the serial and parallel implementations.
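The timings quoted above can be turned into speed-up figures with a few lines of arithmetic. The sketch below assumes all 64 nodes were used with one worker each, which the abstract does not state explicitly; because the serial run used a different CPU (Pentium IV) from the cluster (Itanium), the "efficiency" value is only nominal.

```python
def speedup_and_efficiency(serial_min, parallel_min, workers):
    """Return (speed-up, nominal efficiency) from wall-clock times in minutes."""
    speedup = serial_min / parallel_min
    return speedup, speedup / workers


# Times in minutes taken from the abstract; 64 workers is an assumption.
cases = {"medium-sized breast": (115.0, 3.5), "larger breast": (187.0, 6.5)}
for label, (t_serial, t_parallel) in cases.items():
    s, e = speedup_and_efficiency(t_serial, t_parallel, workers=64)
    print(f"{label}: speed-up {s:.1f}x, nominal efficiency {e:.0%}")
```

The two cases work out to speed-ups of roughly 33x and 29x, respectively.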
Gellis, Allen C.; Myers, Michael; Noe, Gregory; Hupp, Cliff R.; Shenk, Edward; Myers, Luke
2017-01-01
Determining erosion and deposition rates in urban-suburban settings and how these processes are affected by large storms is important to understanding geomorphic processes in these landscapes. Sediment yields in the suburban and urban Upper Difficult Run are among the highest ever recorded in the Chesapeake Bay watershed, ranging from 161 to 376 Mg/km²/y. Erosion and deposition of streambanks, channel bed, and bars and deposition of floodplains were monitored between 1 March 2010 and 18 January 2013 in Upper Difficult Run, Virginia, USA. We documented the effects of two large storms, Tropical Storm Lee (September 2011), a 100-year event, and Super Storm Sandy (October 2012), a 5-year event, on channel erosion and deposition. Temporal and spatial variability in erosion and deposition rates for all geomorphic features is an important conclusion of this study. Tropical Storm Lee was an erosive event, where erosion occurred on 82% of all streambanks and where 88% of streambanks that were aggrading before Tropical Storm Lee became erosional. Statistical analysis indicated that drainage area explains linear changes (cm/y) in eroding streambanks and that channel top width explains cross-sectional area changes (cm²/y) in eroding streambanks and floodplain deposition (mm/y). A quasi-sediment budget constructed for the study period using the streambanks, channel bed, channel bars, and floodplain measurements underestimated the measured suspended-sediment load by 61% (2130 Mg/y). Underestimation of the sediment load may be caused by measurement errors and by contributions from upland sediment sources, which were not measured but estimated at 36% of the gross input of sediment. Eroding streambanks contributed 42% of the gross input of sediment and accounted for 70% of the measured suspended-sediment load.
Similar to other urban watersheds, the large percentage of impervious area in Difficult Run and direct runoff of precipitation lead to increased streamflow and streambank erosion. This study emphasizes the importance of streambanks in urban-suburban sediment budgets but also suggests that other sediment sources, such as upland sources, which were not measured in this study, can be an important source of sediment.
Stream channels of the Upper San Pedro with percent difference between results from two SWAT simulations run through AGWA: one using the 1973 NALC landcover for model parameterization, and the other using the 1997 NALC landcover.
An 81.6 μW FastICA processor for epileptic seizure detection.
Yang, Chia-Hsiang; Shih, Yi-Hsin; Chiueh, Herming
2015-02-01
To improve the performance of epileptic seizure detection, independent component analysis (ICA) is applied to multi-channel signals to separate artifacts and signals of interest. FastICA is an efficient algorithm to compute ICA. To reduce the energy dissipation, eigenvalue decomposition (EVD) is utilized in the preprocessing stage to reduce the convergence time of iterative calculation of ICA components. EVD is computed efficiently through an array structure of processing elements running in parallel. An area-efficient EVD architecture is realized by leveraging the approximate Jacobi algorithm, leading to a 77.2% area reduction. By choosing a proper memory element and a reduced wordlength, the power and area of storage memory are reduced by 95.6% and 51.7%, respectively. The chip area is minimized through fixed-point implementation and architectural transformations. Given a latency constraint of 0.1 s, an 86.5% area reduction is achieved compared to the direct-mapped architecture. Fabricated in 90 nm CMOS, the core area of the chip is 0.40 mm². The FastICA processor, part of an integrated epileptic control SoC, dissipates 81.6 μW at 0.32 V. The computation delay of a frame of 256 samples for 8 channels is 84.2 ms. Compared to prior work, 0.5% power dissipation, 26.7% silicon area, and 3.4× computation speedup are achieved. The performance of the chip was verified using a human dataset.
Synthesis and crystal structure of a novel pentaborate, Na{sub 3}ZnB{sub 5}O{sub 10}
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen Xuean; Li Ming; Chang Xinan
A novel ternary borate, trisodium zinc pentaborate, Na{sub 3}ZnB{sub 5}O{sub 10}, has been prepared by solid-state reaction at temperature below 750 °C. The single-crystal X-ray structural analysis showed that Na{sub 3}ZnB{sub 5}O{sub 10} crystallizes in the monoclinic space group P2{sub 1}/n with a=6.6725(7)A, b=18.1730(10)A, c=7.8656(9)A, {beta}=114.604(6){sup o}, Z=4. It represents a new structure type in which double ring [B{sub 5}O{sub 10}]{sup 5-} building units are bridged by ZnO{sub 4} tetrahedra through common O atoms to form a two-dimensional {sub {approx}}{sup 2}[ZnB{sub 5}O{sub 10}]{sup 3-}-layer that affords one-dimensional channels running parallel to the [101] direction. Symmetry-center related {sub {approx}}{sup 2}[ZnB{sub 5}O{sub 10}]{sup 3-} layers are stacked along the b-axis, with the interlayer void spaces and intralayer open channels filled by Na{sup +} cations to balance charge. The IR spectrum further confirms the presence of both BO{sub 3} and BO{sub 4} groups and UV-vis diffuse reflectance spectrum shows a band gap of about 3.2eV.
NASA Astrophysics Data System (ADS)
Magilligan, F. J.; Fisher, B.; Nislow, K.; Wright, J.; Mackey, G.
2006-12-01
In contrast to watersheds in other parts of the US, little is known about the function of wood in New England rivers, especially in Downeast Maine, which possesses the few remaining wild runs of Atlantic Salmon in the US. Rivers in this region have been heavily affected by historical and contemporary land use disturbance, especially hillslope and riparian logging, which combine to limit the supply of large woody debris (LWD). Results from a multi-basin inventory representing over 40 river km of 6 drainages across Downeast Maine indicate LWD mean loadings of approximately 90 pieces per km, with less than 1% having diameters > 50 cm and the rest split between 60% small (10-20 cm) and 39% medium (20-50 cm) sizes. Because of its small size, most of this LWD is oriented parallel (~37%) or downstream (~33%), and channel-spanning stable wood pieces/jams are rare. Salmonid populations appear to depend on overwintering habitat, and unlike other regions where LWD serves an important role in pool formation, LWD in Downeast Maine functions primarily as sites of critical sediment storage, thus reducing overall channel embeddedness. To capture this function, we cored an array of LWD-related sediment wedges and used the fallout radionuclides, 7Be and 210Pb, to estimate sediment residence times. Results indicate that LWD is an important sediment sink having residence times ranging from the individual event to > several years.
Riparian Vegetation Effects on Near-Bank Turbulence During Overbank Flows: A Flume Experiment
NASA Astrophysics Data System (ADS)
McBride, M.; Thompson, D. M.; Owen, T. E.; Pearce, A. R.; Hession, W. C.; Rizzo, D.
2005-12-01
Measurements from a fixed-bed, Froude-scaled hydraulic model of a stream in northeastern Vermont demonstrated the importance of riparian vegetation effects on near-bank turbulence during overbank flood events. The prototype stream, a tributary to Sleepers River, increased in channel width within the last 40 years in response to passive reforestation of its riparian zone. Previous research has found that reaches of small streams with forested riparian zones are commonly wider than adjacent reaches with non-forested, or meadow, vegetation; however, the driving mechanisms for this morphologic difference are not fully explained. Flume experiments were performed to investigate near-bank turbulence as a mechanism for channel widening in response to reforestation. A 1:5 scale, simplified model of half a channel and its adjacent floodplain was constructed within a 6 m long recirculating flume. The test region was 3.7 m long and 0.9 m wide and oriented with the channel centerline at the flume wall. The channel bed slope was fixed at 0.03, and experiments were run at three discharges: 30, 33, and 36 l/s. Two types of riparian vegetation scenarios were simulated: forested, with rigid, randomly-distributed, wooden dowels, and non-forested, with synthetic grass carpeting. Three-dimensional velocities were measured with a Nortek Vectrino acoustic Doppler velocimeter at 41 different locations within the channel and floodplain at near-bed and 0.6-depth elevations. Observations of three-dimensional velocities and calculations of turbulent kinetic energy (TKE) showed significant differences between forested and non-forested runs. Results indicated that turbulence intensity, as quantified by TKE, roughly doubled throughout the channel and floodplain when forested vegetation was introduced.
Given that sediment entrainment and transport can be amplified in flows with high turbulence intensity, our results demonstrated the potential for increased erosion during overbank flood events in stream reaches with recently reforested riparian zones. The concentration of high TKE values and vertical upwelling at the channel-floodplain interface in forested runs indicated a probable erosion hot spot that could promote channel widening.
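For readers unfamiliar with the TKE metric used above, a minimal sketch of the standard definition follows: TKE per unit mass is half the sum of the variances of the three velocity components, TKE = 0.5 (⟨u'²⟩ + ⟨v'²⟩ + ⟨w'²⟩). The function name and the sample values are illustrative assumptions; the abstract does not describe the authors' processing or filtering steps.

```python
from statistics import fmean


def turbulent_kinetic_energy(u, v, w):
    """TKE per unit mass from three equally long velocity time series.

    Each fluctuation is the deviation of a sample from its component's
    time-averaged velocity (Reynolds decomposition).
    """
    def variance(samples):
        mean = fmean(samples)
        return fmean([(s - mean) ** 2 for s in samples])

    return 0.5 * (variance(u) + variance(v) + variance(w))


# Illustrative samples in m/s (not data from the experiment).
u = [1.0, 3.0]
v = [0.0, 0.0]
w = [2.0, 2.0]
print(turbulent_kinetic_energy(u, v, w))
```

With velocities in m/s, the result is in m²/s²; comparing this value at matching probe locations between forested and non-forested runs is one way the reported "roughly doubled" relationship could be checked.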
Self-Formed Meanders (With Cutoffs) in a Laboratory Flume
NASA Astrophysics Data System (ADS)
Braudrick, C. A.; Leverich, G. T.; Sklar, L. S.; Dietrich, W. E.
2005-12-01
The development of a mechanistic understanding of channel geometry and morphodynamics has been inhibited by the inability to create self-formed, freely meandering, single thread channels in a laboratory flume. Reliably generating such channels would allow the influence of sediment supply, flow dynamics, and bank strength on channel morphology to be explored experimentally. We have found that the key experimental controls are: 1) ratio of bank strength to boundary shear stress exerted on the bank; 2) bedload and suspended load rates; and 3) variable flow discharge. We have been able to create meandering channels in a sand bedded laboratory flume using alfalfa sprouts. The alfalfa sprouts decrease the bank erosion rate so that bank erosion occurs at approximately the same pace as bar growth. The addition of coarse suspended load was necessary for deposition on bars to build them to the floodplain height. The sprouts contributed to deposition by creating a rough floodplain surface. Steady discharge failed to produce meandering, apparently due to the lack of suspended load deposition on the bar surface. The channels were created in a 3.6-m wide and 6.1-m long flume with an adjustable slope set at 0.01. We introduced both bedload (sand) and suspended load (crushed silica) into the top of the flume, which had an initial channel with either one or two bends carved into the floodplain. Runs lasted between 1 and 4 hours and occurred once per week. Alfalfa seeds were spread evenly outside the low flow channel following each run and were allowed to grow between runs. With the same material and flow conditions, the channel rapidly braided without the alfalfa sprouts. Braiding was also favored under steady flow conditions. Under dynamic flows with banks strengthened by sprouts, the resulting experimental channels had many of the features observed in meandering streams such as oxbow lakes and meander cutoffs.
The cutoffs occurred during overbank flows when high flow channels were reoccupied. As the portion of the flow passing through the reoccupied channel increased, an upstream-propagating headcut was initiated. Once the headcut propagated past the upstream junction with the main channel, sediment deposition blocked the upstream end of a secondary channel. The cutoffs became oxbow lakes when rapid bar growth promoted lateral channel migration away from the downstream junction with the cutoff channel. With these results in hand we are completing the construction of a larger flume in which we will set forth experiments on the influence of sediment supply, discharge magnitude and duration, grain size and bank strength on channel geometry.
Crystal structure of a macrophage migration inhibitory factor from Giardia lamblia
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buchko, Garry W.; Abendroth, Jan; Robinson, Howard
2013-06-15
Macrophage migration inhibitory factor (MIF) is a eukaryotic cytokine that affects a broad spectrum of immune responses and its activation/inactivation is associated with numerous diseases. During protozoan infections MIF is not only expressed by the host, but has also been observed to be expressed by some parasites and released into the host. To better understand the biological role of parasitic MIF proteins, the crystal structure of the MIF protein from Giardia lamblia (Gl-MIF), the etiological agent responsible for giardiasis, has been determined at 2.30 Å resolution. The 114-residue protein adopts an α/β fold consisting of a four-stranded β-sheet with two anti-parallel α-helices packed against a face of the β-sheet. An additional short β-strand aligns anti-parallel to β4 of the β-sheet in the adjacent protein unit to help stabilize a trimer, the biologically relevant unit observed in all solved MIF crystal structures to date, and form a discontinuous β-barrel. The structure of Gl-MIF is compared to the MIF structures from humans (Hs-MIF) and three Plasmodium species (falciparum, berghei, and yoelii). The structures of all five MIF proteins are generally similar with the exception of a channel that runs through the center of each trimer complex. Relative to Hs-MIF, there are differences in solvent accessibility and electrostatic potential distribution in the channel of Gl-MIF and the Plasmodium-MIFs due primarily to two “gate-keeper” residues in the parasitic MIFs. For the Plasmodium MIFs the gate-keeper residues are at positions 44 (Y==>R) and 100 (V==>D) and for Gl-MIF it is at position 100 (V==>R). If these gate-keeper residues have a biological function and contribute to the progression of parasitemia, they may also form the basis for structure-based drug design targeting parasitic MIF proteins.
Abendroth, Jan; Robinson, Howard; Zhang, Yanfeng; Hewitt, Stephen N.; Edwards, Thomas E.; Van Voorhis, Wesley C.; Myler, Peter J.
2013-01-01
Macrophage migration inhibitory factor (MIF) is a eukaryotic cytokine that affects a broad spectrum of immune responses and its activation/inactivation is associated with numerous diseases. During protozoan infections MIF is not only expressed by the host, but has also been observed to be expressed by some parasites and released into the host. To better understand the biological role of parasitic MIF proteins, the crystal structure of the MIF protein from Giardia lamblia (Gl-MIF), the etiological agent responsible for giardiasis, has been determined at 2.30 Å resolution. The 114-residue protein adopts an α/β fold consisting of a four-stranded β-sheet with two anti-parallel α-helices packed against a face of the β-sheet. An additional short β-strand aligns anti-parallel to β4 of the β-sheet in the adjacent protein unit to help stabilize a trimer, the biologically relevant unit observed in all solved MIF crystal structures to date, and form a discontinuous β-barrel. The structure of Gl-MIF is compared to the MIF structures from humans (Hs-MIF) and three Plasmodium species (falciparum, berghei, and yoelii). The structures of all five MIF proteins are generally similar with the exception of a channel that runs through the center of each trimer complex. Relative to Hs-MIF, there are differences in solvent accessibility and electrostatic potential distribution in the channel of Gl-MIF and the Plasmodium-MIFs due primarily to two “gate-keeper” residues in the parasitic MIFs. For the Plasmodium MIFs the gate-keeper residues are at positions 44 (Y⇒R) and 100 (V⇒D) and for Gl-MIF it is at position 100 (V⇒R). If these gate-keeper residues have a biological function and contribute to the progression of parasitemia, they may also form the basis for structure-based drug design targeting parasitic MIF proteins. PMID:23709284
Non-volatile memory for checkpoint storage
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blumrich, Matthias A.; Chen, Dong; Cipolla, Thomas M.
A system, method and computer program product for supporting system initiated checkpoints in high performance parallel computing systems and storing of checkpoint data to a non-volatile memory storage device. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity. In one embodiment, the non-volatile memory is a pluggable flash memory card.
Multitasking for flows about multiple body configurations using the chimera grid scheme
NASA Technical Reports Server (NTRS)
Dougherty, F. C.; Morgan, R. L.
1987-01-01
The multitasking of a finite-difference scheme using multiple overset meshes is described. In this chimera, or multiple overset mesh approach, a multiple body configuration is mapped using a major grid about the main component of the configuration, with minor overset meshes used to map each additional component. This type of code is well suited to multitasking. Both steady and unsteady two dimensional computations are run on parallel processors on a CRAY-X/MP 48, usually with one mesh per processor. Flow field results are compared with single processor results to demonstrate the feasibility of running multiple mesh codes on parallel processors and to show the increase in efficiency.
Design considerations for parallel graphics libraries
NASA Technical Reports Server (NTRS)
Crockett, Thomas W.
1994-01-01
Applications which run on parallel supercomputers are often characterized by massive datasets. Converting these vast collections of numbers to visual form has proven to be a powerful aid to comprehension. For a variety of reasons, it may be desirable to provide this visual feedback at runtime. One way to accomplish this is to exploit the available parallelism to perform graphics operations in place. In order to do this, we need appropriate parallel rendering algorithms and library interfaces. This paper provides a tutorial introduction to some of the issues which arise in designing parallel graphics libraries and their underlying rendering algorithms. The focus is on polygon rendering for distributed memory message-passing systems. We illustrate our discussion with examples from PGL, a parallel graphics library which has been developed on the Intel family of parallel systems.
Using the GeoFEST Faulted Region Simulation System
NASA Technical Reports Server (NTRS)
Parker, Jay W.; Lyzenga, Gregory A.; Donnellan, Andrea; Judd, Michele A.; Norton, Charles D.; Baker, Teresa; Tisdale, Edwin R.; Li, Peggy
2004-01-01
GeoFEST (the Geophysical Finite Element Simulation Tool) simulates stress evolution, fault slip and plastic/elastic processes in realistic materials, and so is suitable for earthquake cycle studies in regions such as Southern California. Many new capabilities and means of access for GeoFEST are now supported. New abilities include MPI-based cluster parallel computing using automatic PYRAMID/Parmetis-based mesh partitioning, automatic mesh generation for layered media with rectangular faults, and results visualization that is integrated with remote sensing data. The parallel GeoFEST application has been successfully run on over a half-dozen computers, including Intel Xeon clusters, Itanium II and Altix machines, and the Apple G5 cluster. It is not separately optimized for different machines, but relies on good domain partitioning for load-balance and low communication, and careful writing of the parallel diagonally preconditioned conjugate gradient solver to keep communication overhead low. Demonstrated thousand-step solutions for over a million finite elements on 64 processors require under three hours, and scaling tests show high efficiency when using more than (order of) 4000 elements per processor. The source code and documentation for GeoFEST are available at no cost from Open Channel Foundation. In addition, GeoFEST may be used through a browser-based portal environment available to approved users. That environment includes semi-automated geometry creation and mesh generation tools, GeoFEST, and RIVA-based visualization tools that include the ability to generate a flyover animation showing deformations and topography. Work is in progress to support simulation of a region with several faults using 16 million elements, using a strain energy metric to adapt the mesh to faithfully represent the solution in a region of widely varying strain.
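The diagonally preconditioned conjugate gradient solver mentioned above can be illustrated with a small serial sketch. This is not GeoFEST's code: the function names, the dense list-of-lists matrix, and the tolerance are assumptions made for illustration. In a distributed-memory version, the matrix-vector product and the dot products are the steps that would require communication between partitions, which is presumably why the abstract stresses keeping that overhead low.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(A, x):
    return [dot(row, x) for row in A]


def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A using conjugate
    gradients with a diagonal (Jacobi) preconditioner M = diag(A)."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


# Small symmetric positive-definite example; exact solution is [1/11, 7/11].
solution = pcg_jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The diagonal preconditioner is attractive in a parallel setting because applying it is purely local: each processor scales its own residual entries without any message exchange.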
XMOS XC-2 Development Board for Mechanical Control and Data Collection
NASA Technical Reports Server (NTRS)
Jarnot, Robert F.; Bowden, William J.
2011-01-01
The scanning microwave limb sounder (SMLS) will use technological improvements in low-noise mixers to provide precise data on the Earth's atmospheric composition with high spatial resolution. This project focuses on the design and implementation of a real-time control system needed for airborne engineering tests of the SMLS. The system must coordinate the actuation of optical components using four motors with encoder readback, while collecting synchronized telemetric data from a GPS receiver and 3-axis gyrometric system. A graphical user interface for testing the control system was also designed using Python. Although the system could have been implemented with an FPGA (field-programmable gate array)-based setup, a processor development kit manufactured by XMOS was chosen. The XMOS architecture allows parallel execution of multiple tasks on separate threads, making it ideal for this application. It is easily programmed using XC (a subset of C). The necessary communication interfaces were implemented in software, including Ethernet, with significant cost and time reduction compared to an FPGA-based approach. A simple approach to control the chopper, calibration mirror, and gimbal for the airborne SMLS was needed. The XMOS board allows for multiple threads and real-time data acquisition. The XC-2 development kit is an attractive choice for synchronized, real-time, event-driven applications. The XMOS is based on the transputer microprocessor architecture developed for parallel computing, which is being revamped in this new platform. The XMOS device has multiple cores capable of running parallel applications on separate threads. The threads communicate with each other via user-defined channels capable of transmitting data within the device. XMOS provides a C-based development environment using XC, which eliminates the need for custom tool kits associated with FPGA programming. The XC-2 has four cores and necessary hardware for Ethernet I/O.
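The pattern of threads exchanging data over channels, as described above, can be sketched by analogy in Python (the language the abstract mentions for the test GUI). This is not XC code and does not reproduce the project's firmware: the task names and message format are hypothetical, and a bounded `queue.Queue` only approximates an XC channel, which is a synchronous, hardware-supported link.

```python
import queue
import threading


def gps_task(out_channel, n_samples):
    """Hypothetical producer: emits numbered GPS readings, then an end marker."""
    for i in range(n_samples):
        out_channel.put(("gps", i))
    out_channel.put(None)  # signals that no more data will arrive


def logger_task(in_channel, log):
    """Hypothetical consumer: records every message until the end marker."""
    while True:
        item = in_channel.get()
        if item is None:
            break
        log.append(item)


def run_demo(n_samples=3):
    # maxsize=1 makes the producer wait for the consumer, loosely
    # mimicking the blocking behavior of a channel.
    channel = queue.Queue(maxsize=1)
    log = []
    threads = [
        threading.Thread(target=gps_task, args=(channel, n_samples)),
        threading.Thread(target=logger_task, args=(channel, log)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling `run_demo(3)` returns the three messages in the order they were sent, since a single channel between one producer and one consumer preserves ordering.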
Density-based parallel skin lesion border detection with webCL
2015-01-01
Background Dermoscopy is a highly effective and noninvasive imaging technique used in diagnosis of melanoma and other pigmented skin lesions. Many aspects of the lesion under consideration are defined in relation to the lesion border. This makes border detection one of the most important steps in dermoscopic image analysis. In current practice, dermatologists often delineate borders through a hand drawn representation based upon visual inspection. Due to the subjective nature of this technique, intra- and inter-observer variations are common. Because of this, the automated assessment of lesion borders in dermoscopic images has become an important area of study. Methods A fast density-based skin lesion border detection method has been implemented in parallel with a new parallel technology called WebCL. WebCL utilizes client side computing capabilities to use available hardware resources such as multi cores and GPUs. The developed WebCL-parallel density-based skin lesion border detection method runs efficiently from internet browsers. Results Previous research indicates that one of the highest accuracy rates can be achieved using density-based clustering techniques for skin lesion border detection. While these algorithms do have unfavorable time complexities, this effect could be mitigated when implemented in parallel. In this study, a density-based clustering technique for skin lesion border detection is parallelized and redesigned to run very efficiently on heterogeneous platforms (e.g. tablets, SmartPhones, multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units) by transforming the technique into a series of independent concurrent operations. Heterogeneous computing is adopted to support accessibility, portability and multi-device use in clinical settings. For this, we used WebCL, an emerging technology that enables an HTML5 Web browser to execute code in parallel for heterogeneous platforms. We describe WebCL and our parallel algorithm design.
In addition, we tested the parallel code on 100 dermoscopy images and showed the execution speedups with respect to the serial version. Results indicate that the parallel (WebCL) version and the serial version of the density-based lesion border detection method generate the same accuracy rates for 100 dermoscopy images, with a mean border error of 6.94%, a mean recall of 76.66%, and a mean precision of 99.29%. Moreover, the WebCL version's speedup factor for lesion border detection on the 100 dermoscopy images averages approximately 491.2. Conclusions When the large number of high-resolution dermoscopy images in a usual clinical setting is considered along with the critical importance of early detection and diagnosis of melanoma before metastasis, the importance of fast processing of dermoscopy images becomes obvious. In this paper, we introduce WebCL and its use for biomedical image processing applications. WebCL is a JavaScript binding of OpenCL, which takes advantage of GPU computing from a web browser. Therefore, the WebCL parallel version of density-based skin lesion border detection introduced in this study can supplement expert dermatologists and aid them in early diagnosis of skin lesions. While WebCL is currently an emerging technology, a full adoption of WebCL into the HTML5 standard would allow for this implementation to run on a very large set of hardware and software systems. WebCL takes full advantage of parallel computational resources including multi-cores and GPUs on a local machine, and allows for compiled code to run directly from the Web Browser. PMID:26423836
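The abstract does not spell out which density-based clustering algorithm is used, so the sketch below uses plain DBSCAN as a stand-in to show the general technique; all names and parameter values are assumptions. The `neighbors` query is evaluated independently for each point, which is the kind of operation that could be dispatched as concurrent work items on a GPU; here everything runs serially.

```python
def dbscan(points, eps, min_pts):
    """Label each 2-D point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        # Independent per-point query: all points within distance eps of point i.
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        pending = [j for j in seeds if j != i]
        while pending:
            j = pending.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:   # core point: expand the cluster
                pending.extend(nbrs)
    return labels


# Two tight groups and one isolated point (illustrative coordinates).
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(pts, eps=1.5, min_pts=3)
```

For border detection, the points would be pixel coordinates selected by some lesion criterion, and the outline of the dominant cluster would serve as the border estimate; that post-processing step is not shown.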
Density-based parallel skin lesion border detection with webCL.
Lemon, James; Kockara, Sinan; Halic, Tansel; Mete, Mutlu
2015-01-01
Dermoscopy is a highly effective and noninvasive imaging technique used in the diagnosis of melanoma and other pigmented skin lesions. Many aspects of the lesion under consideration are defined in relation to the lesion border. This makes border detection one of the most important steps in dermoscopic image analysis. In current practice, dermatologists often delineate borders through a hand-drawn representation based upon visual inspection. Due to the subjective nature of this technique, intra- and inter-observer variations are common. Because of this, the automated assessment of lesion borders in dermoscopic images has become an important area of study. A fast density-based skin lesion border detection method has been implemented in parallel with a new parallel technology called WebCL. WebCL utilizes client-side computing capabilities to use available hardware resources such as multiple cores and GPUs. The resulting WebCL-parallel density-based skin lesion border detection method runs efficiently from internet browsers. Previous research indicates that one of the highest accuracy rates can be achieved using density-based clustering techniques for skin lesion border detection. While these algorithms do have unfavorable time complexities, this effect can be mitigated when they are implemented in parallel. In this study, a density-based clustering technique for skin lesion border detection is parallelized and redesigned to run very efficiently on heterogeneous platforms (e.g., tablets, smartphones, multi-core CPUs, GPUs, and fully integrated Accelerated Processing Units) by transforming the technique into a series of independent concurrent operations. Heterogeneous computing is adopted to support accessibility, portability, and multi-device use in clinical settings. For this, we used WebCL, an emerging technology that enables an HTML5 Web browser to execute code in parallel on heterogeneous platforms. We describe WebCL and our parallel algorithm design. 
In addition, we tested the parallel code on 100 dermoscopy images and report the execution speedups with respect to the serial version. Results indicate that the parallel (WebCL) version and the serial version of the density-based lesion border detection method generate the same accuracy rates for the 100 dermoscopy images, with a mean border error of 6.94%, a mean recall of 76.66%, and a mean precision of 99.29%. Moreover, the WebCL version's speedup factor for lesion border detection on the 100 dermoscopy images averages around 491.2. When the large number of high-resolution dermoscopy images in a usual clinical setting is considered along with the critical importance of early detection and diagnosis of melanoma before metastasis, the importance of fast processing of dermoscopy images becomes obvious. In this paper, we introduce WebCL and its use for biomedical image processing applications. WebCL is a JavaScript binding of OpenCL, which takes advantage of GPU computing from a web browser. Therefore, the WebCL parallel version of density-based skin lesion border detection introduced in this study can supplement expert dermatologists and aid them in the early diagnosis of skin lesions. While WebCL is currently an emerging technology, full adoption of WebCL into the HTML5 standard would allow this implementation to run on a very large set of hardware and software systems. WebCL takes full advantage of parallel computational resources, including multiple cores and GPUs on a local machine, and allows compiled code to run directly from the Web browser.
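The accuracy and speedup figures quoted in the abstract above can be illustrated with a small sketch. The abstract does not define its metrics, so the definitions below are assumptions: precision and recall are computed pixel-wise against a manually drawn mask, and border error is taken as the area of disagreement divided by the area of the manual lesion, a definition commonly used in the border detection literature. The function names `border_metrics` and `speedup` are hypothetical and are not taken from the paper.

```python
def border_metrics(auto_mask, manual_mask):
    """Compare two binary masks given as equal-sized lists of rows of 0/1.

    Assumed definitions (not stated in the abstract):
      precision    = TP / (TP + FP)
      recall       = TP / (TP + FN)
      border_error = (FP + FN) / (TP + FN), i.e. XOR area / manual area
    """
    tp = fp = fn = 0
    for auto_row, manual_row in zip(auto_mask, manual_mask):
        for a, m in zip(auto_row, manual_row):
            if a and m:
                tp += 1          # lesion pixel found by both
            elif a and not m:
                fp += 1          # automatic border overshoots
            elif m and not a:
                fn += 1          # automatic border undershoots
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    border_error = (fp + fn) / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall, "border_error": border_error}


def speedup(serial_seconds, parallel_seconds):
    """Speedup factor of a parallel run relative to the serial run."""
    return serial_seconds / parallel_seconds


# Toy 4x4 example: the manual lesion is a 2x2 block, and the automatic
# result finds three of its four pixels and nothing outside it.
manual = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
auto = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
metrics = border_metrics(auto, manual)
```

In this toy case the automatic mask stays inside the manual one, which gives high precision with lower recall, the same qualitative pattern as the reported means.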
1. Aerial view of turnpike path running diagonally up from ...
1. Aerial view of turnpike path running diagonally up from lower left (present-day Orange Turnpike alignment) and continuing on towards upper right through tree clump in center of the bare spot on the landscape, and on through the trees. View looking south. - Orange Turnpike, Parallel to new Orange Turnpike, Monroe, Orange County, NY
Code of Federal Regulations, 2010 CFR
2010-04-01
... approximately 0.25 mile to its intersection with a trail and the 3,800-foot elevation line, T6N, R13W; then (9... (21) Proceed north and then generally southeast along the 3,600-foot elevation line that runs parallel... elevation line that runs north of the San Andreas Rift Zone to its intersection with the section 16 east...
Code of Federal Regulations, 2011 CFR
2011-04-01
... approximately 0.25 mile to its intersection with a trail and the 3,800-foot elevation line, T6N, R13W; then (9... (21) Proceed north and then generally southeast along the 3,600-foot elevation line that runs parallel... elevation line that runs north of the San Andreas Rift Zone to its intersection with the section 16 east...
Doheny, Edward J.; Starsoneck, Roger J.; Striz, Elise A.; Mayer, Paul M.
2006-01-01
Stream restoration efforts have been ongoing in Maryland since the early 1990s. Physical stream restoration often involves replacement of lost sediments to elevate degraded streambeds, re-establishment of riffle-pool sequences along the channel profile, planting vegetation in riparian zones, and re-constructing channel banks, point bars, flood plains, and stream-meanders. The primary goal of many restoration efforts is to re-establish geomorphic stability of the stream channel and reduce erosive energy from urban runoff. Monitoring streams prior to and after restoration could help quantify other possible benefits of stream restoration, such as improved water quality and biota. This report presents general watershed characteristics associated with the Minebank Run watershed; a small, urban watershed in the south-central section of Baltimore County, Maryland that was physically restored in phases during 1999, 2004, and 2005. The physiography, geology, hydrology, land use, soils, and pre-restoration geomorphic setting of the unrestored stream channel are discussed. The report describes a reach of Minebank Run that was selected for the purpose of collecting several types of environmental data prior to restoration, including continuous-record and partial-record stage and streamflow data, precipitation, and ground-water levels. Examples of surface-water data that were collected in and near the study reach during water years 2002 through 2004, including continuous-record streamflow, partial-record stage and discharge, and precipitation, are described. These data were used in analyses of several characteristics of surface-water hydrology in the watershed, including (1) rainfall totals, storm duration, and intensity, (2) instantaneous peak discharge and daily mean discharge, (3) stage-discharge ratings, (4) hydraulic-geometry relations, (5) water-surface slope, (6) time of concentration, (7) flood frequency, (8) flood volume, and (9) rainfall-runoff relations. 
Several hydrologic characteristics that are typical of urban environments were quantified by these analyses. These include (1) large ratios of peak discharge to daily mean discharge as an indicator of flashiness, (2) consistent shifting of the stage-discharge rating over short periods of time that indicates instability of the stream channel, (3) analyses of hydraulic-geometry relations that indicate mean velocities of 11 feet per second or more while the flow is contained in the stream channel, (4) discharges that are 4 to 5 times larger in Minebank Run for corresponding flood frequency recurrence intervals than in Slade Run, which is a Piedmont watershed of similar size with smaller percentages of urban development, and (5) flood waves that can travel through the stream channel at a velocity of 412 feet per minute, or 6.9 feet per second.
MINEBANK RUN PROJECT AS AN APPROACH FOR RESTORING DEGRADED URBAN WATERSHEDS AND RIPARIAN ECOSYSTEMS
Elevated nitrate levels in streams and groundwater pose human and ecological threats. Minebank Run, an urban stream in Baltimore MD, will be restored in 2004/2005 using various techniques including reshaping stream banks to reconnect stream channel to flood plain, stream bank r...
Data collected from 2002 through 2008 were used to assess geomorphic characteristics and geomorphic changes over time in a selected reach of Minebank Run, a small urban watershed near Towson, Maryland, prior to and after its physical restoration in 2004 and 2005. Data collected ...
Array signal recovery algorithm for a single-RF-channel DBF array
NASA Astrophysics Data System (ADS)
Zhang, Duo; Wu, Wen; Fang, Da Gang
2016-12-01
An array signal recovery algorithm based on sparse signal reconstruction theory is proposed for a single-RF-channel digital beamforming (DBF) array. A single-RF-channel antenna array is a low-cost antenna array in which signals are obtained from all antenna elements by only one microwave digital receiver. The spatially parallel array signals are converted into time-sequence signals, which are then sampled by the system. The proposed algorithm uses these time-sequence samples to recover the original parallel array signals by exploiting the second-order sparse structure of the array signals. Additionally, an optimization method based on the artificial bee colony (ABC) algorithm is proposed to improve the reconstruction performance. Using the proposed algorithm, the motion compensation problem for the single-RF-channel DBF array can be solved effectively, and the angle and Doppler information for the target can be simultaneously estimated. The effectiveness of the proposed algorithms is demonstrated by the results of numerical simulations.
NASA Astrophysics Data System (ADS)
Cho, Y.; Chang, C.-C.; Wang, L. V.; Zou, J.
2016-02-01
This paper reports the development of a new 16-channel parallel acoustic delay line (PADL) array for real-time photoacoustic tomography (PAT). The PADLs were directly fabricated from single-crystalline silicon substrates using deep reactive ion etching. Compared with other acoustic delay lines (e.g., optical fibers), the micromachined silicon PADLs offer higher acoustic transmission efficiency, smaller form factor, easier assembly, and mass production capability. To demonstrate its real-time photoacoustic imaging capability, the silicon PADL array was interfaced with one single-element ultrasonic transducer followed by one channel of data acquisition electronics to receive 16 channels of photoacoustic signals simultaneously. A PAT image of an optically-absorbing target embedded in an optically-scattering phantom was reconstructed, which matched well with the actual size of the imaged target. Because the silicon PADL array allows a signal-to-channel reduction ratio of 16:1, it could significantly simplify the design and construction of ultrasonic receivers for real-time PAT.
Experiences with serial and parallel algorithms for channel routing using simulated annealing
NASA Technical Reports Server (NTRS)
Brouwer, Randall Jay
1988-01-01
Two algorithms for channel routing using simulated annealing are presented. Simulated annealing is an optimization methodology which allows the solution process to back up out of local minima that may be encountered by inappropriate selections. By properly controlling the annealing process, it is very likely that the optimal solution to an NP-complete problem such as channel routing may be found. The algorithm presented proposes very relaxed restrictions on the types of allowable transformations, including overlapping nets. By freeing that restriction and controlling overlap situations with an appropriate cost function, the algorithm becomes very flexible and can be applied to many extensions of channel routing. The selection of the transformation utilizes a number of heuristics, still retaining the pseudorandom nature of simulated annealing. The algorithm was implemented as a serial program for a workstation, and a parallel program designed for a hypercube computer. The details of the serial implementation are presented, including many of the heuristics used and some of the resulting solutions.
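A minimal sketch of the annealing idea described above, under stated assumptions: nets are modeled as horizontal intervals, a state assigns each net to a track, and overlaps on a shared track are allowed but penalized in the cost function rather than forbidden. The cost weights, the cooling schedule, and the names `cost` and `anneal` are illustrative choices, not the report's actual formulation or heuristics.

```python
import math
import random


def overlaps(a, b):
    """True if two horizontal intervals (left, right) share any column."""
    return a[0] <= b[1] and b[0] <= a[1]


def cost(assignment, nets, overlap_penalty=5.0):
    """Tracks used plus a penalty for each overlapping pair on the same track."""
    tracks_used = len(set(assignment))
    clashes = sum(
        1
        for i in range(len(nets))
        for j in range(i + 1, len(nets))
        if assignment[i] == assignment[j] and overlaps(nets[i], nets[j])
    )
    return tracks_used + overlap_penalty * clashes


def anneal(nets, n_tracks, steps=5000, t_start=5.0, cooling=0.999, seed=0):
    """Simulated annealing over net-to-track assignments (illustrative only)."""
    rng = random.Random(seed)
    current = [rng.randrange(n_tracks) for _ in nets]
    current_cost = cost(current, nets)
    best, best_cost = list(current), current_cost
    temperature = t_start
    for _ in range(steps):
        # Transformation: move one randomly chosen net to a random track.
        candidate = list(current)
        candidate[rng.randrange(len(nets))] = rng.randrange(n_tracks)
        candidate_cost = cost(candidate, nets)
        delta = candidate_cost - current_cost
        # Uphill moves are accepted with probability exp(-delta / T),
        # which lets the search back out of local minima.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost


nets = [(0, 3), (2, 5), (4, 7), (6, 9)]
best, best_cost = anneal(nets, n_tracks=3)
```

Treating overlap as a cost term instead of a hard constraint is what keeps every transformation legal, so the move generator stays simple and the same loop can be reused with a different cost function.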
An asymptotic induced numerical method for the convection-diffusion-reaction equation
NASA Technical Reports Server (NTRS)
Scroggs, Jeffrey S.; Sorensen, Danny C.
1988-01-01
A parallel algorithm for the efficient solution of a time dependent reaction convection diffusion equation with small parameter on the diffusion term is presented. The method is based on a domain decomposition that is dictated by singular perturbation analysis. The analysis is used to determine regions where certain reduced equations may be solved in place of the full equation. Parallelism is evident at two levels. Domain decomposition provides parallelism at the highest level, and within each domain there is ample opportunity to exploit parallelism. Run time results demonstrate the viability of the method.
Implementations of BLAST for parallel computers.
Jülich, A
1995-02-01
The BLAST sequence comparison programs have been ported to a variety of parallel computers: the shared-memory machine Cray Y-MP 8/864 and the distributed-memory architectures Intel iPSC/860 and nCUBE. Additionally, the programs were ported to run on workstation clusters. We explain the parallelization techniques and consider the pros and cons of these methods. The BLAST programs are very well suited for parallelization for a moderate number of processors. We illustrate our results using the program blastp as an example. As input data for blastp, a 799 residue protein query sequence and the protein database PIR were used.
A C++ Thread Package for Concurrent and Parallel Programming
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jie Chen; William Watson
1999-11-01
Recently, thread libraries have become common on various operating systems such as Unix, Windows NT and VxWorks. These thread libraries offer significant performance enhancement by allowing applications to use multiple threads running either concurrently or in parallel on multiprocessors. However, the incompatibilities between native libraries introduce challenges for those who wish to develop portable applications.
drPACS: A Simple UNIX Execution Pipeline
NASA Astrophysics Data System (ADS)
Teuben, P.
2011-07-01
We describe a very simple yet flexible and effective pipeliner for UNIX commands. It creates a Makefile to define a set of serially dependent commands. The commands in the pipeline share a common set of parameters by which they can communicate. Commands must follow a simple convention to retrieve and store parameters. Pipeline parameters can optionally be made persistent across multiple runs of the pipeline. Tools were added to simplify running a large series of pipelines, which can then also be run in parallel.
Method for resource control in parallel environments using program organization and run-time support
NASA Technical Reports Server (NTRS)
Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)
2001-01-01
A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.
Method for resource control in parallel environments using program organization and run-time support
NASA Technical Reports Server (NTRS)
Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)
1999-01-01
A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.
Parallel Signal Processing and System Simulation using aCe
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2003-01-01
Recently, networked and cluster computation have become very popular for both signal processing and system simulation. A new language is ideally suited for parallel signal processing applications and system simulation since it allows the programmer to explicitly express the computations that can be performed concurrently. In addition, the new C based parallel language (ace C) for architecture-adaptive programming allows programmers to implement algorithms and system simulation applications on parallel architectures by providing them with the assurance that future parallel architectures will be able to run their applications with a minimum of modification. In this paper, we will focus on some fundamental features of ace C and present a signal processing application (FFT).
NASA Astrophysics Data System (ADS)
Battisacco, Elena; Franca, Mário J.; Schleiss, Anton J.
2016-04-01
Dams interrupt the longitudinal continuity of river reaches since they store water and trap sediment in the upstream reservoir. Because the sediment continuum is interrupted, the transport capacity of the downstream stretch exceeds the sediment supply, and the flow becomes "hungry". Sediment replenishment is an increasingly used method for restoring continuity in rivers and for re-establishing the sediment regime of such disturbed river reaches. This research evaluates the effect of different geometrical configurations of sediment replenishment on the evolution of the bed morphology by systematic laboratory experiments. A typical straight armoured gravel reach is reproduced in a laboratory flume in terms of slope, grain size and cross section. The total amount of replenished sediment is placed in four identical volumes on both channel banks, forming six different geometrical configurations. Both alternated and parallel combinations are studied. Preliminary studies demonstrate that complete submergence of the replenishment deposits is most adequate for obtaining complete erosion and a high persistence of the replenished material in the channel. The response of the channel bed morphology to replenishment is documented by camera and laser scanners installed on a moveable carriage. The parallel configurations create an initially strong narrowing of the channel section. The transport capacity is thus higher and most of the replenished sediments exit the channel. The parallel configurations result in a more spread distribution of grains but with no clear morphological pattern. Clear bed form patterns can be observed when applying alternated configurations. Furthermore, the wavelength of the depositions corresponds to the replenishment deposit length. These morphological forms can be regarded as mounds. In order to enhance channel bed morphology on an armoured bed by sediment replenishment, alternated deposit configurations are more favourable and effective. 
The present study is supported by FOEN (Federal Office for the Environment, Switzerland).
Lg Attenuation Anisotropy Across the Western US
NASA Astrophysics Data System (ADS)
Phillips, W. S.; Rowe, C. A.; Stead, R. J.; Begnaud, M. L.
2017-12-01
The USArray has allowed us to map seismic attenuation of local and regional phases to unprecedented spatial extent and resolution. Following standard mantle Pn velocity anisotropy methods, we have incorporated azimuthal anisotropy into our tomographic inversion of high-frequency Lg amplitudes. The Lg is a crustal shear phase made up of many trapped modes, thus results can be considered to be crustal averages. Azimuthal anisotropy reduces residual variance by just over 10% for 1.5-3 Hz Lg. We observe a median anisotropic variation of 12%, and a high of 50% in the Salton Trough. Low attenuation (high-Q) directions run parallel to topographic fabric and major strike slip faults in tectonically active areas, and often run parallel to mantle shear wave splitting directions in stable regions. Tradeoffs are of concern, and synthetic tests show that elongated attenuation anomalies will produce anisotropy artifacts, but of factors 2-3 times lower than observations. In particular, the strength of a long, narrow high-Q anomaly will trade off with high-Q directions parallel to the long axis, while an elongated low-Q anomaly will trade off with high-Q directions perpendicular to the long axis. We observe an elongated low-Q anomaly associated with the Walker Lane; however, observed high-Q directions run parallel to the long axis of this anomaly, opposite to the tradeoff effect, supporting the anisotropic observation, and implying that the effect may be underestimated. Further, we observe an elongated high-Q anomaly associated with the Great Valley and Sierra Nevada that runs across the long axis, again opposite to the tradeoff effect. This study was performed using waveforms, event locations and phase picks made available by IRIS, NEIC and ANF, and processing was done using semi-automated means, thus this is a technique that can be applied quickly to study crustal anisotropy over large areas when appropriate station density is available.
Kim, Yong-Kwan; Kang, Pil Soo; Kim, Dae-Il; Shin, Gunchul; Kim, Gyu Tae; Ha, Jeong Sook
2009-03-01
A printing-based lithographic technique for the patterning of V₂O₅ nanowire channels with unidirectional orientation and controlled length is introduced. The simple, directional blowing of a patterned polymer stamp with N₂ gas, inked with randomly distributed V₂O₅ nanowires, induces alignment of the nanowires perpendicular to the long axis of the line patterns. Subsequent stamping on the amine-terminated surface results in the selective transfer of the aligned nanowires with a controlled length corresponding to the width of the relief region of the polymer stamp. By employing such a gas-blowing-assisted, selective-transfer-printing technique, two kinds of device structures consisting of nanowire channels and two metal electrodes with top contact, whereby the nanowires were aligned either parallel (parallel device) or perpendicular (serial device) to the current flow in the conduction channel, are fabricated. The electrical properties demonstrate a noticeable difference between the two devices, with a large hysteresis in the parallel device but none in the serial device. Systematic analysis of the hysteresis and the electrical stability accounts for the observed hysteresis in terms of proton diffusion in the water layer of the V₂O₅ nanowires, induced by the application of an external bias voltage higher than a certain threshold voltage.
Nisisako, Takasi; Ando, Takuya; Hatsuzawa, Takeshi
2012-09-21
This study describes a microfluidic platform with coaxial annular world-to-chip interfaces for high-throughput production of single and compound emulsion droplets, having controlled sizes and internal compositions. The production module consists of two distinct elements: a planar square chip on which many copies of a microfluidic droplet generator (MFDG) are arranged circularly, and a cubic supporting module with coaxial annular channels for supplying fluids evenly to the inlets of the mounted chip, assembled from blocks with cylinders and holes. Three-dimensional flow was simulated to evaluate the distribution of flow velocity in the coaxial multiple annular channels. By coupling a 1.5 cm × 1.5 cm microfluidic chip with 144 parallelized MFDGs and a supporting module with two annular channels, for example, we could produce simple oil-in-water (O/W) emulsion droplets having a mean diameter of 90.7 μm and a coefficient of variation (CV) of 2.2% at a throughput of 180.0 mL h⁻¹. Furthermore, we successfully demonstrated high-throughput production of Janus droplets, double emulsions and triple emulsions, by coupling 1.5 cm × 1.5 cm to 4.5 cm × 4.5 cm microfluidic chips with 32-128 parallelized MFDGs of various geometries and supporting modules with 3-4 annular channels.
Urban infrastructure and longitudinal stream profiles
NASA Astrophysics Data System (ADS)
Lindner, G. A.; Miller, A. J.
2009-12-01
Urban streams usually are highly engineered or modified by human activity and are conventionally thought of as being geometrically, and thus hydraulically, simple. The work presented here, a contribution to NSF CNH Project 0709659, is designed to capture the influence of urban infrastructure on the character of longitudinal profiles and flow hydraulics along streams in the Baltimore metropolitan area. Detailed topographic data sets are derived from LiDAR supplemented by total-station surveys of the channel bed and low-flow water surface. These in turn are used to drive 2D depth-averaged hydraulic models comparing flow conditions over a range of urban development patterns and stormwater management regimes. Results from stream surveys of 1-2 km length indicate that channels in older, highly urbanized areas typically have straight planforms and strongly stepped profiles characterized by a series of deep, stagnant pools with short intervening riffles or runs. This pattern is associated with frequent interruption of the channel profile by bridges, culverts, road embankments and other artificial structures. In one survey reach of the Dead Run watershed, 50 percent of cumulative channel length has zero gradient at low flow, and 50 percent of cumulative head loss is accounted for by only 4 percent of channel length. In the suburban Red Run watershed, recent development has occurred under strict stormwater management regulations with minimal encroachment on the riparian zone. Although their average gradients are similar, the Red Run survey reach is steeper than the Dead Run reach over most of its length but has a smaller fraction of total head loss caused by local slope breaks. 
Modeling results indicate that these differences in stream morphology are associated with differences in velocity, flow pattern, and residence time at base flow; the stepped nature of the profile in the older urban area becomes less pronounced at intermediate to high flows, but the controlling influence of infrastructure may become dominant again during large floods. Because flashy urban streams have lower and more persistent low flows as well as more extreme flood flows, these hydraulic patterns may have implications for both biogeochemical cycling at base flow and transport and deposition of sediment and other constituents during flood periods. Continuing research will develop a typology of urban streams in terms of the influence of engineering practices on flow patterns and material transport.
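The two profile statistics quoted for the Dead Run reach (the share of channel length with zero gradient, and the share of length that accounts for half of the head loss) can be computed from a surveyed profile roughly as sketched below. This is an illustration under assumptions: head loss is approximated by the drop in low-flow water-surface elevation between survey points, distances are strictly increasing, and the function name `profile_summary` and the sample numbers are hypothetical, not the study's data or method.

```python
def profile_summary(profile, loss_share=0.5):
    """Summarize a longitudinal profile.

    profile: list of (distance, water_surface_elevation) pairs ordered
    upstream to downstream, with strictly increasing distance.
    Returns (fraction of length with zero gradient,
             smallest fraction of length holding `loss_share` of total drop).
    """
    segments = []
    for (x0, z0), (x1, z1) in zip(profile, profile[1:]):
        length = x1 - x0
        drop = max(z0 - z1, 0.0)  # assumption: head loss ~ water-surface drop
        segments.append((length, drop))

    total_length = sum(length for length, _ in segments)
    total_drop = sum(drop for _, drop in segments)
    flat_fraction = sum(l for l, d in segments if d == 0.0) / total_length

    # Take the steepest segments first until the requested share of the
    # total drop is reached, using only part of the last segment if needed.
    remaining = loss_share * total_drop
    steep_length = 0.0
    for length, drop in sorted(segments, key=lambda s: s[1] / s[0], reverse=True):
        if remaining <= 0.0:
            break
        if drop >= remaining:
            steep_length += length * remaining / drop
            remaining = 0.0
        else:
            steep_length += length
            remaining -= drop
    return flat_fraction, steep_length / total_length


# Made-up stepped profile: two long flat pools separated by short drops.
sample = [(0, 10.0), (40, 10.0), (44, 9.0), (90, 9.0), (100, 8.0)]
flat_fraction, steep_fraction = profile_summary(sample)
```

For this made-up profile, 86 percent of the length is flat and half of the total drop occurs over 4 percent of the length, which shows how a strongly stepped profile concentrates head loss in short reaches.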
NASA Astrophysics Data System (ADS)
Susskind, J.; Rosenberg, R. I.
2016-12-01
The GEOS-5 Data Assimilation System (DAS) generates a global analysis every six hours by combining the previous six hour forecast for that time period with contemporaneous observations. These observations include in-situ observations as well as those taken by satellite borne instruments, such as AIRS/AMSU on EOS Aqua and CrIS/ATMS on S-NPP. Operational data assimilation methodology assimilates observed channel radiances Ri for IR sounding instruments such as AIRS and CrIS, but only for those channels i in a given scene whose radiances are thought to be unaffected by clouds. A limitation of this approach is that radiances in most tropospheric sounding channels are affected by clouds under partial cloud cover conditions, which occurs most of the time. The AIRS Science Team Version-6 retrieval algorithm generates cloud cleared radiances (CCR's) for each channel in a given scene, which represent the radiances AIRS would have observed if the scene were cloud free, and then uses them to determine quality controlled (QC'd) temperature profiles T(p) under all cloud conditions. There are potential advantages to assimilate either AIRS QC'd CCR's or QC'd T(p) instead of Ri in that the spatial coverage of observations is greater under partial cloud cover. We tested these two alternate data assimilation approaches by running three parallel data assimilation experiments over different time periods using GEOS-5. Experiment 1 assimilated all observations as done operationally, Experiment 2 assimilated QC'd values of AIRS CCRs in place of AIRS radiances, and Experiment 3 assimilated QC'd values of T(p) in place of observed radiances. Assimilation of QC'd AIRS T(p) resulted in significant improvement in seven day forecast skill compared to assimilation of CCR's or assimilation of observed radiances, especially in the Southern Hemisphere Extra-tropics.
NASA Astrophysics Data System (ADS)
Watford, M.; DeCusatis, C.
2005-09-01
With the advent of new regulations governing the protection and recovery of sensitive business data, including the Sarbanes-Oxley Act, there has been a renewed interest in business continuity and disaster recovery applications for metropolitan area networks. Specifically, there has been a need for more efficient bandwidth utilization and lower cost per channel to facilitate mirroring of multi-terabit data bases. These applications have further blurred the boundary between metropolitan and wide area networks, with synchronous disaster recovery applications running up to 100 km and asynchronous solutions extending to 300 km or more. In this paper, we discuss recent enhancements in the Nortel Optical Metro 5200 Dense Wavelength Division Multiplexing (DWDM) platform, including features recently qualified for data communication applications such as Metro Mirror, Global Mirror, and Geographically Distributed Parallel Sysplex (GDPS). Using a 10 Gigabit/second (Gbit/s) backbone, this solution transports significantly more Fibre Channel protocol traffic with up to five times greater hardware density in the same physical package. This is also among the first platforms to utilize forward error correction (FEC) on the aggregate signals to improve bit error rate (BER) performance beyond industry standards. When combined with encapsulation into wide area network protocols, the use of FEC can compensate for impairments in BER across a service provider infrastructure without impacting application level performance. Design and implementation of these features will be discussed, including results from experimental test beds which validate these solutions for a number of applications. Future extensions of this environment will also be considered, including ways to provide configurable bandwidth on demand, mitigate Fibre Channel buffer credit management issues, and support for other GDPS protocols.
NASA Astrophysics Data System (ADS)
Georgiev, K.; Zlatev, Z.
2010-11-01
The Danish Eulerian Model (DEM) is an Eulerian model for studying the transport of air pollutants on a large scale. Originally, the model was developed at the National Environmental Research Institute of Denmark. The computational domain of the model covers Europe and some neighbouring parts of the Atlantic Ocean, Asia and Africa. If DEM is applied on fine grids, its discretization leads to a huge computational problem. This implies that a model such as DEM must be run on high-performance computer architectures. The implementation and tuning of such a complex large-scale model on each different computer is a non-trivial task. Here, some comparison results from running this model on different kinds of vector computers (CRAY C92A, Fujitsu, etc.), parallel computers with distributed memory (IBM SP, CRAY T3E, Beowulf clusters, Macintosh G4 clusters, etc.), parallel computers with shared memory (SGI Origin, SUN, etc.) and parallel computers with two levels of parallelism (IBM SMP, IBM BlueGene/P, clusters of multiprocessor nodes, etc.) are presented. The main idea in the parallel version of DEM is a domain partitioning approach. The effective use of the cache and hierarchical memories of modern computers, as well as the performance, speed-ups and efficiency achieved, are discussed. The parallel code of DEM, created using the MPI standard library, appears to be highly portable and shows good efficiency and scalability on different kinds of vector and parallel computers. Some important applications of the computer model output are presented briefly.
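The domain partitioning idea mentioned above can be sketched as follows. The abstract does not say how DEM divides its grid, so the choice of contiguous strips of rows, one strip per process, is an assumption made for illustration, and `partition_rows` is a hypothetical helper rather than part of the DEM code.

```python
def partition_rows(n_rows, n_parts):
    """Split n_rows grid rows into n_parts contiguous strips.

    Strip sizes differ by at most one row, so the work per process is
    roughly balanced. Returns a list of (start, stop) half-open ranges.
    """
    base, extra = divmod(n_rows, n_parts)
    bounds = []
    start = 0
    for rank in range(n_parts):
        # The first `extra` strips take one additional row each.
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


# Example: a grid with 480 rows divided among 7 processes.
strips = partition_rows(480, 7)
```

In an MPI program each rank would own one such strip and exchange boundary rows with its neighbours; that communication step is omitted here.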
Thread-Level Parallelization and Optimization of NWChem for the Intel MIC Architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shan, Hongzhang; Williams, Samuel; Jong, Wibe de
In the multicore era it was possible to exploit the increase in on-chip parallelism by simply running multiple MPI processes per chip. Unfortunately, manycore processors' greatly increased thread- and data-level parallelism coupled with a reduced memory capacity demand an altogether different approach. In this paper we explore augmenting two NWChem modules, triples correction of the CCSD(T) and Fock matrix construction, with OpenMP in order that they might run efficiently on future manycore architectures. As the next NERSC machine will be a self-hosted Intel MIC (Xeon Phi) based supercomputer, we leverage an existing MIC testbed at NERSC to evaluate our experiments. In order to proxy the fact that future MIC machines will not have a host processor, we run all of our experiments in native mode. We found that while straightforward application of OpenMP to the deep loop nests associated with the tensor contractions of CCSD(T) was sufficient in attaining high performance, significant effort was required to safely and efficiently thread the TEXAS integral package when constructing the Fock matrix. Ultimately, our new MPI+OpenMP hybrid implementations attain up to 65x better performance for the triples part of the CCSD(T) due in large part to the fact that the limited on-card memory limits the existing MPI implementation to a single process per card. Additionally, we obtain up to 1.6x better performance on Fock matrix constructions when compared with the best MPI implementations running multiple processes per card.
Thread-level parallelization and optimization of NWChem for the Intel MIC architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shan, Hongzhang; Williams, Samuel; de Jong, Wibe
In the multicore era it was possible to exploit the increase in on-chip parallelism by simply running multiple MPI processes per chip. Unfortunately, manycore processors' greatly increased thread- and data-level parallelism coupled with a reduced memory capacity demand an altogether different approach. In this paper we explore augmenting two NWChem modules, triples correction of the CCSD(T) and Fock matrix construction, with OpenMP in order that they might run efficiently on future manycore architectures. As the next NERSC machine will be a self-hosted Intel MIC (Xeon Phi) based supercomputer, we leverage an existing MIC testbed at NERSC to evaluate our experiments. In order to proxy the fact that future MIC machines will not have a host processor, we run all of our experiments in native mode. We found that while straightforward application of OpenMP to the deep loop nests associated with the tensor contractions of CCSD(T) was sufficient in attaining high performance, significant effort was required to safely and efficiently thread the TEXAS integral package when constructing the Fock matrix. Ultimately, our new MPI+OpenMP hybrid implementations attain up to 65× better performance for the triples part of the CCSD(T) due in large part to the fact that the limited on-card memory limits the existing MPI implementation to a single process per card. Additionally, we obtain up to 1.6× better performance on Fock matrix constructions when compared with the best MPI implementations running multiple processes per card.
Monte Carlo simulation of a noisy quantum channel with memory.
Akhalwaya, Ismail; Moodley, Mervlyn; Petruccione, Francesco
2015-10-01
The classical capacity of quantum channels is well understood for channels with uncorrelated noise. For the case of correlated noise, however, there are still open questions. We calculate the classical capacity of a forgetful channel constructed by Markov switching between two depolarizing channels. Techniques have previously been applied to approximate the output entropy of this channel and thus its capacity. In this paper, we use a Metropolis-Hastings Monte Carlo approach to numerically calculate the entropy. The algorithm is implemented in parallel and its performance is studied and optimized. The effects of memory on the capacity are explored and previous results are confirmed to higher precision.
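To make the idea of estimating an entropy by Metropolis-Hastings sampling concrete, here is a minimal classical sketch. It is not the authors' algorithm and does not model a quantum channel: as a stand-in with memory, it assumes bit strings generated by a symmetric two-state Markov chain, and estimates the Shannon entropy as the sample average of -log2 p(x). The function names, the `stay` probability and the chain length are all illustrative assumptions; the closed-form value is included only as a sanity check.

```python
import math
import random


def log2_prob(bits, stay=0.8):
    """log2-probability of a bit string under a symmetric two-state Markov chain.

    Assumption for illustration: the first bit is uniform, and each later bit
    repeats its predecessor with probability `stay`.
    """
    logp = math.log2(0.5)
    for prev, cur in zip(bits, bits[1:]):
        logp += math.log2(stay if prev == cur else 1.0 - stay)
    return logp


def mh_entropy(n_bits=8, steps=50_000, burn_in=5_000, stay=0.8, seed=0):
    """Estimate the entropy (in bits) as the Metropolis-Hastings average of -log2 p(x)."""
    rng = random.Random(seed)
    state = [0] * n_bits
    logp = log2_prob(state, stay)
    total = 0.0
    for step in range(burn_in + steps):
        i = rng.randrange(n_bits)
        state[i] ^= 1  # symmetric proposal: flip one randomly chosen bit
        new_logp = log2_prob(state, stay)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < 2.0 ** (new_logp - logp):
            logp = new_logp
        else:
            state[i] ^= 1  # rejected: undo the flip
        if step >= burn_in:
            total -= logp
    return total / steps


def exact_entropy(n_bits=8, stay=0.8):
    """Closed-form entropy of the same chain: 1 bit plus (n-1) binary entropies."""
    h = -(stay * math.log2(stay) + (1.0 - stay) * math.log2(1.0 - stay))
    return 1.0 + (n_bits - 1) * h


if __name__ == "__main__":
    print(f"MH estimate: {mh_entropy():.3f} bits, exact: {exact_entropy():.3f} bits")
```

Because each chain is seeded independently, several such chains could be run in parallel and their averages combined, which is one simple way a sampler of this kind parallelizes.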
NASA Astrophysics Data System (ADS)
Tóth, Gábor; Keppens, Rony
2012-07-01
The Versatile Advection Code (VAC) is a freely available general hydrodynamic and magnetohydrodynamic simulation software that works in 1, 2 or 3 dimensions on Cartesian and logically Cartesian grids. VAC runs on any Unix/Linux system with a Fortran 90 (or 77) compiler and Perl interpreter. VAC can run on parallel machines using either the Message Passing Interface (MPI) library or a High Performance Fortran (HPF) compiler.
Schaafsma, Murk; van der Deijl, Wilfred; Smits, Jacqueline M; Rahmel, Axel O; de Vries Robbé, Pieter F; Hoitsma, Andries J
2011-05-01
Organ allocation systems have become complex and difficult to comprehend. We introduced decision tables to specify the rules of allocation systems for different organs. A rule engine with decision tables as input was tested for the Kidney Allocation System (ETKAS). We compared this rule engine with the currently used ETKAS by running 11,000 historical match runs and by running the rule engine in parallel with the ETKAS on our allocation system. Decision tables were easy to implement and successful in verifying correctness, completeness, and consistency. The outcomes of the 11,000 historical matches in the rule engine and the ETKAS were exactly the same. Running the rule engine simultaneously in parallel and in real time with the ETKAS also produced no differences. Specifying organ allocation rules in decision tables is already a great step forward in enhancing the clarity of the systems. Yet, using these tables as rule engine input for matches optimizes the flexibility, simplicity and clarity of the whole process, from specification to the performed matches, and in addition this new method allows well controlled simulations. © 2011 The Authors. Transplant International © 2011 European Society for Organ Transplantation.
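As a rough illustration of how a decision table can serve both as rule-engine input and as something that can be checked for completeness and consistency, consider the sketch below. The conditions, actions and table rows are invented for the example and are not ETKAS rules; the helper names (`matching_rows`, `decide`, `check_table`) are likewise hypothetical.

```python
from itertools import product

# Hypothetical decision table (NOT real allocation rules). Each row pairs a
# pattern over the conditions with an action; None means "don't care".
CONDITIONS = ("compatible", "urgent")
TABLE = [
    ((False, None), "exclude"),
    ((True, True), "priority_list"),
    ((True, False), "standard_list"),
]


def matching_rows(table, case):
    """Return the actions of all rows whose pattern matches the given case."""
    return [
        action
        for pattern, action in table
        if all(p is None or p == c for p, c in zip(pattern, case))
    ]


def decide(table, case):
    """Rule engine: return the single action for a case, or fail loudly."""
    rows = matching_rows(table, case)
    if len(rows) != 1:
        raise ValueError(f"{len(rows)} rows match case {case}")
    return rows[0]


def check_table(table, n_conditions):
    """Enumerate every case; report gaps (no row) and conflicts (several rows)."""
    gaps, conflicts = [], []
    for case in product((False, True), repeat=n_conditions):
        n = len(matching_rows(table, case))
        if n == 0:
            gaps.append(case)
        elif n > 1:
            conflicts.append(case)
    return gaps, conflicts


if __name__ == "__main__":
    print(check_table(TABLE, len(CONDITIONS)))
    print(decide(TABLE, (True, True)))
```

The same table drives both `decide` and `check_table`, which mirrors the point made in the abstract: the specification and the executed rules are one artifact, so a verified table cannot drift from the running system.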
Streaming data analytics via message passing with application to graph algorithms
Plimpton, Steven J.; Shead, Tim
2014-05-06
The need to process streaming data, which arrives continuously at high volume in real time, arises in a variety of contexts including data produced by experiments, collections of environmental or network sensors, and running simulations. Streaming data can also be formulated as queries or transactions which operate on a large dynamic data store, e.g. a distributed database. We describe a lightweight, portable framework named PHISH which enables a set of independent processes to compute on a stream of data in a distributed-memory parallel manner. Datums are routed between processes in patterns defined by the application. PHISH can run on top of either message-passing via MPI or sockets via ZMQ. The former means streaming computations can be run on any parallel machine which supports MPI; the latter allows them to run on a heterogeneous, geographically dispersed network of machines. We illustrate how PHISH can support streaming MapReduce operations, and describe streaming versions of three algorithms for large, sparse graph analytics: triangle enumeration, subgraph isomorphism matching, and connected component finding. Lastly, we also provide benchmark timings for MPI versus socket performance of several kernel operations useful in streaming algorithms.
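The flavor of a streaming graph algorithm can be sketched in a single process as follows. This is not PHISH code and uses none of its API; it only assumes that edges arrive one at a time and that a triangle should be reported as soon as its closing edge is seen. The function name `stream_triangles` is made up for the example.

```python
from collections import defaultdict


def stream_triangles(edge_stream):
    """Yield each triangle once, at the moment its closing edge arrives."""
    neighbors = defaultdict(set)
    for u, v in edge_stream:
        if u == v or v in neighbors[u]:
            continue  # ignore self-loops and repeated edges
        # Any vertex already adjacent to both endpoints closes a triangle.
        for w in neighbors[u] & neighbors[v]:
            yield tuple(sorted((u, v, w)))
        neighbors[u].add(v)
        neighbors[v].add(u)


if __name__ == "__main__":
    edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (1, 2)]
    for triangle in stream_triangles(edges):
        print(triangle)
```

In a distributed-memory setting the adjacency sets would have to be partitioned across processes, with edges routed to the owners of their endpoints; the sketch leaves that routing out entirely.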
NASA Technical Reports Server (NTRS)
2002-01-01
(Released 08 April 2002) This image shows the cratered highlands of Terra Sirenum in the southern hemisphere. Near the center of the image running from left to right one can see long parallel to semi-parallel fractures or troughs called graben. Mars Global Surveyor initially discovered gullies on the south-facing wall of these fractures. This image is located at 38°S, 174°W (186°E).
Long-range interactions and parallel scalability in molecular simulations
NASA Astrophysics Data System (ADS)
Patra, Michael; Hyvönen, Marja T.; Falck, Emma; Sabouri-Ghomi, Mohsen; Vattulainen, Ilpo; Karttunen, Mikko
2007-01-01
Typical biomolecular systems such as cellular membranes, DNA, and protein complexes are highly charged. Thus, efficient and accurate treatment of electrostatic interactions is of great importance in computational modeling of such systems. We have employed the GROMACS simulation package to perform extensive benchmarking of different commonly used electrostatic schemes on a range of computer architectures (Pentium-4, IBM Power 4, and Apple/IBM G5) for single processor and parallel performance up to 8 nodes—we have also tested the scalability on four different networks, namely Infiniband, GigaBit Ethernet, Fast Ethernet, and nearly uniform memory architecture, i.e. communication between CPUs is possible by directly reading from or writing to other CPUs' local memory. It turns out that the particle-mesh Ewald method (PME) performs surprisingly well and offers competitive performance unless parallel runs on PC hardware with older network infrastructure are needed. Lipid bilayers of sizes 128, 512 and 2048 lipid molecules were used as the test systems representing typical cases encountered in biomolecular simulations. Our results enable an accurate prediction of computational speed on most current computing systems, both for serial and parallel runs. These results should be helpful in, for example, choosing the most suitable configuration for a small departmental computer cluster.
Dynamic Load Balancing for Grid Partitioning on a SP-2 Multiprocessor: A Framework
NASA Technical Reports Server (NTRS)
Sohn, Andrew; Simon, Horst; Lasinski, T. A. (Technical Monitor)
1994-01-01
Computational requirements of full scale computational fluid dynamics change as computation progresses on a parallel machine. The change in computational intensity causes workload imbalance of processors, which in turn requires a large amount of data movement at runtime. If parallel CFD is to be successful on a parallel or massively parallel machine, balancing of the runtime load is indispensable. Here a framework is presented for dynamic load balancing for CFD applications, called Jove. One processor is designated as a decision maker Jove while others are assigned to computational fluid dynamics. Processors running CFD send flags to Jove in a predetermined number of iterations to initiate load balancing. Jove starts working on load balancing while other processors continue working with the current data and load distribution. Jove goes through several steps to decide if the new data should be taken, including preliminary evaluation, partition, processor reassignment, cost evaluation, and decision. Jove running on a single IBM SP2 node has been completely implemented. Preliminary experimental results show that the Jove approach to dynamic load balancing can be effective for full scale grid partitioning on the target machine IBM SP2.
Dynamic Load Balancing For Grid Partitioning on a SP-2 Multiprocessor: A Framework
NASA Technical Reports Server (NTRS)
Sohn, Andrew; Simon, Horst; Lasinski, T. A. (Technical Monitor)
1994-01-01
Computational requirements of full scale computational fluid dynamics change as computation progresses on a parallel machine. The change in computational intensity causes workload imbalance of processors, which in turn requires a large amount of data movement at runtime. If parallel CFD is to be successful on a parallel or massively parallel machine, balancing of the runtime load is indispensable. Here a framework is presented for dynamic load balancing for CFD applications, called Jove. One processor is designated as a decision maker Jove while others are assigned to computational fluid dynamics. Processors running CFD send flags to Jove in a predetermined number of iterations to initiate load balancing. Jove starts working on load balancing while other processors continue working with the current data and load distribution. Jove goes through several steps to decide if the new data should be taken, including preliminary evaluation, partition, processor reassignment, cost evaluation, and decision. Jove running on a single IBM SP2 node has been completely implemented. Preliminary experimental results show that the Jove approach to dynamic load balancing can be effective for full scale grid partitioning on the target machine IBM SP2.
Parallel design of JPEG-LS encoder on graphics processing units
NASA Astrophysics Data System (ADS)
Duan, Hao; Fang, Yong; Huang, Bormin
2012-01-01
With recent technical advances in graphics processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth. Many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on an NVIDIA GPU using the compute unified device architecture (CUDA) programming technology. We use the block parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with 26.3x speedup over its original CPU code.
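The parallel prefix sum named among the CUDA techniques can be illustrated with a sequential emulation of the work-efficient (Blelloch) exclusive scan. This is a sketch in Python rather than CUDA, and it is not taken from the encoder described above. One plausible use in a block-parallel encoder, stated here as an assumption, is turning per-block encoded sizes into the offsets at which each block writes its output. Within each sweep level the inner-loop iterations are independent, which is what a GPU would execute concurrently.

```python
def exclusive_scan(values):
    """Work-efficient (Blelloch) exclusive prefix sum, emulated sequentially."""
    n = len(values)
    if n == 0:
        return []
    # Pad to the next power of two so the tree sweeps are regular.
    size = 1
    while size < n:
        size *= 2
    data = list(values) + [0] * (size - n)

    # Up-sweep (reduce): build partial sums up a balanced binary tree.
    stride = 1
    while stride < size:
        for i in range(2 * stride - 1, size, 2 * stride):
            data[i] += data[i - stride]
        stride *= 2

    # Down-sweep: clear the root, then push prefix sums back down the tree.
    data[size - 1] = 0
    stride = size // 2
    while stride >= 1:
        for i in range(2 * stride - 1, size, 2 * stride):
            left = data[i - stride]
            data[i - stride] = data[i]
            data[i] += left
        stride //= 2
    return data[:n]


if __name__ == "__main__":
    # Hypothetical encoded sizes (in bytes) of eight image blocks.
    block_sizes = [3, 1, 7, 0, 4, 1, 6, 3]
    print(exclusive_scan(block_sizes))  # write offset of each block
```

Each sweep takes a logarithmic number of levels, so with enough threads the scan finishes in O(log n) steps while still doing only O(n) additions in total.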
Characterizing parallel file-access patterns on a large-scale multiprocessor
NASA Technical Reports Server (NTRS)
Purakayastha, A.; Ellis, Carla; Kotz, David; Nieuwejaar, Nils; Best, Michael L.
1995-01-01
High-performance parallel file systems are needed to satisfy tremendous I/O requirements of parallel scientific applications. The design of such high-performance parallel file systems depends on a comprehensive understanding of the expected workload, but so far there have been very few usage studies of multiprocessor file systems. This paper is part of the CHARISMA project, which intends to fill this void by measuring real file-system workloads on various production parallel machines. In particular, we present results from the CM-5 at the National Center for Supercomputing Applications. Our results are unique because we collect information about nearly every individual I/O request from the mix of jobs running on the machine. Analysis of the traces leads to various recommendations for parallel file-system design.
50 GFlops molecular dynamics on the Connection Machine 5
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lomdahl, P.S.; Tamayo, P.; Groenbech-Jensen, N.
1993-12-31
The authors present timings and performance numbers for a new short range three dimensional (3D) molecular dynamics (MD) code, SPaSM, on the Connection Machine-5 (CM-5). They demonstrate that runs with more than 10^8 particles are now possible on massively parallel MIMD computers. To the best of their knowledge this is at least an order of magnitude more particles than what has previously been reported. Typical production runs show sustained performance (including communication) in the range of 47-50 GFlops on a 1024 node CM-5 with vector units (VUs). The speed of the code scales linearly with the number of processors and with the number of particles and shows 95% parallel efficiency in the speedup.
Implementation of the force decomposition machine for molecular dynamics simulations.
Borštnik, Urban; Miller, Benjamin T; Brooks, Bernard R; Janežič, Dušanka
2012-09-01
We present the design and implementation of the force decomposition machine (FDM), a cluster of personal computers (PCs) that is tailored to running molecular dynamics (MD) simulations using the distributed diagonal force decomposition (DDFD) parallelization method. The cluster interconnect architecture is optimized for the communication pattern of the DDFD method. Our implementation of the FDM relies on standard commodity components even for networking. Although the cluster is meant for DDFD MD simulations, it remains general enough for other parallel computations. An analysis of several MD simulation runs on both the FDM and a standard PC cluster demonstrates that the FDM's interconnect architecture provides a greater performance compared to a more general cluster interconnect. Copyright © 2012 Elsevier Inc. All rights reserved.
Depth-varying azimuthal anisotropy in the Tohoku subduction channel
NASA Astrophysics Data System (ADS)
Liu, Xin; Zhao, Dapeng
2017-09-01
We determine a detailed 3-D model of azimuthal anisotropy tomography of the Tohoku subduction zone from the Japan Trench outer-rise to the back-arc near the Japan Sea coast, using a large number of high-quality P and S wave arrival-time data of local earthquakes recorded by the dense seismic network on the Japan Islands. Depth-varying seismic azimuthal anisotropy is revealed in the Tohoku subduction channel. The shallow portion of the Tohoku megathrust zone (<30 km depth) generally exhibits trench-normal fast-velocity directions (FVDs) except for the source area of the 2011 Tohoku-oki earthquake (Mw 9.0) where the FVD is nearly trench-parallel, whereas the deeper portion of the megathrust zone (at depths of ∼30-50 km) mainly exhibits trench-parallel FVDs. Trench-normal FVDs are revealed in the mantle wedge beneath the volcanic front and the back-arc. The Pacific plate mainly exhibits trench-parallel FVDs, except for the top portion of the subducting Pacific slab where visible trench-normal FVDs are revealed. A qualitative tectonic model is proposed to interpret such anisotropic features, suggesting transposition of earlier fabrics in the oceanic lithosphere into subduction-induced new structures in the subduction channel.
Li, Bowei; Jiang, Lei; Xie, Hua; Gao, Yan; Qin, Jianhua; Lin, Bingcheng
2009-09-01
A micropump-actuated negative pressure pinched injection method is developed for parallel electrophoresis on a multi-channel LIF detection system. The system has a home-made device that can individually control 16-port solenoid valves and a high-voltage power supply. The laser beam is distributed across the array of separation channels for excitation and detection. The hybrid glass-PDMS microfluidic chip comprises two common reservoirs, four separation channels coupled to their respective pneumatic micropumps, and two reference channels. Because pressure is used as the driving force, the proposed method has no sample bias effect for separation. Only one high-voltage supply is needed for separation, regardless of the number of channels, which is significant for high-throughput analysis, and the time for sample loading is shortened to 1 s. In addition, the integrated micropumps can provide a versatile interface for coupling with other functional units to satisfy complicated demands. The performance is verified by separation of a DNA marker and Hepatitis B virus DNA samples. This method is also expected to show potential for high-throughput DNA analysis in the field of disease diagnosis.
Grace: A cross-platform micromagnetic simulator on graphics processing units
NASA Astrophysics Data System (ADS)
Zhu, Ru
2015-12-01
A micromagnetic simulator running on graphics processing units (GPUs) is presented. Different from GPU implementations of other research groups, which predominantly run on NVidia's CUDA platform, this simulator is developed with C++ Accelerated Massive Parallelism (C++ AMP) and is hardware platform independent. It runs on GPUs from vendors including NVidia, AMD and Intel, and achieves a significant performance boost compared to previous central processing unit (CPU) simulators, up to two orders of magnitude. The simulator paves the way for running large micromagnetic simulations on both high-end workstations with dedicated graphics cards and low-end personal computers with integrated graphics cards, and is freely available to download.
NASA Astrophysics Data System (ADS)
Moon, Hongsik
What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power in the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared using benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced.
This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.
Multi-LED parallel transmission for long distance underwater VLC system with one SPAD receiver
NASA Astrophysics Data System (ADS)
Wang, Chao; Yu, Hong-Yi; Zhu, Yi-Jun; Wang, Tao; Ji, Ya-Wei
2018-03-01
In this paper, a multiple light emitting diode (LED) chips parallel transmission (Multi-LED-PT) scheme for an underwater visible light communication system with one photon-counting single photon avalanche diode (SPAD) receiver is proposed. As a lamp always consists of multiple LED chips, the data rate can be improved by driving these chips in parallel using the interleaver-division-multiplexing technique. For each chip, on-off-keying modulation is used to reduce the influence of clipping. Then a serial successive interference cancellation detection algorithm based on the ideal Poisson photon-counting channel of the SPAD is proposed. Finally, compared to the SPAD-based direct current-biased optical orthogonal frequency division multiplexing system, the proposed Multi-LED-PT system can improve the error-rate performance and anti-nonlinearity performance significantly under the combined effects of absorption, scattering and weak turbulence-induced channel fading.
Responses to riparian restoration in the Spring Creek watershed, Central Pennsylvania
Carline, R.F.; Walsh, M.C.
2007-01-01
Riparian treatments, consisting of 3- to 4-m buffer strips, stream bank stabilization, and rock-lined stream crossings, were installed in two streams with livestock grazing to reduce sediment loading and stream bank erosion. Cedar Run and Slab Cabin Run, the treatment streams, and Spring Creek, an adjacent reference stream without riparian grazing, were monitored prior to (1991-1992) and 3-5 years after (2001-2003) riparian buffer installation to assess channel morphology, stream substrate composition, suspended sediments, and macroinvertebrate communities. Few changes were found in channel widths and depths, but channel-structuring flow events were rare in the drought period after restoration. Stream bank vegetation increased from 50% or less to 100% in nearly all formerly grazed riparian buffers. The proportion of fine sediments in stream substrates decreased in Cedar Run but not in Slab Cabin Run. After riparian treatments, suspended sediments during base flow and storm flow decreased 47-87% in both streams. Macroinvertebrate diversity did not improve after restoration in either treated stream. Relative to Spring Creek, macroinvertebrate densities increased in both treated streams by the end of the posttreatment sampling period. Despite drought conditions that may have altered physical and biological effects of riparian treatments, goals of the riparian restoration to minimize erosion and sedimentation were met. A relatively narrow grass buffer along 2.4 km of each stream was effective in improving water quality, stream substrates, and some biological metrics. © 2007 Society for Ecological Restoration International.
Communications oriented programming of parallel iterative solutions of sparse linear systems
NASA Technical Reports Server (NTRS)
Patrick, M. L.; Pratt, T. W.
1986-01-01
Parallel algorithms are developed for a class of scientific computational problems by partitioning the problems into smaller problems which may be solved concurrently. The effectiveness of the resulting parallel solutions is determined by the amount and frequency of communication and synchronization and the extent to which communication can be overlapped with computation. Three different parallel algorithms for solving the same class of problems are presented, and their effectiveness is analyzed from this point of view. The algorithms are programmed using a new programming environment. Run-time statistics and experience obtained from the execution of these programs assist in measuring the effectiveness of these algorithms.
NASA Astrophysics Data System (ADS)
Lawry, B. J.; Encarnacao, A.; Hipp, J. R.; Chang, M.; Young, C. J.
2011-12-01
With the rapid growth of multi-core computing hardware, it is now possible for scientific researchers to run complex, computationally intensive software on affordable, in-house commodity hardware. Multi-core CPUs (Central Processing Units) and GPUs (Graphics Processing Units) are now commonplace in desktops and servers. Developers today have access to extremely powerful hardware that enables the execution of software that could previously only be run on expensive, massively-parallel systems. It is no longer cost-prohibitive for an institution to build a parallel computing cluster consisting of commodity multi-core servers. In recent years, our research team has developed a distributed, multi-core computing system and used it to construct global 3D earth models using seismic tomography. Traditionally, computational limitations forced certain assumptions and shortcuts in the calculation of tomographic models; however, with the recent rapid growth in computational hardware including faster CPUs, increased RAM, and the development of multi-core computers, we are now able to perform seismic tomography, 3D ray tracing and seismic event location using distributed parallel algorithms running on commodity hardware, thereby eliminating the need for many of these shortcuts. We describe Node Resource Manager (NRM), a system we developed that leverages the capabilities of a parallel computing cluster. NRM is a software-based parallel computing management framework that works in tandem with the Java Parallel Processing Framework (JPPF, http://www.jppf.org/), a third-party library that provides a flexible and innovative way to take advantage of modern multi-core hardware. NRM enables multiple applications to use and share a common set of networked computers, regardless of their hardware platform or operating system.
Using NRM, algorithms can be parallelized to run on multiple processing cores of a distributed computing cluster of servers and desktops, which results in a dramatic speedup in execution time. NRM is sufficiently generic to support applications in any domain, as long as the application is parallelizable (i.e., can be subdivided into multiple individual processing tasks). At present, NRM has been effective in decreasing the overall runtime of several algorithms: 1) the generation of a global 3D model of the compressional velocity distribution in the Earth using tomographic inversion, 2) the calculation of the model resolution matrix, model covariance matrix, and travel time uncertainty for the aforementioned velocity model, and 3) the correlation of waveforms with archival data on a massive scale for seismic event detection. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
Distributed run of a one-dimensional model in a regional application using SOAP-based web services
NASA Astrophysics Data System (ADS)
Smiatek, Gerhard
This article describes the setup of a distributed computing system in Perl. It facilitates the parallel run of a one-dimensional environmental model on a number of simple network PC hosts. The system uses Simple Object Access Protocol (SOAP) driven web services offering the model run on remote hosts and a multi-thread environment distributing the work and accessing the web services. Its application is demonstrated in a regional run of a process-oriented biogenic emission model for the area of Germany. Within a network consisting of up to seven web services implemented on Linux and MS-Windows hosts, a performance increase of approximately 400% has been reached compared to a model run on the fastest single host.
Small Landslides in Aram-Ares Channel, Mars
NASA Astrophysics Data System (ADS)
Kraal, E. R.; Shoup, J.
2014-12-01
An east-west channel (located at 341°E and 3°N) connects Aram Chaos to Ares Vallis. The valley is approximately 80 km long, 12 km wide, and 1.5 km deep. The channel is filled with a series of slope failures or landslides that form lobate aprons covering the valley floor. Preliminary studies of the north (south-facing) wall of the valley characterized 6 landslides using gridded MOLA topography from JMARS, including area, drop height and run out distance. These relatively small landslides have surface areas ranging from 5.6 to 55 km². Their aprons run out ~10 km, often covering the entire width of the valley floor. Drop height was measured using both maximum and minimum estimates due to resolution limits of the topography and ranged from 1200 to 2200 meters. Using the drop height and run out distance, we determine the coefficient of friction and maximum velocity for two of the landslides using previously established landslide equations based on physical properties. The coefficient of friction for the landslide events ranged from 0.5 to 1.5, which corresponds to a maximum landslide velocity of 87 m/s to 96 m/s. The variations in the coefficients may be due to landslide size, relative size, or possible volatile or ice content. Preliminary geomorphic surface mapping is currently under way to identify the relationship between the aprons and the channel floor, relative age of the landslides, and other characteristics. Initial analysis indicates the channel floor and depositional aprons have experienced deflation and eolian processes and aprons have a variable level of erosion indicating that the landslides did not form during a single event.
A Multi-Channel Approach for Collaborative Web-Based Learning
ERIC Educational Resources Information Center
Azeta, A. A.
2008-01-01
This paper describes an architectural framework and a prototype implementation of a web-based multi-channel e-Learning application that allows students, lecturers and the research communities to collaborate irrespective of the communication device a user is carrying. The application was developed based on the concept of "write once run on any…
Hedrick, Lara B.; Welsh, Stuart A.; Anderson, James T.
2009-01-01
Impacts of highway construction on streams in the central Appalachians are a growing concern as new roads are created to promote tourism and economic development in the area. Alterations to the streambed of a first-order stream, Sauerkraut Run, Hardy County, WV, during construction of a highway overpass included placement and removal of a temporary culvert, straightening and regrading of a section of stream channel, and armourment of a bank with a reinforced gravel berm. We surveyed longitudinal profiles and cross sections in a reference reach and the altered reach of Sauerkraut Run from 2003 through 2007 to measure physical changes in the streambed. During the four-year period, three high-flow events changed the streambed downstream of construction including channel widening and aggradation and then degradation of the streambed. Upstream of construction, at a reinforced gravel berm, bank erosion was documented. The reference section remained relatively unchanged. Knowledge gained by documenting channel changes in response to natural and anthropogenic variables can be useful for managers and engineers involved in highway construction projects.
Scalable and balanced dynamic hybrid data assimilation
NASA Astrophysics Data System (ADS)
Kauranne, Tuomo; Amour, Idrissa; Gunia, Martin; Kallio, Kari; Lepistö, Ahti; Koponen, Sampsa
2017-04-01
Scalability of complex weather forecasting suites is dependent on the technical tools available for implementing highly parallel computational kernels, but to an equally large extent also on the dependence patterns between various components of the suite, such as observation processing, data assimilation and the forecast model. Scalability is a particular challenge for 4D variational assimilation methods that necessarily couple the forecast model into the assimilation process and subject this combination to an inherently serial quasi-Newton minimization process. Ensemble based assimilation methods are naturally more parallel, but large models force ensemble sizes to be small and that results in poor assimilation accuracy, somewhat akin to shooting with a shotgun in a million-dimensional space. The Variational Ensemble Kalman Filter (VEnKF) is an ensemble method that can attain the accuracy of 4D variational data assimilation with a small ensemble size. It achieves this by processing a Gaussian approximation of the current error covariance distribution, instead of a set of ensemble members, analogously to the Extended Kalman Filter (EKF). Ensemble members are re-sampled every time a new set of observations is processed from a new approximation of that Gaussian distribution, which makes VEnKF a dynamic assimilation method. After this, a smoothing step is applied that turns VEnKF into a dynamic Variational Ensemble Kalman Smoother (VEnKS). In this smoothing step, the same process is iterated with frequent re-sampling of the ensemble but now using past iterations as surrogate observations until the end result is a smooth and balanced model trajectory. In principle, VEnKF could suffer from similar scalability issues as 4D-Var.
However, this can be avoided by isolating the forecast model completely from the minimization process by implementing the latter as a wrapper code whose only link to the model is calling for many parallel and totally independent model runs, all of them implemented as parallel model runs themselves. The only bottleneck in the process is the gathering and scattering of initial and final model state snapshots before and after the parallel runs which requires a very efficient and low-latency communication network. However, the volume of data communicated is small and the intervening minimization steps are only 3D-Var, which means their computational load is negligible compared with the fully parallel model runs. We present example results of scalable VEnKF with the 4D lake and shallow sea model COHERENS, assimilating simultaneously continuous in situ measurements in a single point and infrequent satellite images that cover a whole lake, with the fully scalable VEnKF.
NASA Technical Reports Server (NTRS)
2002-01-01
(Released 25 June 2002) The Science Tantalus Fossae is a set of long valleys on the eastern side of Alba Patera. These valleys are referred to as grabens and are formed by extension of the crust and faulting. When large amounts of pressure or tension are applied to rocks on timescales that are fast enough that the rock cannot respond by deforming, the rock breaks along faults. In the case of a graben, two parallel faults are formed by extension of the crust and the rock in between the faults drops downward into the space created by the extension. Numerous sets of grabens are visible in this THEMIS image, trending from north-northeast to south-southwest. Because the faults defining the graben are formed perpendicular to the direction of the applied stress, we know that extensional forces were pulling the crust apart in the west-northwest/east-southeast direction. The large number of grabens around Alba Patera is generally believed to be the result of extensional forces associated with the uplift of Alba Patera. Also visible in this image are a series of linearly aligned pits, called a pit chain. The pits are not the result of impact cratering, but are similar to sinkholes on Earth. Sinkholes are typically formed by the removal of rock (commonly limestone) underground by groundwater -- when enough rock is removed, the overlying rock becomes too heavy to be supported, and it collapses, forming a pit. Unlike sinkholes, however, the pit chains near Alba Patera were likely formed when empty underground lava tubes collapsed, accounting for the presence and alignment of many pits. Numerous channel features are also observed in the image, and follow the local topographic slope, which is downhill to the east-southeast. One of these, a long channel in the center of the image, nicely demonstrates the complex relations possible between geologic features.
The geologist's rule of superposition says that a feature on top of (superposing) another feature, or cutting across another feature is younger than the feature it covers or cuts. In one location, the channel cuts across the somewhat subdued fault defining a graben (near the right side of the image), indicating that the channel was carved after the graben was formed. But in other places (near the center of the image), the channel is clearly cut by a large fault defining one of the grabens, indicating that some faulting was occurring after the channel was carved. These relationships can be observed throughout this image. By mapping out superposition relationships in detail, geologists can establish a complex sequence of events that occurred long ago. The Story The first thing that catches your eye in the image above is a string of round pits that are strewn dramatically on the surface. Although they may look like craters, nothing came hurtling in from the sky to make them. Instead, collapses along a lava tube have created this long dotted line on the Martian surface. The lava tube, a hollow feature beneath the surface, can't always withstand the weight from above, and so collapses in places, forming pits like the ones seen here. Throughout the rest of the image are a series of depressed valleys known as grabens that run roughly from the northeast to the southwest. They formed when the crust of the Martian surface was stretched so fast that it broke along faults. When that happened, the rock in between fell downward into the space created by the extension, creating the long subtle streaks of lowered terrain. They were probably created when Alba Patera, the shield volcano of this area, was elevated or 'uplifted' through tectonic forces. This area of long valleys is named after Tantalus, a king of ancient Lydia who, according to legend, betrayed the gods and was sent to Hades. 
In this subterranean place, he was forced to stand in water up to his chin underneath the branches of fruit trees. Every time he tried to drink, the water would recede, and every time he tried to eat, the boughs would move the fruit just out of reach. You can easily see where the word 'tantalize' comes from. Scientists are intrigued so much by the history of this area that they seek to understand its elusive past. Luckily, their interests are much more in reach than those of poor Tantalus. A number of channels in this image (running downhill from the west-northwest to the east-southeast) help them understand the chain of events that worked to create the compelling features in this region. Take a look at the channels close-up and see if you can tell whether the channels or the grabens happened first. A rule of thumb is that if one feature is on top of another or cuts across it, it is younger than the feature it covers or cuts. One of the channels in the center of the image is great to study. Toward the right side of the image, the channel cuts across a fault, indicating it formed after the graben. Follow the channel westward, however, and you'll see that a large fault cuts the channel, indicating that this graben formed after the channel. That probably means this criss-crossed region went through a seeming eternity of torture itself, as the land kept tearing and stretching, as channels were carved and recarved, as lava tubes formed and then finally collapsed, only to have their walls erode in further streaks as well.
Multiplexed chirp waveform synthesizer
Dudley, Peter A.; Tise, Bert L.
2003-09-02
A synthesizer for generating a desired chirp signal has M parallel channels, where M is an integer greater than 1, each channel including a chirp waveform synthesizer generating at an output a portion of a digital representation of the desired chirp signal; and a multiplexer for multiplexing the M outputs to create a digital representation of the desired chirp signal. Preferably, each channel receives input information that is a function of information representing the desired chirp signal.
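The M-channel arrangement can be illustrated with a short sketch. It assumes a linear chirp and sample-by-sample interleaving, in which channel m produces samples m, m+M, m+2M, and so on; the abstract does not say how the signal is divided among channels, so this partitioning and all function names and parameters are assumptions made for illustration only.

```python
import math

def chirp_sample(n, f0, k, fs):
    """Sample n of a linear chirp with start frequency f0 (Hz) and chirp rate k (Hz/s)."""
    t = n / fs
    return math.cos(2 * math.pi * (f0 * t + 0.5 * k * t * t))

def channel_output(m, M, num_samples, f0, k, fs):
    """Channel m generates every M-th sample, starting at sample m."""
    return [chirp_sample(n, f0, k, fs) for n in range(m, num_samples, M)]

def multiplex(channels):
    """Interleave the channel outputs round-robin to rebuild the full sample stream."""
    out = []
    longest = max(len(c) for c in channels)
    for i in range(longest):
        for c in channels:
            if i < len(c):
                out.append(c[i])
    return out

M, N = 4, 10
params = dict(f0=100.0, k=5000.0, fs=8000.0)
channels = [channel_output(m, M, N, **params) for m in range(M)]
combined = multiplex(channels)
direct = [chirp_sample(n, **params) for n in range(N)]
```

Under this scheme each channel only needs to run at 1/M of the output sample rate, which is the usual motivation for such a parallel layout.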
Implementing Shared Memory Parallelism in MCBEND
NASA Astrophysics Data System (ADS)
Bird, Adam; Long, David; Dobson, Geoff
2017-09-01
MCBEND is a general purpose radiation transport Monte Carlo code from AMEC Foster Wheeler's ANSWERS® Software Service. MCBEND is well established in the UK shielding community for radiation shielding and dosimetry assessments. The existing MCBEND parallel capability effectively involves running the same calculation on many processors. This works very well except when the memory requirements of a model restrict the number of instances of a calculation that will fit on a machine. To more effectively utilise parallel hardware, OpenMP has been used to implement shared memory parallelism in MCBEND. This paper describes the reasoning behind the choice of OpenMP, notes some of the challenges of multi-threading an established code such as MCBEND and assesses the performance of the parallel method implemented in MCBEND.
Limpanuparb, Taweetham; Milthorpe, Josh; Rendell, Alistair P
2014-10-30
Use of the modern parallel programming language X10 for computing long-range Coulomb and exchange interactions is presented. By using X10, a partitioned global address space language with support for task parallelism and the explicit representation of data locality, the resolution of the Ewald operator can be parallelized in a straightforward manner including use of both intranode and internode parallelism. We evaluate four different schemes for dynamic load balancing of integral calculation using X10's work stealing runtime, and report performance results for long-range HF energy calculation of large molecule/high quality basis running on up to 1024 cores of a high performance cluster machine. Copyright © 2014 Wiley Periodicals, Inc.
New NAS Parallel Benchmarks Results
NASA Technical Reports Server (NTRS)
Yarrow, Maurice; Saphir, William; VanderWijngaart, Rob; Woo, Alex; Kutler, Paul (Technical Monitor)
1997-01-01
NPB2 (NAS (NASA Advanced Supercomputing) Parallel Benchmarks 2) is an implementation, based on Fortran and the MPI (message passing interface) message passing standard, of the original NAS Parallel Benchmark specifications. NPB2 programs are run with little or no tuning, in contrast to NPB vendor implementations, which are highly optimized for specific architectures. NPB2 results complement, rather than replace, NPB results. Because they have not been optimized by vendors, NPB2 implementations approximate the performance a typical user can expect for a portable parallel program on distributed memory parallel computers. Together these results provide an insightful comparison of the real-world performance of high-performance computers. New NPB2 features: New implementation (CG), new workstation class problem sizes, new serial sample versions, more performance statistics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, William Michael; Plimpton, Steven James; Wang, Peng
2010-03-01
LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS has potentials for soft materials (biomolecules, polymers) and solid-state materials (metals, semiconductors) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale. LAMMPS runs on single processors or in parallel using message-passing techniques and a spatial-decomposition of the simulation domain. The code is designed to be easy to modify or extend with new functionality.
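The idea of spatial decomposition can be sketched as a mapping from particle position to owning processor. This is an illustrative toy, not LAMMPS code: it assumes a cubic box of side `box_length` with positions in [0, box_length), a regular grid of processors, and row-major rank numbering, and the function name is hypothetical.

```python
def owner_rank(position, box_length, procs_per_dim):
    """Return the rank of the processor whose sub-box contains a 3-D position."""
    # Index of the sub-box along each axis, clamped so the upper edge stays in range.
    ix, iy, iz = (
        min(int(x / box_length * p), p - 1)
        for x, p in zip(position, procs_per_dim)
    )
    _, py, pz = procs_per_dim
    # Row-major numbering of the processor grid.
    return (ix * py + iy) * pz + iz

grid = (2, 2, 2)
ranks = [owner_rank(p, 10.0, grid) for p in [(0.1, 0.1, 0.1), (6.0, 1.0, 1.0), (9.9, 9.9, 9.9)]]
```

Each processor then handles only the particles in its own sub-box and exchanges messages with neighbours for particles near the boundaries.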
Experimental implementation of array-compressed parallel transmission at 7 tesla.
Yan, Xinqiang; Cao, Zhipeng; Grissom, William A
2016-06-01
To implement and validate a hardware-based array-compressed parallel transmission (acpTx) system. In array-compressed parallel transmission, a small number of transmit channels drive a larger number of transmit coils, which are connected via an array compression network that implements optimized coil-to-channel combinations. A two channel-to-eight coil array compression network was developed using power splitters, attenuators and phase shifters, and a simulation was performed to investigate the effects of coil coupling on power dissipation in a simplified network. An eight coil transmit array was constructed using induced current elimination decoupling, and the coil and network were validated in benchtop measurements, B1+ mapping scans, and an accelerated spiral excitation experiment. The developed attenuators came within 0.08 dB of the desired attenuations, and reflection coefficients were -22 dB or better. The simulation demonstrated that up to 3× more power was dissipated in the network when coils were poorly isolated (-9.6 dB), versus well-isolated (-31 dB). Compared to split circularly-polarized coil combinations, the additional degrees of freedom provided by the array compression network led to 54% lower squared excitation error in the spiral experiment. Array-compressed parallel transmission was successfully implemented in a hardware system. Further work is needed to develop remote network tuning and to minimize network power dissipation. Magn Reson Med 75:2545-2552, 2016. © 2016 Wiley Periodicals, Inc.
Cryogenic switched MOSFET characterization
NASA Technical Reports Server (NTRS)
1981-01-01
Both p channel and n channel enhancement mode MOSFETs can be readily switched on and off at temperatures as low as 2.8 K so that switch sampled readout of a VLWIR Ge:Ga focal plane is electronically possible. Noise levels as low as 100 rms electrons per sample (independent of sample rate) can be achieved using existing p channel MOSFETs, at overall rates up to 30,000 samples/second per multiplexed channel (e.g., 32 detectors at a rate of almost 1,000 frames/second). Run of the mill devices, including very low power dissipation n channel FETs would still permit noise levels of the order of 500 electrons/sample.
Deflection Measurements on Propeller 5503 in Ahead and Crashback
2016-10-01
the dots on the blade that were visible for the run. Not all points could be determined for each picture during each run. One issue discovered with...Channel (LCC) in February and April of 2009. The deflection of the blades was measured using defocused particle image velocimetry. Comparisons were made... Blade Deflection Measurement CalTech
Distributed computing feasibility in a non-dedicated homogeneous distributed system
NASA Technical Reports Server (NTRS)
Leutenegger, Scott T.; Sun, Xian-He
1993-01-01
The low cost and availability of clusters of workstations have led researchers to re-explore distributed computing using independent workstations. This approach may provide better cost/performance than tightly coupled multiprocessors. In practice, this approach often utilizes wasted cycles to run parallel jobs. The feasibility of such a non-dedicated parallel processing environment, assuming workstation processes have preemptive priority over parallel tasks, is addressed. An analytical model is developed to predict parallel job response times. Our model provides insight into how significantly workstation owner interference degrades parallel program performance. A new term, task ratio, which relates the parallel task demand to the mean service demand of nonparallel workstation processes, is introduced. It is proposed that task ratio is a useful metric for determining how large the demand of a parallel application must be in order to make efficient use of a non-dedicated distributed system.
Report to the High Order Language Working Group (HOLWG)
1977-01-14
as running, runnable, suspended or dormant, may be synchronized by semaphore variables, may be scheduled using clock and duration data types and mpy...Recursive and non-recursive routines G6. Parallel processes, synchronization, critical regions G7. User defined parameterized exception handling G8...typed and lacks extensibility, parallel processing, synchronization and real-time features. Overall Evaluation IBM strongly recommended PL/I as a
Evaluating SPLASH-2 Applications Using MapReduce
NASA Astrophysics Data System (ADS)
Zhu, Shengkai; Xiao, Zhiwei; Chen, Haibo; Chen, Rong; Zhang, Weihua; Zang, Binyu
MapReduce has been prevalent for running data-parallel applications. By hiding other non-functionality parts such as parallelism, fault tolerance and load balance from programmers, MapReduce significantly simplifies the programming of large clusters. Due to the mentioned features of MapReduce above, researchers have also explored the use of MapReduce on other application domains, such as machine learning, textual retrieval and statistical translation, among others.
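The programming model referred to here can be shown with a single-process sketch of the map, shuffle and reduce phases, using word counting as the example. It is a toy illustration only: it runs serially and omits the parallelism, fault tolerance and load balancing that a real MapReduce runtime hides from the programmer, and all function names are chosen for this example.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the user-supplied mapper to every record and collect the (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs

def shuffle(pairs):
    """Group values by key, as the runtime does between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the user-supplied reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}

def word_mapper(line):
    """Emit (word, 1) for every word in a line of text."""
    return [(word, 1) for word in line.lower().split()]

def count_reducer(word, counts):
    """Sum the partial counts for one word."""
    return sum(counts)

def word_count(lines):
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)

counts = word_count(["a b a", "b c"])
```

Only `word_mapper` and `count_reducer` are application code; the three phase functions stand in for what the framework would provide.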
Automatic Adaptation of Tunable Distributed Applications
2001-01-01
size, weight, and battery life, with a single CPU, less memory, smaller hard disk, and lower bandwidth network connectivity. The power of PDAs is...wireless, and bluetooth [32] facilities; thus achieving different rates of data transmission. 1 With the trend of “write once, run everywhere...applications, a single component can execute on multiple processors (or machines) in parallel. These parallel applications, written in a specialized language
A software platform for continuum modeling of ion channels based on unstructured mesh
NASA Astrophysics Data System (ADS)
Tu, B.; Bai, S. Y.; Chen, M. X.; Xie, Y.; Zhang, L. B.; Lu, B. Z.
2014-01-01
Most traditional continuum molecular modeling adopted finite difference or finite volume methods which were based on a structured mesh (grid). Unstructured meshes were only occasionally used, but an increased number of applications emerge in molecular simulations. To facilitate the continuum modeling of biomolecular systems based on unstructured meshes, we are developing a software platform with tools which are particularly beneficial to those approaches. This work describes the software system specifically for the simulation of a typical, complex molecular procedure: ion transport through a three-dimensional channel system that consists of a protein and a membrane. The platform contains three parts: a meshing tool chain for ion channel systems, a parallel finite element solver for the Poisson-Nernst-Planck equations describing the electrodiffusion process of ion transport, and a visualization program for continuum molecular modeling. The meshing tool chain in the platform, which consists of a set of mesh generation tools, is able to generate high-quality surface and volume meshes for ion channel systems. The parallel finite element solver in our platform is based on the parallel adaptive finite element package PHG which was developed by one of the authors [1]. As a featured component of the platform, a new visualization program, VCMM, has specifically been developed for continuum molecular modeling with an emphasis on providing useful facilities for unstructured mesh-based methods and for their output analysis and visualization. VCMM provides a graphic user interface and consists of three modules: a molecular module, a meshing module and a numerical module. A demonstration of the platform is provided with a study of two real proteins, the connexin 26 and hemolysin ion channels.
Rand, A.C. Jr.
1961-05-01
An unloading device for individual vertical fuel channels in a nuclear reactor is shown. The channels are arranged in parallel rows and underneath each is a separate supporting block on which the fuel in the channel rests. The blocks are mounted in contiguous rows on an array of parallel pairs of tracks over the bottom of the reactor. Oblong hollows in the blocks form a continuous passageway through the middle of the row of blocks on each pair of tracks. At the end of each passageway is a horizontal grappling rod with a T- or L-extension at the end next to the reactor of a length to permit it to pass through the oblong passageway in one position, but when rotated ninety degrees the head will strike one of the longer sides of the oblong hollow of one of the blocks. The grappling rod is actuated by a controllable reciprocating and rotating device which extends it beyond any individual block desired, rotates it and retracts it far enough to permit the fuel in the vertical channel above the block to fall into a handling tank below the reactor.
Tonomura, Wataru; Moriguchi, Hiroyuki; Jimbo, Yasuhiko; Konishi, Satoshi
2010-08-01
This paper describes an advanced Micro Channel Array (MCA) for recording electrophysiological signals of neuronal networks at multiple points simultaneously. The developed MCA is designed for neuronal network analysis which has been studied by the co-authors using the Micro Electrode Arrays (MEA) system, and employs the principles of extracellular recordings. A prerequisite for extracellular recordings with good signal-to-noise ratio is a tight contact between cells and electrodes. The MCA described herein has the following advantages. The electrodes integrated around individual micro channels are electrically isolated to enable parallel multipoint recording. Reliable clamping of a targeted cell through micro channels is expected to improve the cellular selectivity and the attachment between the cell and the electrode toward steady electrophysiological recordings. We cultured hippocampal neurons on the developed MCA. As a result, the spontaneous and evoked spike potentials could be recorded by sucking and clamping the cells at multiple points. In this paper, we describe the design and fabrication of the MCA and the successful electrophysiological recordings leading to the development of an effective cellular network analysis device.
Turbo Trellis Coded Modulation With Iterative Decoding for Mobile Satellite Communications
NASA Technical Reports Server (NTRS)
Divsalar, D.; Pollara, F.
1997-01-01
In this paper, analytical bounds on the performance of parallel concatenation of two codes, known as turbo codes, and serial concatenation of two codes over fading channels are obtained. Based on this analysis, design criteria for the selection of component trellis codes for MPSK modulation, and a suitable bit-by-bit iterative decoding structure are proposed. Examples are given for throughput of 2 bits/sec/Hz with 8PSK modulation. The parallel concatenation example uses two rate 4/5 8-state convolutional codes with two interleavers. The convolutional codes' outputs are then mapped to two 8PSK modulations. The serial concatenated code example uses an 8-state outer code with rate 4/5 and a 4-state inner trellis code with 5 inputs and 2 x 8PSK outputs per trellis branch. Based on the above mentioned design criteria for fading channels, a method to obtain the structure of the trellis code with maximum diversity is proposed. Simulation results are given for AWGN and an independent Rayleigh fading channel with perfect Channel State Information (CSI).
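The stated throughput can be checked for the serial example with a short piece of arithmetic. This is a worked illustration rather than part of the abstract, and it assumes ideal signalling at one 8PSK symbol per second per hertz; the symbol $\eta$ is introduced here for the throughput.

```latex
% Serial concatenation: 4 information bits enter the rate-4/5 outer code,
% giving 5 coded bits; the inner trellis code takes these 5 bits per branch
% and emits 2 8PSK symbols.
\[
  \eta \;=\; \frac{4\ \text{information bits}}{2\ \text{8PSK symbols}}
       \;=\; 2\ \text{bits/symbol}
       \;\approx\; 2\ \text{bits/sec/Hz}.
\]
% For comparison, the two 8PSK symbols carry 2 \times 3 = 6 coded bits,
% so the overall code rate is 4/6 = 2/3.
```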
Multiple channel data acquisition system
Crawley, H. Bert; Rosenberg, Eli I.; Meyer, W. Thomas; Gorbics, Mark S.; Thomas, William D.; McKay, Roy L.; Homer, Jr., John F.
1990-05-22
A multiple channel data acquisition system for the transfer of large amounts of data from a multiplicity of data channels has a plurality of modules which operate in parallel to convert analog signals to digital data and transfer that data to a communications host via a FASTBUS. Each module has a plurality of submodules which include a front end buffer (FEB) connected to input circuitry having an analog to digital converter with cache memory for each of a plurality of channels. The submodules are interfaced with the FASTBUS via a FASTBUS coupler which controls a module bus and a module memory. The system is triggered to effect rapid parallel data samplings which are stored to the cache memories. The cache memories are uploaded to the FEBs during which zero suppression occurs. The data in the FEBs is reformatted and compressed by a local processor during transfer to the module memory. The FASTBUS coupler is used by the communications host to upload the compressed and formatted data from the module memory. The local processor executes programs which are downloaded to the module memory through the FASTBUS coupler.
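The zero-suppression step can be illustrated with a small sketch. It assumes that suppression means dropping samples at or below a pedestal plus threshold and keeping (channel index, value) pairs for the rest; the patent abstract does not give the criterion or data format, so the function names, parameters and pair layout here are hypothetical.

```python
def zero_suppress(samples, pedestal=0, threshold=0):
    """Keep only (channel, value) pairs whose value exceeds pedestal + threshold."""
    return [(ch, v) for ch, v in enumerate(samples) if v - pedestal > threshold]

def restore(pairs, n_channels, fill=0):
    """Rebuild a full-length channel list, using `fill` for the suppressed channels."""
    out = [fill] * n_channels
    for ch, v in pairs:
        out[ch] = v
    return out

raw = [0, 5, 0, 0, 12, 1]
kept = zero_suppress(raw, threshold=2)
rebuilt = restore(kept, len(raw))
```

When most channels carry no signal, storing only the surviving pairs reduces the volume of data that later stages have to move, at the cost of losing the suppressed values.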
Hybrid Structure Multichannel All-Fiber Current Sensor.
Jiang, Junzhen; Zhang, Hao; He, Youwu; Qiu, Yishen
2017-08-02
We have experimentally developed a hybrid-structure multi-channel all-fiber current sensor with ordinary silica fiber using fiber loop architecture. According to the rationale of time division multiplexing, the sensor combines parallel and serial structures. The purpose of the hybrid-structure multi-channel all-fiber current sensor is to get more information from the different measured points simultaneously. In addition, the hybrid-structure fiber current sensor exhibited a good linear response for each channel. A three-channel experiment was performed in the study and showed that the system could detect currents at different positions. Each channel could individually detect the current and needed a separate calibration system. Furthermore, the three channels did not affect each other.
Annular fuel and air co-flow premixer
Stevenson, Christian Xavier; Melton, Patrick Benedict; York, William David
2013-10-15
Disclosed is a premixer for a combustor including an annular outer shell and an annular inner shell. The inner shell defines an inner flow channel inside of the inner shell and is located to define an outer flow channel between the outer shell and the inner shell. A fuel discharge annulus is located between the outer flow channel and the inner flow channel and is configured to inject a fuel flow into a mixing area in a direction substantially parallel to an outer airflow through the outer flow channel and an inner flow through the inner flow channel. Further disclosed are a combustor including a plurality of premixers and a method of premixing air and fuel in a combustor.
Simulation framework for intelligent transportation systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ewing, T.; Doss, E.; Hanebutte, U.
1996-10-01
A simulation framework has been developed for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System (ITS). The simulator is designed for running on parallel computers and distributed (networked) computer systems, but can run on standalone workstations for smaller simulations. The simulator currently models instrumented smart vehicles with in-vehicle navigation units capable of optimal route planning and Traffic Management Centers (TMC). The TMC has probe vehicle tracking capabilities (display position and attributes of instrumented vehicles), and can provide two-way interaction with traffic to provide advisories and link times. Both the in-vehicle navigation module and the TMC feature detailed graphical user interfaces to support human-factors studies. Realistic modeling of variations of the posted driving speed is based on human factors studies that take into consideration weather, road conditions, driver personality and behavior, and vehicle type. The prototype has been developed on a distributed system of networked UNIX computers but is designed to run on parallel computers, such as ANL's IBM SP-2, for large-scale problems. A novel feature of the approach is that vehicles are represented by autonomous computer processes which exchange messages with other processes. The vehicles have a behavior model which governs route selection and driving behavior, and can react to external traffic events much like real vehicles. With this approach, the simulation is scalable to take advantage of emerging massively parallel processor (MPP) systems.
AMITIS: A 3D GPU-Based Hybrid-PIC Model for Space and Plasma Physics
NASA Astrophysics Data System (ADS)
Fatemi, Shahab; Poppe, Andrew R.; Delory, Gregory T.; Farrell, William M.
2017-05-01
We have developed, for the first time, an advanced modeling infrastructure in space simulations (AMITIS) with an embedded three-dimensional self-consistent grid-based hybrid model of plasma (kinetic ions and fluid electrons) that runs entirely on graphics processing units (GPUs). The model uses NVIDIA GPUs and their associated parallel computing platform, CUDA, developed for general purpose processing on GPUs. The model uses a single CPU-GPU pair, where the CPU transfers data between the system and GPU memory, executes CUDA kernels, and writes simulation outputs on the disk. All computations, including moving particles, calculating macroscopic properties of particles on a grid, and solving hybrid model equations are processed on a single GPU. We explain various computing kernels within AMITIS and compare their performance with an already existing well-tested hybrid model of plasma that runs in parallel using multi-CPU platforms. We show that AMITIS runs ∼10 times faster than the parallel CPU-based hybrid model. We also introduce an implicit solver for computation of Faraday’s Equation, resulting in an explicit-implicit scheme for the hybrid model equation. We show that the proposed scheme is stable and accurate. We examine the AMITIS energy conservation and show that the energy is conserved with an error < 0.2% after 500,000 timesteps, even when a very low number of particles per cell is used.
GELATIO: a general framework for modular digital analysis of high-purity Ge detector signals
NASA Astrophysics Data System (ADS)
Agostini, M.; Pandola, L.; Zavarise, P.; Volynets, O.
2011-08-01
GELATIO is a new software framework for advanced data analysis and digital signal processing developed for the GERDA neutrinoless double beta decay experiment. The framework is tailored to handle the full analysis flow of signals recorded by high purity Ge detectors and photo-multipliers from the veto counters. It is designed to support a multi-channel modular and flexible analysis, widely customizable by the user either via human-readable initialization files or via a graphical interface. The framework organizes the data into a multi-level structure, from the raw data up to the condensed analysis parameters, and includes tools and utilities to handle the data stream between the different levels. GELATIO is implemented in C++. It relies upon ROOT and its extension TAM, which provides compatibility with PROOF, enabling the software to run in parallel on clusters of computers or many-core machines. It was tested on different platforms and benchmarked in several GERDA-related applications. A stable version is presently available for the GERDA Collaboration and it is used to provide the reference analysis of the experiment data.
Genetic Parallel Programming: design and implementation.
Cheang, Sin Man; Leung, Kwong Sak; Lee, Kin Hong
2006-01-01
This paper presents a novel Genetic Parallel Programming (GPP) paradigm for evolving parallel programs running on a Multi-Arithmetic-Logic-Unit (Multi-ALU) Processor (MAP). The MAP is a Multiple Instruction-streams, Multiple Data-streams (MIMD), general-purpose register machine that can be implemented on modern Very Large-Scale Integrated Circuits (VLSIs) in order to evaluate genetic programs at high speed. For human programmers, writing parallel programs is more difficult than writing sequential programs. However, experimental results show that GPP evolves parallel programs with less computational effort than that of their sequential counterparts. It creates a new approach to evolving a feasible problem solution in parallel program form and then serializes it into a sequential program if required. The effectiveness and efficiency of GPP are investigated using a suite of 14 well-studied benchmark problems. Experimental results show that GPP speeds up evolution substantially.
Parallel-In-Time For Moving Meshes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Falgout, R. D.; Manteuffel, T. A.; Southworth, B.
2016-02-04
With steadily growing computational resources available, scientists must develop effective ways to utilize the increased resources. High performance, highly parallel software has become a standard. However, until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial differential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing sequential codes with only minor modifications. In this work, a rezoning-type moving mesh is applied to a diffusion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.
Research in Parallel Algorithms and Software for Computational Aerosciences
NASA Technical Reports Server (NTRS)
Domel, Neal D.
1996-01-01
Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Guthoff, Rudolf F; Wienss, Holger; Hahnel, Christian; Wree, Andreas
2005-07-01
Evaluation of a new method to visualize distribution and morphology of human corneal nerves (Aδ- and C-fibers) by means of fluorescence staining, confocal laser scanning microscopy, and 3-dimensional (3D) reconstruction. Trephinates of corneas with a diagnosis of Fuchs corneal dystrophy were sliced into layers of 200 µm thickness using a Draeger microkeratome (Storz, Germany). The anterior lamella was stained with the Life/Dead-Kit (Molecular Probes Inc.), examined by the confocal laser scanning microscope "Odyssey XL," step size between 0.5 and 1 µm, and optical sections were digitally 3D-reconstructed. Immediate staining of explanted corneas by the Life/Dead-Kit gave a complete picture of the nerves in the central human cornea. Thin nerves running parallel to the Bowman layer in the subepithelial plexus perforate the Bowman layer orthogonally through tube-like structures. Passing the Bowman layer, Aδ- and C-fibers can be clearly distinguished by fiber diameter, and, while running in the basal epithelial plexus, by their spatial arrangement. Aδ-fibers run straight and parallel to the Bowman layer underneath the basal cell layer. C-fibers, after a short run parallel to the Bowman layer, send off multiple branches penetrating epithelial cell layers orthogonally, ending blindly in invaginations of the superficial cells. In contrast to C-fibers, Aδ-fibers show characteristic bulbous formations when kinking into the basal epithelial plexus. Ex-vivo fluorescence staining of the cornea and 3D reconstructions of confocal scans provide a fast and easily reproducible tool to visualize nerves of the anterior living cornea at high resolution. This may help to clarify gross variations of nerve fiber patterns under various clinical and experimental conditions.
NASA Astrophysics Data System (ADS)
Cowley, Adam; Maynes, Daniel; Crockett, Julie; Iverson, Brian
2017-11-01
This work experimentally investigates the effects of heating on laminar flow in high aspect ratio superhydrophobic (SH) microchannels. When water that is saturated with dissolved air is used, the unwetted cavities of the SH surfaces act as nucleation sites and air effervesces out of solution onto the surfaces. The microchannels consist of a rib/cavity structured SH surface, which is heated, and a glass surface that is utilized for flow visualization. Two channel heights of nominally 183 and 366 μm are considered. The friction factor-Reynolds product (fRe) is obtained via pressure drop and volumetric flow rate measurements and the temperature profile along the channel is obtained via thermocouples embedded in an aluminum block below the SH surface. Five surface types/configurations are investigated: smooth hydrophilic, smooth hydrophobic, SH with ribs perpendicular to the flow, SH with ribs parallel to the flow, and SH with both ribs parallel to the flow and sparse ribs perpendicular to the flow. Depending on the surface type/configuration, large bubbles can form and adversely affect fRe and lead to higher temperatures along the channel. Once bubbles grow large enough, they are expelled from the channel. The channel size greatly affects the residence time of the bubbles and consequently fRe and the channel temperature. This research was supported by the National Science Foundation (NSF) (Grant No. CBET-1235881) and the Utah NASA Space Grant Consortium (NASA Grant NNX15A124H).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hasenkamp, Daren; Sim, Alexander; Wehner, Michael
Extensive computing power has been used to tackle issues such as climate change, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently run on only a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, while we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of the cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI-based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure: as long as a single VM is running, it can make progress, whereas as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.
Neotectonic Activity from the Upper Reaches of the Arabian Gulf and Possibilities of New Oil Fields
NASA Astrophysics Data System (ADS)
Sissakian, V. K.; Abdul Ahad, A. D.; Al-Ansari, N.; Knutsson, S.
2018-03-01
The upper reaches of the Arabian Gulf consist of different types of fine sediments, including the vast Mesopotamia Plain sediments, tidal flat sediments and estuarine sabkha sediments. The elevation of the plain starts at zero meters and increases northwards to three meters with an extremely gentle gradient. The vast plain to the north of the Arabian Gulf is drained by Shat Al-Arab (Shat means river in Iraqi slang) and Khor Al-Zubair (Khor means estuary). The former drains the extreme eastern part of the plain, whereas the latter drains the western part. Shat Al-Arab results from the confluence of the Tigris and Euphrates rivers near Al-Qurna town, about 160 km north of the Arabian Gulf mouth at Al-Fao town, whereas the length of Khor Al-Zubair is about 50 km, as measured from Um Qasir Harbor. The drainage system around Khor Al-Zubair is extremely fine dendritic, whereas around Shat Al-Arab it is almost parallel, running from both sides of the river towards the river, almost perpendicularly. The fine dendritic drainage around Khor Al-Zubair shows clear recent erosional activity; together with water divides, abandoned irrigation channels, and dislocated irrigation channels and estuarine distributaries, these are good indications of Neotectonic activity in the region. These may indicate the presence of subsurface anticlines, which may represent oil fields, since tens of subsurface anticlines that are oil fields occur in the near surroundings.
Simulation of sparse matrix array designs
NASA Astrophysics Data System (ADS)
Boehm, Rainer; Heckel, Thomas
2018-04-01
Matrix phased array probes are becoming more prominently used in industrial applications. The main drawbacks of using probes incorporating a very large number of transducer elements are the need for appropriate cabling and for an ultrasonic device offering many parallel channels. Matrix arrays designed for extended functionality feature 64 or more elements. Typical arrangements are square matrices, e.g., 8 by 8 or 11 by 11, or rectangular matrices, e.g., 8 by 16 or 10 by 12, to fit a 128-channel phased array system. In some phased array systems, the number of simultaneously active elements is limited to a certain number, e.g., 32 or 64. Those setups do not allow running the probe with all elements active, which may cause a significant change in the directivity pattern of the resulting sound beam. When only a subset of elements can be used during a single acquisition, different strategies may be applied to collect enough data for rebuilding the missing information from the echo signal. Omission of certain elements may be one approach; overlay of subsequent shots with different active areas may be another. This paper presents the influence of a decreased number of active elements, and of their distribution on the array, on the sound field. Solutions using subsets with different element activity patterns on matrix arrays and their advantages and disadvantages concerning the sound field are evaluated using semi-analytical simulation tools. Sound field criteria are discussed, which are significant for non-destructive testing results and for the system setup.
The muon pretrigger system of the HERA-B experiment
NASA Astrophysics Data System (ADS)
Bocker, M.; Adams, M.; Bechtle, P.; Buchholz, P.; Cruse, C.; Husemann, U.; Klaus, E.; Koch, N.; Kolander, M.; Kolotaev, I.; Riege, H.; Schutt, J.; Schwenninger, B.; van Staa, R.; Wegener, D.
2001-08-01
One of the main goals of the HERA-B experiment at DESY in Hamburg, Germany, is to study the properties of B-mesons with the emphasis on CP violation. B-mesons are produced in hadronic interactions of a 920-GeV proton beam with an internal wire target. An effective bunch crossing rate of about 8.5 MHz leads to about 200 charged tracks per event. Therefore, a highly selective and efficient trigger system providing high suppression of background events is required. The HERA-B trigger system consists of four levels. A rate reduction factor of 200 is aimed at by the first-level trigger (FLT). The muon pretrigger system, as a part of the FLT, is a modular system consisting of about 100 large-size VME modules of three different types: the pretrigger link board (PLB), the pretrigger coincidence unit (PCU), and the pretrigger message generator (PMG). The data rate processed by the pretrigger system is about 19.5 GByte/s. The PLBs process digitized hit information in eight independent electronic channels in parallel. Every electronic channel handles 32 bits of hit information received from the front-end driver buffer system. Optical links operating at 800 Mb/s transmit the data after serialization to PCUs, which calculate coincidences using complex programmable logic devices. The PMGs transform this coincidence information into messages for the FLT processors. The concept and design as well as results of the muon pretrigger running at HERA-B are presented.
A Semi-flexible 64-channel Receive-only Phased Array for Pediatric Body MRI at 3T
Zhang, Tao; Grafendorfer, Thomas; Cheng, Joseph Y.; Ning, Peigang; Rainey, Bob; Giancola, Mark; Ortman, Sarah; Robb, Fraser J.; Calderon, Paul D.; Hargreaves, Brian A.; Lustig, Michael; Scott, Greig C.; Pauly, John M.; Vasanawala, Shreyas S.
2015-01-01
Purpose: To design, construct, and validate a semi-flexible 64-channel receive-only phased array for pediatric body MRI at 3T. Methods: A 64-channel receive-only phased array was developed and constructed. The designed flexible coil can easily conform to different patient sizes with non-overlapping coil elements in the transverse plane. It can cover a field of view of up to 44 × 28 cm² and removes the need for coil repositioning for body MRI patients with multiple clinical concerns. The 64-channel coil was compared with a 32-channel standard coil for signal-to-noise ratio (SNR) and parallel imaging performances on different phantoms. With IRB approval and informed consent/assent, the designed coil was validated on 21 consecutive pediatric patients. Results: The pediatric coil provided higher SNR than the standard coil on different phantoms, with the averaged SNR gain at least 23% over a depth of 7 cm along the cross-section of phantoms. It also achieved better parallel imaging performance under moderate acceleration factors. Good image quality (average score 4.6 out of 5) was achieved using the developed pediatric coil in the clinical studies. Conclusion: A 64-channel semi-flexible receive-only phased array has been developed and validated to facilitate high quality pediatric body MRI at 3T. PMID:26418283
Federal Register 2010, 2011, 2012, 2013, 2014
2011-09-30
... special local regulations for the swim portions of ``Beach 2 Battleship Full and Half Iron Distance... is intended to restrict vessel traffic on Banks, Motts, and Wrightsville Channels during the swimming... engage in a three-part race, including run, bike, and swim portions. During the swim portion of the event...
Discrimination of portraits using a hybrid parallel joint transform correlator system
NASA Astrophysics Data System (ADS)
Inaba, Rieko; Hashimoto, Asako; Kodate, Kashiko
1999-05-01
A hybrid parallel joint transform correlation system is demonstrated through the introduction of a five-channel binary zone plate array and is applied to the discrimination of portraits for a presumed criminal investigation. In order to improve performance, we adopt pre-processing of images with a white area of 20%. Furthermore, we discuss the robustness.
Variable Swing Optimal Parallel Links - Minimal Power, Maximal Density for Parallel Links
2009-01-01
implemented; it allows controlling the transmitter current by a simple design of a differential pair with a 100 ohms termination resistor. Figure 3.4...optimization. Zuber, P., et al. 2005. 0-7695-2288-2. 21. A 36Gb/s ACCI Multi-Channel Bus using a Fully Differential Pulse Receiver. Wilson, Lei Luo
Competitive Parallel Processing For Compression Of Data
NASA Technical Reports Server (NTRS)
Diner, Daniel B.; Fender, Antony R. H.
1990-01-01
Momentarily-best compression algorithm selected. Proposed competitive-parallel-processing system compresses data for transmission in channel of limited band-width. Likely application for compression lies in high-resolution, stereoscopic color-television broadcasting. Data from information-rich source like color-television camera compressed by several processors, each operating with different algorithm. Referee processor selects momentarily-best compressed output.
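The competitive scheme described above (several processors compress the same data with different algorithms, and a referee keeps the best result) can be sketched in a few lines. This is only an illustration under stated assumptions: Python's standard `zlib`, `bz2`, and `lzma` codecs stand in for the unnamed algorithms of the original proposal, threads stand in for separate processors, and "best" is assumed to mean the shortest output. The function names are hypothetical.

```python
import bz2
import lzma
import zlib
from concurrent.futures import ThreadPoolExecutor

# Stand-in "processors": each one compresses the same block with a different algorithm.
COMPRESSORS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}
DECOMPRESSORS = {"zlib": zlib.decompress, "bz2": bz2.decompress, "lzma": lzma.decompress}


def compress_competitively(block):
    """Compress one block with every algorithm and return (name, payload) of the winner."""
    with ThreadPoolExecutor(max_workers=len(COMPRESSORS)) as pool:
        futures = {name: pool.submit(fn, block) for name, fn in COMPRESSORS.items()}
        results = {name: fut.result() for name, fut in futures.items()}
    # The "referee": select the momentarily best (here: shortest) compressed output.
    winner = min(results, key=lambda name: len(results[name]))
    return winner, results[winner]


def decompress(name, payload):
    """Undo the compression; the receiver must be told which algorithm won."""
    return DECOMPRESSORS[name](payload)
```

Note that the winner's name has to travel with each block so the receiver knows which decompressor to apply; that small per-block overhead is the price of switching algorithms on the fly.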
SequenceL: Automated Parallel Algorithms Derived from CSP-NT Computational Laws
NASA Technical Reports Server (NTRS)
Cooke, Daniel; Rushton, Nelson
2013-01-01
With the introduction of new parallel architectures like the cell and multicore chips from IBM, Intel, AMD, and ARM, as well as the petascale processing available for high-end computing, a larger number of programmers will need to write parallel codes. Adding the parallel control structure to the sequence, selection, and iterative control constructs increases the complexity of code development, which often results in increased development costs and decreased reliability. SequenceL is a high-level programming language, that is, a programming language that is closer to a human's way of thinking than to a machine's. Historically, high-level languages have resulted in decreased development costs and increased reliability, at the expense of performance. In recent applications at JSC and in industry, SequenceL has demonstrated the usual advantages of high-level programming in terms of low cost and high reliability. SequenceL programs, however, have run at speeds typically comparable with, and in many cases faster than, their counterparts written in C and C++ when run on single-core processors. Moreover, SequenceL is able to generate parallel executables automatically for multicore hardware, gaining parallel speedups without any extra effort from the programmer beyond what is required to write the sequential/single-core code. A SequenceL-to-C++ translator has been developed that automatically renders readable multithreaded C++ from a combination of a SequenceL program and sample data input. The SequenceL language is based on two fundamental computational laws, Consume-Simplify-Produce (CSP) and Normalize-Transpose (NT), which enable it to automate the creation of parallel algorithms from high-level code that has no annotations of parallelism whatsoever.
In our anecdotal experience, SequenceL development has been in every case less costly than development of the same algorithm in sequential (that is, single-core, single process) C or C++, and an order of magnitude less costly than development of comparable parallel code. Moreover, SequenceL not only automatically parallelizes the code, but since it is based on CSP-NT, it is provably race free, thus eliminating the largest quality challenge the parallelized software developer faces.
Coaxial microreactor for particle synthesis
Bartsch, Michael; Kanouff, Michael P; Ferko, Scott M; Crocker, Robert W; Wally, Karl
2013-10-22
A coaxial fluid flow microreactor system disposed on a microfluidic chip utilizing laminar flow for synthesizing particles from solution. Flow geometries produced by the mixing system make use of hydrodynamic focusing to confine a core flow to a small axially-symmetric, centrally positioned and spatially well-defined portion of a flow channel cross-section to provide highly uniform diffusional mixing between a reactant core and sheath flow streams. The microreactor is fabricated in such a way that a substantially planar two-dimensional arrangement of microfluidic channels will produce a three-dimensional core/sheath flow geometry. The microreactor system can comprise one or more coaxial mixing stages that can be arranged singly, in series, in parallel or nested concentrically in parallel.
NASA Astrophysics Data System (ADS)
Morita, Yukinori; Mori, Takahiro; Migita, Shinji; Mizubayashi, Wataru; Tanabe, Akihito; Fukuda, Koichi; Matsukawa, Takashi; Endo, Kazuhiko; O'uchi, Shin-ichi; Liu, Yongxun; Masahara, Meishoku; Ota, Hiroyuki
2014-12-01
The performance of parallel electric field tunnel field-effect transistors (TFETs), in which band-to-band tunneling (BTBT) was initiated in-line to the gate electric field was evaluated. The TFET was fabricated by inserting an epitaxially-grown parallel-plate tunnel capacitor between heavily doped source wells and gate insulators. Analysis using a distributed-element circuit model indicated there should be a limit of the drain current caused by the self-voltage-drop effect in the ultrathin channel layer.
The Automated Instrumentation and Monitoring System (AIMS): Design and Architecture. 3.2
NASA Technical Reports Server (NTRS)
Yan, Jerry C.; Schmidt, Melisa; Schulbach, Cathy; Bailey, David (Technical Monitor)
1997-01-01
Whether a researcher is designing the 'next parallel programming paradigm', another 'scalable multiprocessor' or investigating resource allocation algorithms for multiprocessors, a facility that enables parallel program execution to be captured and displayed is invaluable. Careful analysis of such information can help computer and software architects to capture, and therefore, exploit behavioral variations among/within various parallel programs to take advantage of specific hardware characteristics. A software tool-set that facilitates performance evaluation of parallel applications on multiprocessors has been put together at NASA Ames Research Center under the sponsorship of NASA's High Performance Computing and Communications Program over the past five years. The Automated Instrumentation and Monitoring System (AIMS) has three major software components: a source code instrumentor which automatically inserts active event recorders into program source code before compilation; a run-time performance monitoring library which collects performance data; and a visualization tool-set which reconstructs program execution based on the data collected. Besides being used as a prototype for developing new techniques for instrumenting, monitoring and presenting parallel program execution, AIMS is also being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Currently, the execution of FORTRAN and C programs on the Intel Paragon and PALM workstations can be automatically instrumented and monitored. Performance data thus collected can be displayed graphically on various workstations. The process of performance tuning with AIMS will be illustrated using various NAS Parallel Benchmarks. This report includes a description of the internal architecture of AIMS and a listing of the source code.
NASA Astrophysics Data System (ADS)
Yan, Xing-Yu; Gong, Li-Hua; Chen, Hua-Ying; Zhou, Nan-Run
2018-05-01
A theoretical quantum key distribution scheme based on a random hybrid quantum channel with EPR pairs and GHZ states is devised. In this scheme, EPR pairs and tripartite GHZ states are exploited to set up the random hybrid quantum channel. Only one photon in each entangled state needs to travel back and forth in the channel. The security of the quantum key distribution scheme is guaranteed by more than one round of eavesdropping check procedures. It is of high capacity since one particle could carry more than two bits of information via quantum dense coding.
Language Classification using N-grams Accelerated by FPGA-based Bloom Filters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jacob, A; Gokhale, M
N-Gram (n-character sequences in text documents) counting is a well-established technique used in classifying the language of text in a document. In this paper, n-gram processing is accelerated through the use of reconfigurable hardware on the XtremeData XD1000 system. Our design employs parallelism at multiple levels, with parallel Bloom Filters accessing on-chip RAM, parallel language classifiers, and parallel document processing. In contrast to another hardware implementation (HAIL algorithm) that uses off-chip SRAM for lookup, our highly scalable implementation uses only on-chip memory blocks. Our implementation of end-to-end language classification runs at 85x comparable software and 1.45x the competing hardware design.
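The core idea, testing each n-gram of a document for membership in one Bloom filter per language and picking the language with the most hits, can be sketched in software. This is an illustrative assumption-laden sketch, not the FPGA design: the filter size, the number of hash functions, the use of SHA-256 as hash source, the choice of n = 4, and the tiny training samples are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with several hash probes per item (false positives possible)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits          # assumed to be a multiple of 8
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item):
        # Derive independent probe positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def ngrams(text, n=4):
    """All overlapping n-character sequences of the lower-cased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(samples, n=4):
    """Build one Bloom filter per language from a training sample."""
    filters = {}
    for lang, sample in samples.items():
        bloom = BloomFilter()
        for gram in ngrams(sample, n):
            bloom.add(gram)
        filters[lang] = bloom
    return filters


def classify(text, filters, n=4):
    """Return the language whose filter matches the most n-grams of the text."""
    grams = ngrams(text, n)
    scores = {lang: sum(gram in bloom for gram in grams) for lang, bloom in filters.items()}
    return max(scores, key=scores.get)
```

Each membership test is independent of the others, which is what makes the approach attractive for hardware: many filters and many n-grams can be probed at the same time.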
A Parallel Saturation Algorithm on Shared Memory Architectures
NASA Technical Reports Server (NTRS)
Ezekiel, Jonathan; Siminiceanu
2007-01-01
Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-09-20
... end of the headrace where it runs diagonally across the main channel of the river approximately 4,970... not used under normal run-of-river operation. The normal water surface elevation of the project...-3 are vertical-shaft, fixed-blade, Kaplan turbines; unit 4 is a vertical-shaft, manually adjustable...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-06
... it runs diagonally across the main channel of the river approximately 4,970 feet to the west shore of... normal run-of-river operation. The normal water surface elevation of the project impoundment is 276.5... appurtenant equipment. The hydraulic equipment for units 1-3 are vertical-shaft, fixed-blade, Kaplan turbines...
Sierra Structural Dynamics User's Notes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reese, Garth M.
2015-10-19
Sierra/SD provides a massively parallel implementation of structural dynamics finite element analysis, required for high fidelity, validated models used in modal, vibration, static and shock analysis of weapons systems. This document provides a user's guide to the input for Sierra/SD. Details of input specifications for the different solution types, output options, element types and parameters are included. The appendices contain detailed examples, and instructions for running the software on parallel platforms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Munday, Lynn Brendon; Day, David M.; Bunting, Gregory
Sierra/SD provides a massively parallel implementation of structural dynamics finite element analysis, required for high fidelity, validated models used in modal, vibration, static and shock analysis of weapons systems. This document provides a user's guide to the input for Sierra/SD. Details of input specifications for the different solution types, output options, element types and parameters are included. The appendices contain detailed examples, and instructions for running the software on parallel platforms.
LLMapReduce: Multi-Lingual Map-Reduce for Supercomputing Environments
2015-11-20
1990s. Popularized by Google [36] and Apache Hadoop [37], map-reduce has become a staple technology of the ever-growing big data community...Lexington, MA, U.S.A Abstract— The map-reduce parallel programming model has become extremely popular in the big data community. Many big data ...to big data users running on a supercomputer. LLMapReduce dramatically simplifies map-reduce programming by providing simple parallel programming
Advanced Numerical Techniques of Performance Evaluation. Volume 1
1990-06-01
system scheduling thread. The scheduling thread then runs any other ready thread that can be found. A thread can only sleep or switch out on itself...Polychronopoulos and D.J. Kuck. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Transactions on Computers C...Kuck 1987] C.D. Polychronopoulos and D.J. Kuck. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Trans. on Comp
StrAuto: automation and parallelization of STRUCTURE analysis.
Chhatre, Vikram E; Emerson, Kevin J
2017-03-24
Population structure inference using the software STRUCTURE has become an integral part of population genetic studies covering a broad spectrum of taxa including humans. The ever-expanding size of genetic data sets poses computational challenges for this analysis. Although at least one tool currently implements parallel computing to reduce the computational load of this analysis, it does not fully automate the replicate STRUCTURE analysis runs required for downstream inference of optimal K. There is a pressing need for a tool that can deploy population structure analysis on high performance computing clusters. We present an updated version of the popular Python program StrAuto, to streamline population structure analysis using parallel computing. StrAuto implements a pipeline that combines STRUCTURE analysis with the Evanno ΔK analysis and visualization of results using STRUCTURE HARVESTER. Using benchmarking tests, we demonstrate that StrAuto significantly reduces the computational time needed to perform iterative STRUCTURE analysis by distributing runs over two or more processors. StrAuto is the first tool to integrate STRUCTURE analysis with post-processing using a pipeline approach in addition to implementing parallel computation, a setup ideal for deployment on computing clusters. StrAuto is distributed under the GNU GPL (General Public License) and available to download from http://strauto.popgen.org.
Norman, James J; Desai, Tejal A
2005-01-01
Parallel channels of various dimensions have been shown to cause a monolayer of cells in culture to align in the direction of the channels. For the engineering of complex organ systems to become a reality, similar control over the cellular microenvironment in three dimensions must be achieved. Using microfabrication, a polydimethylsiloxane (PDMS) scaffold (40-µm-wide, 70-µm-deep parallel channels separated by 25-µm-wide walls) was created. A fibroblast-seeded collagen matrix was then molded around this PDMS scaffold. The PDMS scaffold served as an internal skeleton to guide the cells to grow in the prescribed three-dimensional pattern. Organization, aspect ratio, and the z diameter of the cells were analyzed by confocal microscopy. Fibroblasts elongated and organized in the direction of the channels throughout the height of the scaffold. The mean angle of the cells off the long axis of the channels was 4.3 ± 0.7 degrees as opposed to 32.6 ± 2.2 degrees in controls. The morphology of the cells was also affected by the PDMS scaffold. The nuclei were longer (1.25x) and thinner (0.75x) than in control gels; however, no changes in diameter of the cells in the z direction were seen.
Parallel Subspace Subcodes of Reed-Solomon Codes for Magnetic Recording Channels
ERIC Educational Resources Information Center
Wang, Han
2010-01-01
Read channel architectures based on a single low-density parity-check (LDPC) code are being considered for the next generation of hard disk drives. However, LDPC-only solutions suffer from the error floor problem, which may compromise reliability, if not handled properly. Concatenated architectures using an LDPC code plus a Reed-Solomon (RS) code…
Mahmood, Zohaib; McDaniel, Patrick; Guérin, Bastien; Keil, Boris; Vester, Markus; Adalsteinsson, Elfar; Wald, Lawrence L; Daniel, Luca
2016-07-01
In a coupled parallel transmit (pTx) array, the power delivered to a channel is partially distributed to other channels because of coupling. This power is dissipated in circulators, resulting in a significant reduction in power efficiency. In this study, a technique for designing robust decoupling matrices interfaced between the RF amplifiers and the coils is proposed. The decoupling matrices ensure that most forward power is delivered to the load without loss of encoding capabilities of the pTx array. The decoupling condition requires that the impedance matrix seen by the power amplifiers is a diagonal matrix whose entries match the characteristic impedance of the power amplifiers. In this work, the impedance matrix of the coupled coils is diagonalized by a successive multiplication by its eigenvectors. A general design procedure and software are developed to automatically generate the hardware that implements diagonalization using passive components. The general design method is demonstrated by decoupling two example parallel transmit arrays. Our decoupling matrices achieve better than -20 dB decoupling in both cases. A robust framework for designing decoupling matrices for pTx arrays is presented and validated. The proposed decoupling strategy theoretically scales to any arbitrary number of channels. Magn Reson Med 76:329-339, 2016. © 2015 Wiley Periodicals, Inc.
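The diagonalization step described in this abstract can be illustrated on the smallest possible case. The sketch below is an assumption-laden toy, not the authors' design procedure: it takes a hypothetical two-coil array with equal self-impedance and a symmetric mutual impedance, whose eigenvectors are known in closed form, and shows that multiplying by them removes the off-diagonal coupling terms. Matching the resulting diagonal entries to the amplifier's characteristic impedance is not covered, and the impedance values are made up.

```python
import math


def matmul(a, b):
    """Multiply two small matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*m)]


def diagonalize_symmetric_pair(z_self, z_mutual):
    """Return U^T Z U for Z = [[z_self, z_mutual], [z_mutual, z_self]].

    The columns of U are the eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2),
    so the result is diag(z_self + z_mutual, z_self - z_mutual).
    """
    s = 1.0 / math.sqrt(2.0)
    u = [[s, s], [s, -s]]
    z = [[z_self, z_mutual], [z_mutual, z_self]]
    return matmul(transpose(u), matmul(z, u))


if __name__ == "__main__":
    # Hypothetical impedances in ohms, chosen only for illustration.
    for row in diagonalize_symmetric_pair(50 + 10j, 5 + 20j):
        print(row)
```

For larger arrays the eigenvectors would have to be computed numerically, and the abstract's passive-component realization of the transformation is a separate hardware design problem.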
A Concept for Run-Time Support of the Chapel Language
NASA Technical Reports Server (NTRS)
James, Mark
2006-01-01
A document presents a concept for run-time implementation of other concepts embodied in the Chapel programming language. (Now undergoing development, Chapel is intended to become a standard language for parallel computing that would surpass older such languages in both computational performance and the efficiency with which pre-existing code can be reused and new code written.) The aforementioned other concepts are those of distributions, domains, allocations, and access, as defined in a separate document called "A Semantic Framework for Domains and Distributions in Chapel" and linked to a language specification defined in another separate document called "Chapel Specification 0.3." The concept presented in the instant report is recognition that a data domain that was invented for Chapel offers a novel approach to distributing and processing data in a massively parallel environment. The concept is offered as a starting point for development of working descriptions of functions and data structures that would be necessary to implement interfaces to a compiler for transforming the aforementioned other concepts from their representations in Chapel source code to their run-time implementations.
Sun, Xiaole; Djordjevic, Ivan B; Neifeld, Mark A
2016-11-28
We investigate a multiple spatial modes based quantum key distribution (QKD) scheme that employs multiple independent parallel beams through a marine free-space optical channel over open ocean. This approach provides the potential to increase secret key rate (SKR) linearly with the number of channels. To improve the SKR performance, we describe a back-propagation mode (BPM) method to mitigate the atmospheric turbulence effects. Our simulation results indicate that the secret key rate can be improved significantly by employing the proposed BPM-based multi-channel QKD scheme.
User's and test case manual for FEMATS
NASA Technical Reports Server (NTRS)
Chatterjee, Arindam; Volakis, John; Nurnberger, Mike; Natzke, John
1995-01-01
The FEMATS program incorporates first-order edge-based finite elements and vector absorbing boundary conditions into the scattered field formulation for computation of the scattering from three-dimensional geometries. The code has been validated extensively for a large class of geometries containing inhomogeneities and satisfying transition conditions. For geometries that are too large for the workstation environment, the FEMATS code has been optimized to run on various supercomputers. Currently, FEMATS has been configured to run on the HP 9000 workstation, vectorized for the Cray Y-MP, and parallelized to run on the Kendall Square Research (KSR) architecture and the Intel Paragon.
Noise effect on fidelity of two-qubit teleportation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hu Xueyuan; Gu Ying; Gong Qihuang
2010-05-15
We investigate the effect of noise on a class of four-qubit entangled channels for two-qubit teleportation from Alice to Bob. These entangled channels include both parallel Bell pairs and inseparable channels with genuine multipartite entanglement. For the situation where only Bob's share of the entangled channel is subject to decoherence, we show by deriving a general expression for the teleported state that teleportation using noisy inseparable channels is equivalent to teleportation using noisy Bell pairs. When Alice's qubits are also subject to noise, we find that the inseparable channels never give a higher teleportation fidelity than Bell pairs, even in the presence of collective noise. Our results can shed some light on practical two-qubit teleportation.
Parallel array of independent thermostats for column separations
Foret, Frantisek; Karger, Barry L.
2005-08-16
A thermostat array including an array of two or more capillary columns (10) or two or more channels in a microfabricated device is disclosed. A heat conductive material (12) surrounds each individual column or channel in the array, each individual column or channel being thermally insulated from every other individual column or channel. One or more independently controlled heating or cooling elements (14) are positioned adjacent to individual columns or channels within the heat conductive material, each heating or cooling element being connected to a source of heating or cooling, and one or more independently controlled temperature sensing elements (16) are positioned adjacent to the individual columns or channels within the heat conductive material. Each temperature sensing element is connected to a temperature controller.
Spiking neural P systems with multiple channels.
Peng, Hong; Yang, Jinyu; Wang, Jun; Wang, Tao; Sun, Zhang; Song, Xiaoxiao; Luo, Xiaohui; Huang, Xiangnian
2017-11-01
Spiking neural P systems (SNP systems, in short) are a class of distributed parallel computing systems inspired by the neurophysiological behavior of biological spiking neurons. In this paper, we investigate a new variant of SNP systems in which each neuron has one or more synaptic channels, called spiking neural P systems with multiple channels (SNP-MC systems, in short). Spiking rules with channel labels are introduced to handle the firing mechanism of neurons, where the channel labels indicate the synaptic channels used to transmit the generated spikes. The computational power of SNP-MC systems is investigated. Specifically, we prove that SNP-MC systems are Turing universal as both number generating and number accepting devices. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Stuart, J. A.
2011-12-01
This paper explores the challenges in implementing a message passing interface usable on systems with data-parallel processors, and more specifically GPUs. As a case study, we design and implement the "DCGN" API on NVIDIA GPUs that is similar to MPI and allows full access to the underlying architecture. We introduce the notion of data-parallel thread-groups as a way to map resources to MPI ranks. We use a method that also allows the data-parallel processors to run autonomously from user-written CPU code. In order to facilitate communication, we use a sleep-based polling system to store and retrieve messages. Unlike previous systems, our method provides both performance and flexibility. By running a test suite of applications with different communication requirements, we find that a tolerable amount of overhead is incurred, somewhere between one and five percent depending on the application, and indicate the locations where this overhead accumulates. We conclude that with innovations in chipsets and drivers, this overhead will be mitigated and provide similar performance to typical CPU-based MPI implementations while providing fully-dynamic communication.
NASA Astrophysics Data System (ADS)
Wu, Kaihua; Shao, Zhencheng; Chen, Nian; Wang, Wenjie
2018-01-01
The degree of wear of the wheel set tread is one of the main factors that influence the safety and stability of a running train. The geometrical parameters mainly include flange thickness and flange height. Line-structured laser light was projected on the wheel tread surface, so that the geometrical parameters can be deduced from the profile image. An online image acquisition system was designed based on asynchronous reset of the CCD and a CUDA parallel processing unit. Image acquisition was carried out in hardware interrupt mode. A high-efficiency parallel segmentation algorithm based on CUDA was proposed. The algorithm first divides the image into smaller squares, and then extracts the squares of the target by fusing the k-means and STING clustering image segmentation algorithms. Segmentation time is less than 0.97 ms. A considerable acceleration ratio compared with the CPU serial calculation was obtained, which greatly improved the real-time image processing capacity. When the wheel set runs within a limited speed, the system placed along the railway line can measure the geometrical parameters automatically. The maximum measuring speed is 120 km/h.
Sanges, Remo; Cordero, Francesca; Calogero, Raffaele A
2007-12-15
OneChannelGUI is an add-on Bioconductor package providing a new set of functions extending the capability of the affylmGUI package. This library provides a graphical interface (GUI) for Bioconductor libraries to be used for quality control, normalization, filtering, statistical validation and data mining for single channel microarrays. Affymetrix 3' expression (IVT) arrays as well as the new whole transcript expression arrays, i.e. gene/exon 1.0 ST, are currently implemented. oneChannelGUI is available for most platforms on which R runs, i.e. Windows and Unix-like machines. http://www.bioconductor.org/packages/2.0/bioc/html/oneChannelGUI.html
Fuel cell cooler assembly and edge seal means therefor
Breault, Richard D.; Roethlein, Richard J.; Congdon, Joseph V.
1980-01-01
A cooler assembly for a stack of fuel cells comprises a fibrous, porous coolant tube holder sandwiched between and bonded to at least one of a pair of gas impervious graphite plates. The tubes are disposed in channels which pass through the holder. The channels are as deep as the holder thickness, which is substantially the same as the outer diameter of the tubes. Gas seals along the edges of the holder parallel to the direction of the channels are gas impervious graphite strips.
Marshall, J. Jr.
1961-10-24
A reactor is described in which natural-uranium bodies are located in parallel channels which extend through the graphite mass in a regular lattice. The graphite mass has additional channels that are out of the lattice and contain no uranium. These additional channels decrease in number per unit volume of graphite from the center of the reactor to the exterior and have the effect of reducing the density of the graphite more at the center than at the exterior, thereby spreading neutron activity throughout the reactor. (AEC)
Fibre Optic Connections And Method For Using Same
Chan, Benson; Cohen, Mitchell S.; Fortier, Paul F.; Freitag, Ladd W.; Hall, Richard R.; Johnson, Glen W.; Lin, How Tzu; Sherman, John H.
2004-03-30
A package is described that couples a twelve channel wide fiber optic cable to a twelve channel Vertical Cavity Surface Emitting Laser (VCSEL) transmitter and a multiple channel Perpendicularly Aligned Integrated Die (PAID) receiver. The package allows for reduction in the height of the assembly package by vertically orienting certain dies parallel to the fiber optic cable and horizontally orienting certain other dies. The assembly allows the vertically oriented optoelectronic dies to be perpendicularly attached to the horizontally oriented laminate via a flexible circuit.
Partitioning in parallel processing of production systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oflazer, K.
1987-01-01
This thesis presents research on certain issues related to parallel processing of production systems. It first presents a parallel production system interpreter that has been implemented on a four-processor multiprocessor. This parallel interpreter is based on Forgy's OPS5 interpreter and exploits production-level parallelism in production systems. Runs on the multiprocessor system indicate that it is possible to obtain speed-up of around 1.7 in the match computation for certain production systems when productions are split into three sets that are processed in parallel. The next issue addressed is that of partitioning a set of rules to processors in a parallel interpreter with production-level parallelism, and the extent of additional improvement in performance. The partitioning problem is formulated and an algorithm for approximate solutions is presented. The thesis next presents a parallel processing scheme for OPS5 production systems that allows some redundancy in the match computation. This redundancy enables the processing of a production to be divided into units of medium granularity each of which can be processed in parallel. Subsequently, a parallel processor architecture for implementing the parallel processing algorithm is presented.
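The abstract does not say which approximation algorithm the thesis uses for the partitioning problem. As a stand-in, the sketch below uses the common greedy "largest cost first, onto the least-loaded processor" heuristic to show what balancing productions across processors involves. The per-production match costs and the function name `greedy_partition` are hypothetical.

```python
import heapq


def greedy_partition(costs, processors):
    """Assign each item to the currently least-loaded processor.

    costs maps a production name to its (assumed known) match cost.
    Returns the per-processor name lists and the per-processor total loads.
    """
    heap = [(0.0, p) for p in range(processors)]  # (load, processor index)
    heapq.heapify(heap)
    parts = [[] for _ in range(processors)]
    # Place expensive productions first; break ties by name for determinism.
    for name, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, p = heapq.heappop(heap)
        parts[p].append(name)
        heapq.heappush(heap, (load + cost, p))
    loads = [sum(costs[n] for n in part) for part in parts]
    return parts, loads


if __name__ == "__main__":
    # Hypothetical match costs for six productions split into three sets.
    costs = {"p1": 5, "p2": 4, "p3": 3, "p4": 3, "p5": 2, "p6": 1}
    parts, loads = greedy_partition(costs, 3)
    print(parts, loads)
```

A heuristic of this kind only balances static cost estimates; the reported speed-up of around 1.7 on three sets suggests that the actual match cost per cycle is uneven in ways a static split cannot fully capture.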
Parallel transformation of K-SVD solar image denoising algorithm
NASA Astrophysics Data System (ADS)
Liang, Youwen; Tian, Yu; Li, Mei
2017-02-01
Images obtained by observing the sun through a large telescope always suffer from noise because of the low SNR. The K-SVD denoising algorithm can effectively remove Gaussian white noise. Training dictionaries for sparse representations is a time-consuming task, owing to the large size of the data involved and the complexity of the training algorithms. In this paper, the OpenMP parallel programming model is used to transform the serial algorithm into a parallel version, following a data parallelism model. The biggest change is that multiple atoms, rather than a single atom, are updated simultaneously. The denoising effect and acceleration performance are tested after completion of the parallel algorithm. The program achieves a speedup of 13.563 when using 16 cores. This parallel version can fully utilize multi-core CPU hardware resources, greatly reduces running time, and is easy to port to multi-core platforms.
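To put the reported figure in context, the short sketch below computes the parallel efficiency for a speedup of 13.563 on 16 cores and the serial fraction that Amdahl's law would imply. Applying Amdahl's law here is an assumption (fixed problem size, no communication overhead); the abstract itself makes no such claim, and the function names are hypothetical.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


def amdahl_serial_fraction(speedup, cores):
    """Serial fraction f that solves speedup = 1 / (f + (1 - f) / cores)."""
    if cores < 2:
        raise ValueError("need at least two cores to infer a serial fraction")
    return (cores / speedup - 1.0) / (cores - 1.0)


if __name__ == "__main__":
    speedup, cores = 13.563, 16  # figures quoted in the abstract
    print(f"efficiency: {parallel_efficiency(speedup, cores):.3f}")
    print(f"implied serial fraction: {amdahl_serial_fraction(speedup, cores):.4f}")
```

Under these assumptions the efficiency is roughly 0.85 and the implied serial fraction is a little over one percent.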
Parallelization of the FLAPW method
NASA Astrophysics Data System (ADS)
Canning, A.; Mannstadt, W.; Freeman, A. J.
2000-08-01
The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.
Parallel programming with Easy Java Simulations
NASA Astrophysics Data System (ADS)
Esquembre, F.; Christian, W.; Belloni, M.
2018-01-01
Nearly all of today's processors are multicore, and ideally programming and algorithm development utilizing the entire processor should be introduced early in the computational physics curriculum. Parallel programming is often not introduced because it requires a new programming environment and uses constructs that are unfamiliar to many teachers. We describe how we decrease the barrier to parallel programming by using a Java-based programming environment to treat problems in the usual undergraduate curriculum. We use the Easy Java Simulations programming and authoring tool to create the program's graphical user interface together with objects based on those developed by Kaminsky [Building Parallel Programs (Course Technology, Boston, 2010)] to handle common parallel programming tasks. Shared-memory parallel implementations of physics problems, such as time evolution of the Schrödinger equation, are available as source code and as ready-to-run programs from the AAPT-ComPADRE digital library.
Parallel evolution of image processing tools for multispectral imagery
NASA Astrophysics Data System (ADS)
Harvey, Neal R.; Brumby, Steven P.; Perkins, Simon J.; Porter, Reid B.; Theiler, James P.; Young, Aaron C.; Szymanski, John J.; Bloch, Jeffrey J.
2000-11-01
We describe the implementation and performance of a parallel, hybrid evolutionary-algorithm-based system, which optimizes image processing tools for feature-finding tasks in multi-spectral imagery (MSI) data sets. Our system uses an integrated spatio-spectral approach and is capable of combining suitably-registered data from different sensors. We investigate the speed-up obtained by parallelization of the evolutionary process via multiple processors (a workstation cluster) and develop a model for prediction of run-times for different numbers of processors. We demonstrate our system on Landsat Thematic Mapper MSI, covering the recent Cerro Grande fire at Los Alamos, NM, USA.
Parallel Gaussian elimination of a block tridiagonal matrix using multiple microcomputers
NASA Technical Reports Server (NTRS)
Blech, Richard A.
1989-01-01
The solution of a block tridiagonal matrix using parallel processing is demonstrated. The multiprocessor system on which results were obtained and the software environment used to program that system are described. Theoretical partitioning and resource allocation for the Gaussian elimination method used to solve the matrix are discussed. The results obtained from running 1, 2 and 3 processor versions of the block tridiagonal solver are presented. The PASCAL source code for these solvers is given in the appendix, and may be transportable to other shared memory parallel processors provided that the synchronization outlines are reproduced on the target system.
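The report's PASCAL block solver is not reproduced in this abstract. As a simplified serial illustration of the underlying elimination, the sketch below solves a scalar tridiagonal system (the case where every block is 1x1) by Gaussian elimination without pivoting. In the block version, each division would become a small matrix solve. The function name and argument layout are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    All four lists have length n. lower[i] sits left of diag[i] (lower[0] is
    unused) and upper[i] sits right of diag[i] (upper[-1] is unused).
    No pivoting is done, so the matrix should be diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / pivot if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / pivot
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # Small diagonally dominant test system with a known solution.
    print(solve_tridiagonal([0, 1, 1], [2, 2, 2], [1, 1, 0], [4, 8, 8]))
```

The forward sweep is inherently sequential from row to row, which is why a parallel version has to partition the work across blocks or rows rather than simply splitting the loop.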
The Use of Facebook in National Election Campaigns: Politics as Usual?
NASA Astrophysics Data System (ADS)
Andersen, Kim Normann; Medaglia, Rony
The uptake of online media in election campaigning is leading to speculations about the transformation of politics and cyber-democracy. Politicians running for seats in Parliament are increasingly using online media to disseminate information to potential voters and building dynamic, online communities. Drawing on an online survey of the Facebook networks of the two top candidates running for seats in the 2007 Danish Parliament election, this study suggests that the online sphere is primarily populated by users who already know the candidates through the traditional channels of party organizations, and that they do not expect to influence the policy of their candidates. Instead, users view Facebook mainly as an information channel and as a means to gain social prestige.
Fast Acceleration of 2D Wave Propagation Simulations Using Modern Computational Accelerators
Wang, Wei; Xu, Lifan; Cavazos, John; Huang, Howie H.; Kay, Matthew
2014-01-01
Recent developments in modern computational accelerators like Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools like CUDA requires a high level of expertise that is not available to many scientists. This, plus the fact that parallelized code is usually not portable to different architectures, creates major challenges for exploiting the full capabilities of modern computational accelerators. In this work, we sought to overcome these challenges by studying how to achieve both automated parallelization using OpenACC and enhanced portability using OpenCL. We applied our parallelization schemes using GPUs as well as Intel Many Integrated Core (MIC) coprocessor to reduce the run time of wave propagation simulations. We used a well-established 2D cardiac action potential model as a specific case-study. To the best of our knowledge, we are the first to study auto-parallelization of 2D cardiac wave propagation simulations using OpenACC. Our results identify several approaches that provide substantial speedups. The OpenACC-generated GPU code achieved more than speedup above the sequential implementation and required the addition of only a few OpenACC pragmas to the code. An OpenCL implementation provided speedups on GPUs of at least faster than the sequential implementation and faster than a parallelized OpenMP implementation. An implementation of OpenMP on Intel MIC coprocessor provided speedups of with only a few code changes to the sequential implementation. We highlight that OpenACC provides an automatic, efficient, and portable approach to achieve parallelization of 2D cardiac wave simulations on GPUs. 
Our approach of using OpenACC, OpenCL, and OpenMP to parallelize this particular model on modern computational accelerators should be applicable to other computational models of wave propagation in multi-dimensional media. PMID:24497950
Flow visualization in radial flow through stationary and corotating parallel disks
NASA Astrophysics Data System (ADS)
Mochizuki, S.; Tanaka, M.; Yang, Wen-Jei
Paraffin mist is used here as a tracer to observe the patterns in the radial flow through both stationary and corotating parallel disks. The periodic and alternative generation of separation bubbles on both disks and the resulting flow fluctuation and turbulent flow in the radial channel are studied. Stall cells are visualized around the outer rim of the corotating disks.
10-channel fiber array fabrication technique for parallel optical coherence tomography system
NASA Astrophysics Data System (ADS)
Arauz, Lina J.; Luo, Yuan; Castillo, Jose E.; Kostuk, Raymond K.; Barton, Jennifer
2007-02-01
Optical Coherence Tomography (OCT) shows great promise for low intrusive biomedical imaging applications. A parallel OCT system is a novel technique that replaces mechanical transverse scanning with electronic scanning. This will reduce the time required to acquire image data. In this system an array of small diameter fibers is required to obtain an image in the transverse direction. Each fiber in the array is configured in an interferometer and is used to image one pixel in the transverse direction. In this paper we describe a technique to package 15μm diameter fibers on a silicon-silica substrate to be used in a 2mm endoscopic probe tip. Single mode fibers are etched to reduce the cladding diameter from 125μm to 15μm. Etched fibers are placed into a 4mm by 150μm trench in a silicon-silica substrate and secured with UV glue. Active alignment was used to simplify the layout of the fibers and minimize unwanted horizontal displacement of the fibers. A 10-channel fiber array was built, tested and later incorporated into a parallel optical coherence system. This paper describes the packaging, testing, and operation of the array in a parallel OCT system.
High-performance computational fluid dynamics: a custom-code approach
NASA Astrophysics Data System (ADS)
Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.
2016-07-01
We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFD) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFD, while also providing insight for those interested in more general aspects of high-performance computing.
Parallel detection experiment of fluorescence confocal microscopy using DMD.
Wang, Qingqing; Zheng, Jihong; Wang, Kangni; Gui, Kun; Guo, Hanming; Zhuang, Songlin
2016-05-01
A parallel detection fluorescence confocal microscopy (PDFCM) system based on a Digital Micromirror Device (DMD) is reported in this paper in order to realize simultaneous multi-channel imaging and improve detection speed. The DMD is added to the PDFCM system to replace the single traditional pinhole of the confocal system, dividing the laser source into multiple excitation beams. The PDFCM imaging system based on the DMD is set up experimentally. The multi-channel fluorescence image of a potato cell sample is detected by parallel lateral scanning in order to verify the feasibility of introducing the DMD into a fluorescence confocal microscope. In addition, for the purpose of characterizing the microscope, the depth response curve is also acquired. The experimental result shows that, in contrast to conventional microscopy, the DMD-based PDFCM system has higher axial resolution and faster detection speed, which may bring some potential benefits in biological and medical analysis. SCANNING 38:234-239, 2016. © 2015 Wiley Periodicals, Inc.
A 24-ch Phased-Array System for Hyperpolarized Helium Gas Parallel MRI to Evaluate Lung Functions.
Lee, Ray; Johnson, Glyn; Stefanescu, Cornel; Trampel, Robert; McGuinness, Georgeann; Stoeckel, Bernd
2005-01-01
Hyperpolarized 3He gas MRI has considerable potential for assessing pulmonary functions. Because the non-equilibrium state of the gas results in a steady depletion of the signal level over the course of the excitations, the signal-to-noise ratio (SNR) can be independent of the number of data acquisitions under certain circumstances. This provides a unique opportunity for parallel MRI to gain both temporal and spatial resolution without reducing SNR. We have built a 24-channel receive / 2-channel transmit phased array system for 3He parallel imaging. Our in vivo experimental results proved that significant temporal and spatial resolution can be gained at no cost to the SNR. With 3D data acquisition, eightfold (2x4) scan time reduction can be achieved without any aliasing in images. Additionally, a rigid analysis using the low impedance preamplifier for decoupling presented evidence of strong coupling.
Brian hears: online auditory processing using vectorization over channels.
Fontaine, Bertrand; Goodman, Dan F M; Benichoux, Victor; Brette, Romain
2011-01-01
The human cochlea includes about 3000 inner hair cells which filter sounds at frequencies between 20 Hz and 20 kHz. This massively parallel frequency analysis is reflected in models of auditory processing, which are often based on banks of filters. However, existing implementations do not exploit this parallelism. Here we propose algorithms to simulate these models by vectorizing computation over frequency channels, which are implemented in "Brian Hears," a library for the spiking neural network simulator package "Brian." This approach allows us to use high-level programming languages such as Python, because with vectorized operations, the computational cost of interpretation represents a small fraction of the total cost. This makes it possible to define and simulate complex models in a simple way, while all previous implementations were model-specific. In addition, we show that these algorithms can be naturally parallelized using graphics processing units, yielding substantial speed improvements. We demonstrate these algorithms with several state-of-the-art cochlear models, and show that they compare favorably with existing, less flexible, implementations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anovitz, Lawrence; Mamontov, Eugene; Ishai, Paul ben
2013-01-01
The properties of fluids can be significantly altered by the geometry of their confining environments. While there has been significant work on the properties of such confined fluids, the properties of fluids under ultraconfinement, environments where, at least in one plane, the dimensions of the confining environment are similar to that of the confined molecule, have not been investigated. This paper investigates the dynamic properties of water in beryl (Be3Al2Si6O18), the structure of which contains approximately 5-Å-diameter channels parallel to the c axis. Three techniques, inelastic neutron scattering, quasielastic neutron scattering, and dielectric spectroscopy, have been used to quantify these properties over a dynamic range covering approximately 16 orders of magnitude. Because beryl can be obtained in large single crystals we were able to quantify directional variations, perpendicular and parallel to the channel directions, in the dynamics of the confined fluid. These are significantly anisotropic and, somewhat counterintuitively, show that vibrations parallel to the c-axis channels are significantly more hindered than those perpendicular to the channels. The effective potential for vibrations in the c direction is harder than the potential in directions perpendicular to it. There is evidence of single-file diffusion of water molecules along the channels at higher temperatures, but below 150 K this diffusion is strongly suppressed. No such suppression, however, has been observed in the channel-perpendicular direction. Inelastic neutron scattering spectra include an intramolecular stretching O-H peak at 465 meV. As this is nearly coincident with that known for free water molecules and approximately 30 meV higher than that in liquid water or ice, this suggests that there is no hydrogen bonding constraining vibrations between the channel water and the beryl structure. However, dielectric spectroscopic measurements at higher temperatures and lower frequencies yield an activation energy for the dipole reorientation of 16.4 ± 0.14 kJ/mol, close to the energy required to break a hydrogen bond in bulk water. This may suggest the presence of some other form of bonding between the water molecules and the structure, but the resolution of the apparent contradiction between the inelastic neutron and dielectric spectroscopic results remains uncertain.
Micromachined peristaltic pump
NASA Technical Reports Server (NTRS)
Hartley, Frank T. (Inventor)
1998-01-01
A micromachined pump including a channel formed in a semiconductor substrate by conventional processes such as chemical etching. A number of insulating barriers are established in the substrate parallel to one another and transverse to the channel. The barriers separate a series of electrically conductive strips. An overlying flexible conductive membrane is applied over the channel and conductive strips with an insulating layer separating the conductive strips from the conductive membrane. Application of a sequential voltage to the series of strips pulls the membrane into the channel portion of each successive strip to achieve a pumping action. A particularly desirable arrangement employs a micromachined push-pull dual channel cavity employing two substrates with a single membrane sandwiched between them.
Using DoD Maps to Examine the Influence of Large Wood on Channel Morphodynamics
NASA Astrophysics Data System (ADS)
MacKenzie, L. C.; Eaton, B. C.
2012-12-01
Since the advent of logging and slash burning, many streams in British Columbia have experienced changes to the amount of large wood added to or removed from these systems, which has, in turn, influenced the storage and movement of sediment within these channels. This set of flume experiments examines and quantifies the impacts of large wood on the reach-scale morphodynamics. Understanding the relation between the wood load and channel morphodynamics is important when assessing the quality of the aquatic habitat of a stream. The experiments were conducted using a fixed-bank, mobile bed Froude-scaled physical model of Fishtrap Creek, British Columbia, built in a shallow flume that is 1.5 m wide and 11 m long. The stream table was run without wood until it reached equilibrium at which point wood pieces of varying sizes were added to the channel. The bed morphology was surveyed using a laser profiling system at five-hour intervals. The laser profiles were then interpolated to create digital elevation models (DEM) from which DEM of difference (DoD) maps were produced. Analysis of the DoD maps focused on quantifying and locating differences in the distribution of sediment storage, erosion, and deposition between the runs as well as those induced by the addition of large wood into the stream channel. We then assessed the typical influence of individual pieces and of jams on pool frequency, size and distribution along the channels.
Approaches in highly parameterized inversion - GENIE, a general model-independent TCP/IP run manager
Muffels, Christopher T.; Schreuder, Willem A.; Doherty, John E.; Karanovic, Marinko; Tonkin, Matthew J.; Hunt, Randall J.; Welter, David E.
2012-01-01
GENIE is a model-independent suite of programs that can be used to distribute, manage, and execute multiple model runs via the TCP/IP infrastructure. The suite consists of a file distribution interface, a run manager, a run executer, and a routine that can be compiled as part of a program and used to exchange model runs with the run manager. Because communication is via a standard protocol (TCP/IP), any computer connected to the Internet can serve in any of the capacities offered by this suite. Model independence is consistent with the existing template and instruction file protocols of the widely used PEST parameter estimation program. This report describes (1) the problem addressed; (2) the approach used by GENIE to queue, distribute, and retrieve model runs; and (3) user instructions, classes, and functions developed. It also includes (4) an example to illustrate the linking of GENIE with Parallel PEST using the interface routine.
Spatiotemporal Responses of Groundwater Flow and Aquifer-River Exchanges to Flood Events
NASA Astrophysics Data System (ADS)
Liang, Xiuyu; Zhan, Hongbin; Schilling, Keith
2018-03-01
Rapidly rising river stages induced by flood events lead to considerable river water infiltration into aquifers and carry surface-borne solutes into hyporheic zones, which are widely recognized as important sites of biogeochemical activity. Existing studies of surface-groundwater exchanges induced by flood events are usually limited to a river-aquifer cross section that is perpendicular to river channels, and neglect groundwater flow parallel to river channels. In this study, surface-groundwater exchanges in response to a flood event are investigated with specific consideration of unconfined flow in the direction parallel to river channels. The groundwater flow is described by a two-dimensional Boussinesq equation and the flood event is described by a diffusive-type flood wave. Analytical solutions are derived and tested against the numerical solution. The results indicate that river water infiltrates into aquifers quickly during flood events, and mostly returns to the river within a short period of time after the flood event. However, the remaining river water will stay in aquifers for a long period of time. The residual river water not only flows back to rivers but also flows to downstream aquifers. A one-dimensional model that neglects flow in the direction parallel to river channels will overestimate heads and discharge in upstream aquifers. The return flow induced by the flood event has a power law form with time and has a significant impact on the base flow recession at early times. The solution can match the observed hydraulic heads in riparian zone wells of Iowa during flood events.
NASA Astrophysics Data System (ADS)
Choi, Charles J.; Chan, Leo L.; Pineda, Maria F.; Cunningham, Brian T.
2007-09-01
Assays used in pharmaceutical research require a system that can not only detect biochemical interactions with high sensitivity, but that can also perform many measurements in parallel while consuming low volumes of reagents. While nearly all label-free biosensor transducers to date have been interfaced with a flow channel, the liquid handling system is typically aligned and bonded to the transducer for supplying analytes to only a few sensors in parallel. In this presentation, we describe a fabrication approach for photonic crystal biosensors that utilizes nanoreplica molding to produce a network of sensors that are automatically self-aligned with a microfluidic network in a single process step. The sensor/fluid network is inexpensively produced on large surface areas upon flexible plastic substrates, allowing the device to be incorporated into standard format 96-well microplates. A simple flow scheme using hydrostatic pressure applied through a single control point enables immobilization of capture ligands upon a large number of sensors with 220 nL of reagent, and subsequent exposure of the sensors to test samples. A high resolution imaging detection instrument is capable of monitoring the binding within parallel channels at rates compatible with determining kinetic binding constants between the immobilized ligands and the analytes. The first implementation of this system is capable of monitoring the kinetic interactions of 11 flow channels at once, and a total of 88 channels within an integrated biosensor microplate in rapid succession. The system was initially tested to characterize the interaction between sets of proteins with known binding behavior.
Local search to improve coordinate-based task mapping
Balzuweit, Evan; Bunde, David P.; Leung, Vitus J.; ...
2015-10-31
We present a local search strategy to improve the coordinate-based mapping of a parallel job’s tasks to the MPI ranks of its parallel allocation in order to reduce network congestion and the job’s communication time. The goal is to reduce the number of network hops between communicating pairs of ranks. Our target is applications with a nearest-neighbor stencil communication pattern running on mesh systems with non-contiguous processor allocation, such as Cray XE and XK Systems. Utilizing the miniGhost mini-app, which models the shock physics application CTH, we demonstrate that our strategy reduces application running time while also reducing the runtime variability. Furthermore, we show that mapping quality can vary based on the selected allocation algorithm, even between allocation algorithms of similar apparent quality.
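The idea of reducing hops by local search can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the function names (`hops`, `total_hops`, `local_search`), the Manhattan-distance hop count on a mesh, the pairwise-swap neighborhood, and the toy four-rank allocation are assumptions, not the authors' algorithm or the miniGhost configuration.

```python
import itertools

def hops(a, b):
    # Assumed cost model: Manhattan distance between two mesh coordinates.
    return sum(abs(x - y) for x, y in zip(a, b))

def total_hops(mapping, pairs):
    # mapping[rank] is the mesh coordinate assigned to that rank;
    # pairs lists the (rank, rank) pairs that communicate.
    return sum(hops(mapping[i], mapping[j]) for i, j in pairs)

def local_search(mapping, pairs):
    # Hill climbing: swap the nodes of two ranks and keep the swap
    # only if it strictly lowers the total hop count.
    mapping = list(mapping)
    best = total_hops(mapping, pairs)
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(len(mapping)), 2):
            mapping[i], mapping[j] = mapping[j], mapping[i]
            cost = total_hops(mapping, pairs)
            if cost < best:
                best = cost
                improved = True
            else:
                # Undo a swap that did not help.
                mapping[i], mapping[j] = mapping[j], mapping[i]
    return mapping, best

# Toy non-contiguous allocation: four ranks in a 1D nearest-neighbor chain.
nodes = [(0, 0), (3, 0), (1, 0), (5, 0)]
pairs = [(0, 1), (1, 2), (2, 3)]
before = total_hops(nodes, pairs)
mapping, after = local_search(nodes, pairs)
print(before, after)
```

A swap neighborhood keeps each step cheap to evaluate; like any hill climber it can stop at a local optimum, so the result depends on the starting mapping.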
NASA Technical Reports Server (NTRS)
Horvath, Joan C.; Alkalaj, Leon J.; Schneider, Karl M.; Amador, Arthur V.; Spitale, Joseph N.
1993-01-01
Robotic spacecraft are controlled by sets of commands called 'sequences.' These sequences must be checked against mission constraints. Making our existing constraint checking program faster would enable new capabilities in our uplink process. Therefore, we are rewriting this program to run on a parallel computer. To do so, we had to determine how to run constraint-checking algorithms in parallel and create a new method of specifying spacecraft models and constraints. This new specification gives us a means of representing flight systems and their predicted response to commands which could be used in a variety of applications throughout the command process, particularly during anomaly or high-activity operations. This commonality could reduce operations cost and risk for future complex missions. Lessons learned in applying some parts of this system to the TOPEX/Poseidon mission will be described.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cleveland, Mathew A., E-mail: cleveland7@llnl.gov; Brunner, Thomas A.; Gentile, Nicholas A.
2013-10-15
We describe and compare different approaches for achieving numerical reproducibility in photon Monte Carlo simulations. Reproducibility is desirable for code verification, testing, and debugging. Parallelism creates a unique problem for achieving reproducibility in Monte Carlo simulations because it changes the order in which values are summed. This is a numerical problem because double precision arithmetic is not associative. Parallel Monte Carlo, both domain replicated and decomposed simulations, will run their particles in a different order during different runs of the same simulation because of the non-reproducibility of communication between processors. In addition, runs of the same simulation using different domain decompositions will also result in particles being simulated in a different order. In [1], a way of eliminating non-associative accumulations using integer tallies was described. This approach successfully achieves reproducibility at the cost of lost accuracy by rounding double precision numbers to fewer significant digits. This integer approach, and other extended and reduced precision reproducibility techniques, are described and compared in this work. Increased precision alone is not enough to ensure reproducibility of photon Monte Carlo simulations. Non-arbitrary precision approaches require a varying degree of rounding to achieve reproducibility. For the problems investigated in this work double precision global accuracy was achievable by using 100 bits of precision or greater on all unordered sums which were subsequently rounded to double precision at the end of every time-step.
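The non-associativity problem and the integer-tally remedy can be shown in a few lines of Python. This is a sketch under stated assumptions, not the scheme of [1]: the fixed-point scale `SCALE`, the helper names, and the synthetic contributions are all hypothetical.

```python
import random

SCALE = 2 ** 40  # hypothetical fixed-point resolution (assumption)

def float_sum(values):
    # Plain double precision accumulation; the result depends on the order.
    total = 0.0
    for v in values:
        total += v
    return total

def integer_tally(values, scale=SCALE):
    # Round each contribution to a multiple of 1/scale, then add integers.
    # Integer addition is associative, so any summation order agrees,
    # at the cost of discarding information below 1/scale.
    return sum(round(v * scale) for v in values)

# Double precision addition is not associative.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)

# Synthetic tally contributions spanning many orders of magnitude.
rng = random.Random(12345)
contributions = [rng.uniform(0.0, 1.0) * 10.0 ** rng.randint(-8, 8)
                 for _ in range(10_000)]
shuffled = contributions[:]
rng.shuffle(shuffled)  # stands in for a different particle order

print(float_sum(contributions) == float_sum(shuffled))  # may differ
print(integer_tally(contributions) == integer_tally(shuffled))
```

The trade-off matches the abstract: the integer tally is reproducible for any ordering, but each contribution is rounded before it is accumulated.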
Capacity, cutoff rate, and coding for a direct-detection optical channel
NASA Technical Reports Server (NTRS)
Massey, J. L.
1980-01-01
It is shown that Pierce's pulse position modulation scheme with 2 to the L pulse positions used on a self-noise-limited direct detection optical communication channel results in a 2 to the L-ary erasure channel that is equivalent to the parallel combination of L completely correlated binary erasure channels. The capacity of the full channel is the sum of the capacities of the component channels, but the cutoff rate of the full channel is shown to be much smaller than the sum of the cutoff rates. An interpretation of the cutoff rate is given that suggests a complexity advantage in coding separately on the component channels. It is shown that if short-constraint-length convolutional codes with Viterbi decoders are used on the component channels, then the performance and complexity compare favorably with the Reed-Solomon coding system proposed by McEliece for the full channel. The reasons for this unexpectedly fine performance by the convolutional code system are explored in detail, as are various facets of the channel structure.
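The capacity and cutoff-rate comparison can be checked numerically with a short Python sketch. The abstract does not give formulas, so the following are assumptions: an erasure probability `eps`, the standard capacity (1 - eps) log2(q) of a q-ary erasure channel, and the standard uniform-input cutoff rate R0 = -log2(eps + (1 - eps)/q). The values L = 10 and eps = 0.5 are arbitrary illustrative choices.

```python
import math

def erasure_capacity(q, eps):
    # Capacity in bits per use of a q-ary erasure channel (assumed formula).
    return (1.0 - eps) * math.log2(q)

def erasure_cutoff_rate(q, eps):
    # Cutoff rate in bits per use for uniform inputs (assumed formula).
    return -math.log2(eps + (1.0 - eps) / q)

L = 10      # hypothetical: 2**L pulse positions
eps = 0.5   # hypothetical erasure probability
q = 2 ** L

full_capacity = erasure_capacity(q, eps)
sum_capacity = L * erasure_capacity(2, eps)   # L binary erasure channels

full_cutoff = erasure_cutoff_rate(q, eps)
sum_cutoff = L * erasure_cutoff_rate(2, eps)

print(full_capacity, sum_capacity)  # equal
print(full_cutoff, sum_cutoff)      # full channel is much smaller
```

With these assumed values the capacities coincide at 5 bits, while the cutoff rate of the full channel is roughly 1 bit against roughly 4.15 bits for the sum over the component channels, which is the gap the abstract describes.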
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gong, S.; Labanca, I.; Rech, I.
2014-10-15
Fluorescence correlation spectroscopy (FCS) is a well-established technique to study binding interactions or the diffusion of fluorescently labeled biomolecules in vitro and in vivo. Fast FCS experiments require parallel data acquisition and analysis which can be achieved by exploiting a multi-channel Single Photon Avalanche Diode (SPAD) array and a corresponding multi-input correlator. This paper reports a 32-channel FPGA based correlator able to perform 32 auto/cross-correlations simultaneously over a lag-time ranging from 10 ns up to 150 ms. The correlator is included in a 32 × 1 SPAD array module, providing a compact and flexible instrument for high throughput FCS experiments. However, some inherent features of SPAD arrays, namely afterpulsing and optical crosstalk effects, may introduce distortions in the measurement of auto- and cross-correlation functions. We investigated these limitations to assess their impact on the module and evaluate possible workarounds.
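As a reminder of what an auto/cross-correlator computes, here is a minimal Python sketch of a direct estimator of the normalized correlation G(lag) = <a(t) b(t+lag)> / (<a><b>) on binned photon counts. It is an illustrative assumption, not the FPGA architecture reported in the paper; the function name `correlation` and the synthetic count traces are hypothetical.

```python
def correlation(a, b, lag):
    # Direct estimator of the normalized correlation between two
    # count traces; passing the same trace twice gives the autocorrelation.
    n = len(a) - lag
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    mean_product = sum(a[t] * b[t + lag] for t in range(n)) / n
    return mean_product / (mean_a * mean_b)

# Synthetic traces (photon counts per time bin).
steady = [2] * 100        # constant intensity: no fluctuations
blinking = [0, 4] * 50    # intensity alternating every bin

g_steady = correlation(steady, steady, 5)
g_blink_1 = correlation(blinking, blinking, 1)
g_blink_2 = correlation(blinking, blinking, 2)
print(g_steady, g_blink_1, g_blink_2)
```

A constant trace gives G = 1 at every lag, while the alternating trace is anti-correlated at odd lags and correlated at even lags. Effects such as afterpulsing or crosstalk would appear as extra correlation that does not come from the sample.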
Tsai, Shang-Yueh; Otazo, Ricardo; Posse, Stefan; Lin, Yi-Ru; Chung, Hsiao-Wen; Wald, Lawrence L; Wiggins, Graham C; Lin, Fa-Hsuan
2008-05-01
Parallel imaging has been demonstrated to reduce the encoding time of MR spectroscopic imaging (MRSI). Here we investigate up to 5-fold acceleration of 2D proton echo planar spectroscopic imaging (PEPSI) at 3T using generalized autocalibrating partial parallel acquisition (GRAPPA) with a 32-channel coil array, 1.5 cm³ voxel size, TR/TE of 15/2000 ms, and 2.1 Hz spectral resolution. Compared to an 8-channel array, the smaller RF coil elements in this 32-channel array provided a 3.1-fold and 2.8-fold increase in signal-to-noise ratio (SNR) in the peripheral region and the central region, respectively, and more spatial modulated information. Comparison of sensitivity-encoding (SENSE) and GRAPPA reconstruction using an 8-channel array showed that both methods yielded similar quantitative metabolite measures (P > 0.1). Concentration values of N-acetyl-aspartate (NAA), total creatine (tCr), choline (Cho), myo-inositol (mI), and the sum of glutamate and glutamine (Glx) for both methods were consistent with previous studies. Using the 32-channel array coil the mean Cramer-Rao lower bounds (CRLB) were less than 8% for NAA, tCr, and Cho and less than 15% for mI and Glx at 2-fold acceleration. At 4-fold acceleration the mean CRLB for NAA, tCr, and Cho was less than 11%. In conclusion, the use of a 32-channel coil array and GRAPPA reconstruction can significantly reduce the measurement time for mapping brain metabolites. (c) 2008 Wiley-Liss, Inc.
Linux Kernel Co-Scheduling and Bulk Synchronous Parallelism
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jones, Terry R
2012-01-01
This paper describes a kernel scheduling algorithm that is based on coscheduling principles and that is intended for parallel applications running on 1000 cores or more. Experimental results for a Linux implementation on a Cray XT5 machine are presented. The results indicate that Linux is a suitable operating system for this new scheduling scheme, and that this design provides a dramatic improvement in scaling performance for synchronizing collective operations at scale.
Parallel deterministic neutronics with AMR in 3D
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clouse, C.; Ferguson, J.; Hendrickson, C.
1997-12-31
AMTRAN, a three-dimensional Sn neutronics code with adaptive mesh refinement (AMR), has been parallelized over spatial domains and energy groups and runs on the Meiko CS-2 with MPI message passing. Block refined AMR is used with linear finite element representations for the fluxes, which allows for a straightforward interpretation of fluxes at block interfaces with zoning differences. The load balancing algorithm assumes 8 spatial domains, which minimizes idle time among processors.
2012-02-17
to be solved. ...data processing rather than data caching and control flow. To make use of this computational power, NVIDIA introduced a general purpose parallel...GPU implementations were run on an Intel Nehalem Xeon E5520 2.26GHz processor with an NVIDIA Tesla C2070 graphics card for varying numbers of
Unsteady MHD blood flow through porous medium in a parallel plate channel
NASA Astrophysics Data System (ADS)
Latha, R.; Rushi Kumar, B.
2017-11-01
In this study, we have analyzed heat and mass transfer effects on unsteady blood flow through a parallel plate channel in a saturated porous medium in the presence of a transverse magnetic field with thermal radiation. The governing higher-order nonlinear PDEs are converted to dimensionless equations using dimensionless variables. The dimensionless equations are then solved analytically using boundary conditions by choosing the axial flow transport and the fields of concentration and temperature apart from the normal velocity as a function of y and t. The effects of the pertinent parameters appearing in this model, viz. thermal radiation, Prandtl number, heat source parameter, Hartmann number, permeability parameter, and decay parameter, on axial flow transport and the normal velocity are analyzed in detail.
Computer-Aided Parallelizer and Optimizer
NASA Technical Reports Server (NTRS)
Jin, Haoqiang
2011-01-01
The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.
Crosetto, D.B.
1996-12-31
The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor to a plurality of slave processors to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer, a digital signal processor, a parallel transfer controller, and two three-port memory devices. A communication switch within each node connects it to a fast parallel hardware channel through which all high density data arrives or leaves the node. 6 figs.
Crosetto, Dario B.
1996-01-01
The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor (100) to a plurality of slave processors (200) to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer (104), a digital signal processor (114), a parallel transfer controller (106), and two three-port memory devices. A communication switch (108) within each node (100) connects it to a fast parallel hardware channel (70) through which all high density data arrives or leaves the node.
Fuel cell system configurations
Kothmann, Richard E.; Cyphers, Joseph A.
1981-01-01
Fuel cell stack configurations having elongated polygonal cross-sectional shapes and gaskets at the peripheral faces to which flow manifolds are sealingly affixed. Process channels convey a fuel and an oxidant through longer channels, and a cooling fluid is conveyed through relatively shorter cooling passages. The polygonal structure preferably includes at least two right angles, and the faces of the stack are arranged in opposite parallel pairs.
Gong, Chunye; Bao, Weimin; Tang, Guojian; Jiang, Yuewen; Liu, Jie
2014-01-01
Solving fractional differential equations is very time consuming. The computational complexity of the two-dimensional fractional differential equation (2D-TFDE) with an iterative implicit finite difference method is O(M_x M_y N^2). In this paper, we present a parallel algorithm for 2D-TFDE and give an in-depth discussion of this algorithm. A task distribution model and data layout with virtual boundary are designed for this parallel algorithm. The experimental results show that the parallel algorithm compares well with the exact solution. The parallel algorithm on a single Intel Xeon X5540 CPU runs 3.16-4.17 times faster than the serial algorithm on a single CPU core. The parallel efficiency of 81 processes is up to 88.24% compared with 9 processes on a distributed memory cluster system. We think that parallel computing technology will become a very basic method for computationally intensive fractional applications in the near future.
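The efficiency figure above is stated relative to a 9-process baseline rather than a serial run. A small Python sketch makes the arithmetic explicit, assuming the usual definition of relative efficiency, (T_ref * p_ref) / (T_p * p); the abstract does not state the definition, and the function names are hypothetical.

```python
def relative_efficiency(t_ref, p_ref, t_p, p):
    # Efficiency of a run on p processes measured against a baseline
    # run on p_ref processes (assumed definition).
    return (t_ref * p_ref) / (t_p * p)

def implied_speedup(efficiency, p_ref, p):
    # Speedup over the baseline run that a given relative efficiency implies.
    return efficiency * p / p_ref

# Figures from the abstract: 88.24% efficiency for 81 versus 9 processes.
speedup_81_vs_9 = implied_speedup(0.8824, 9, 81)
print(round(speedup_81_vs_9, 2))
```

Under this definition, 88.24% efficiency with nine times as many processes corresponds to a run about 7.94 times faster than the 9-process baseline, against an ideal factor of 9.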
Polarization of low-frequency electromagnetic radiation in the lobes of Jupiter's magnetotail
NASA Technical Reports Server (NTRS)
Moses, S. L.; Kennel, C. F.; Coroniti, F. V.; Scarf, F. L.; Kurth, W. S.
1987-01-01
The plasma wave instruments on the Voyager spacecraft have detected intense electromagnetic radiation within the lobes of Jupiter's magnetic tail down to the lowest frequency of the detector (10 Hz). During a yaw maneuver performed by Voyager 1 in the lobe of the Jovian magnetotail, a modulation appeared in the amplitudes of waves detected in the 10-, 17.8- and 31.1-Hz channels of the plasma wave analyzer, well below the local electron cyclotron frequency of 260 Hz. The lowest amplitudes occurred when the antenna axis was most nearly parallel to the magnetic field. Wave amplitudes in the 56.2-Hz and higher frequency channels remained nearly constant during the maneuver. From the cold-plasma theory of electromagnetic waves, it is concluded that the plasma frequency was between the 56.2- and 31.1-Hz channels where the parallel-polarized component of the spectrum cuts off. This implies a tail-lobe density between 0.000032 and 0.000015/cu cm. The left-hand cutoff frequency would then be below 10 Hz, consistent with either the Z-mode (L, X) or whistlers (R-mode) in the modulated channels.
Skirted projectiles for railguns
Hawke, Ronald S.; Susoeff, Allan R.
1994-01-01
A single skirt projectile (20) having an insulating skirt (22) at its rear, or a dual trailing skirt projectile (30, 40, 50, 60) having an insulating skirt (32, 42, 52, 62) succeeded by an arc extinguishing skirt (34, 44, 54, 64), is accelerated by a railgun accelerator (10) having a pair of parallel conducting rails (1a, 1b) which are separated by insulating wall spacers (11). The insulating skirt (22, 32, 42, 52, 62) includes a plasma channel (38). The arc extinguishing skirt (34, 44, 54, 64) interrupts the conduction that occurs in the insulating skirt channel (38) by blocking the plasma arc (3) from conducting current from rail to rail (1a, 1b) at the rear of the projectile (30, 40, 50, 60). The arc extinguishing skirt may comprise two plates (36a, 36b) which form a horseshoe wherein the plates are parallel to the rails (1a, 1b); it may have a chisel-shaped or cross-shaped design; or it may be cylindrical (64). The length of the insulating skirt channel is selected such that there is sufficient plasma in the channel to enable adequate current conduction between the rails (1a, 1b).
Skirted projectiles for railguns
Hawke, R.S.; Susoeff, A.R.
1994-01-04
A single skirt projectile (20) having an insulating skirt (22) at its rear, or a dual trailing skirt projectile (30, 40, 50, 60) having an insulating skirt (32, 42, 52, 62) succeeded by an arc extinguishing skirt (34, 44, 54, 64), is accelerated by a railgun accelerator (10) having a pair of parallel conducting rails (1a, 1b) which are separated by insulating wall spacers (11). The insulating skirt (22, 32, 42, 52, 62) includes a plasma channel (38). The arc extinguishing skirt (34, 44, 54, 64) interrupts the conduction that occurs in the insulating skirt channel (38) by blocking the plasma arc (3) from conducting current from rail to rail (1a, 1b) at the rear of the projectile (30, 40, 50, 60). The arc extinguishing skirt may comprise two plates (36a, 36b) which form a horseshoe wherein the plates are parallel to the rails (1a, 1b); it may have a chisel-shaped or cross-shaped design; or it may be cylindrical (64). The length of the insulating skirt channel is selected such that there is sufficient plasma in the channel to enable adequate current conduction between the rails (1a, 1b).
NASA Astrophysics Data System (ADS)
Meng, Luming; Sheong, Fu Kit; Zeng, Xiangze; Zhu, Lizhe; Huang, Xuhui
2017-07-01
Constructing Markov state models from large-scale molecular dynamics simulation trajectories is a promising approach to dissect the kinetic mechanisms of complex chemical and biological processes. Combined with transition path theory, Markov state models can be applied to identify all pathways connecting any conformational states of interest. However, the identified pathways can be too complex to comprehend, especially for multi-body processes where numerous parallel pathways with comparable flux probability often coexist. Here, we have developed a path lumping method to group these parallel pathways into metastable path channels for analysis. We define the similarity between two pathways as the intercrossing flux between them and then apply the spectral clustering algorithm to lump these pathways into groups. We demonstrate the power of our method by applying it to two systems: a 2D-potential consisting of four metastable energy channels and the hydrophobic collapse process of two hydrophobic molecules. In both cases, our algorithm successfully reveals the metastable path channels. We expect this path lumping algorithm to be a promising tool for revealing unprecedented insights into the kinetic mechanisms of complex multi-body processes.
Calibrationless parallel magnetic resonance imaging: a joint sparsity model.
Majumdar, Angshul; Chaudhury, Kunal Narayan; Ward, Rabab
2013-12-05
State-of-the-art parallel MRI techniques either explicitly or implicitly require certain parameters to be estimated, e.g., the sensitivity map for SENSE and SMASH, and interpolation weights for GRAPPA and SPIRiT. Thus all these techniques are sensitive to the calibration (parameter estimation) stage. In this work, we propose a parallel MRI technique that does not require any calibration but yields reconstruction results that are on par with (or even better than) state-of-the-art methods in parallel MRI. Our proposed method requires solving non-convex analysis and synthesis prior joint-sparsity problems. This work also derives the algorithms for solving them. Experimental validation was carried out on two datasets: an eight-channel brain dataset and an eight-channel Shepp-Logan phantom. Two sampling methods were used: variable density random sampling and non-Cartesian radial sampling. For the brain data, an acceleration factor of 4 was used, and for the phantom an acceleration factor of 6 was used. The reconstruction results were quantitatively evaluated based on the Normalised Mean Squared Error between the reconstructed image and the originals. The qualitative evaluation was based on the actual reconstructed images. We compared our work with four state-of-the-art parallel imaging techniques: two calibrated methods (CS SENSE and l1SPIRiT) and two calibration-free techniques (Distributed CS and SAKE). Our method yields better reconstruction results than all of them.
Parallel programming of industrial applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heroux, M; Koniges, A; Simon, H
1998-07-21
In the introductory material, we overview the typical MPP environment for real application computing and the special tools available such as parallel debuggers and performance analyzers. Next, we draw from a series of real application codes and discuss the specific challenges and problems that are encountered in parallelizing these individual applications. The application areas drawn from include biomedical sciences, materials processing and design, plasma and fluid dynamics, and others. We show how it was possible to get a particular application to run efficiently and what steps were necessary. Finally we end with a summary of the lessons learned from these applications and predictions for the future of industrial parallel computing. This tutorial is based on material from a forthcoming book entitled: "Industrial Strength Parallel Computing" to be published by Morgan Kaufmann Publishers (ISBN l-55860-54).
Simplified Parallel Domain Traversal
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erickson III, David J
2011-01-01
Many data-intensive scientific analysis techniques require global domain traversal, which over the years has been a bottleneck for efficient parallelization across distributed-memory architectures. Inspired by MapReduce and other simplified parallel programming approaches, we have designed DStep, a flexible system that greatly simplifies efficient parallelization of domain traversal techniques at scale. In order to deliver both simplicity to users and scalability on HPC platforms, we introduce a novel two-tiered communication architecture for managing and exploiting asynchronous communication loads. We also integrate our design with advanced parallel I/O techniques that operate directly on native simulation output. We demonstrate DStep by performing teleconnection analysis across ensemble runs of terascale atmospheric CO2 and climate data, and we show scalability results on up to 65,536 IBM BlueGene/P cores.
An efficient parallel algorithm for matrix-vector multiplication
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hendrickson, B.; Leland, R.; Plimpton, S.
The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/√p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
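To make the n/√p term concrete, here is a minimal serial simulation of a two-dimensional block decomposition, in which each of p processors owns one (n/√p) x (n/√p) block of the matrix and only ever handles vector pieces of length n/√p. This is an illustrative sketch under that assumption, not the paper's hypercube algorithm; the function name `block_matvec` and the sample data are hypothetical, and real inter-processor communication is replaced by ordinary loops.

```python
import math


def block_matvec(matrix, vector, p):
    """Serially simulate y = A x on a sqrt(p) x sqrt(p) processor grid."""
    n = len(vector)
    q = math.isqrt(p)
    if q * q != p or n % q != 0:
        raise ValueError("p must be a perfect square whose root divides n")
    b = n // q  # block edge: each processor sees vector pieces of length n/sqrt(p)
    y = [0.0] * n
    for pi in range(q):        # processor row index
        for pj in range(q):    # processor column index
            # Local work of processor (pi, pj): its b x b block times its slice of x.
            for i in range(pi * b, (pi + 1) * b):
                partial = sum(matrix[i][j] * vector[j]
                              for j in range(pj * b, (pj + 1) * b))
                # Stand-in for the sum-reduction across one processor row.
                y[i] += partial
    return y


# Hypothetical 4 x 4 example on a 2 x 2 processor grid.
A = [[1, 2, 0, 0],
     [0, 1, 0, 3],
     [4, 0, 1, 0],
     [0, 0, 2, 1]]
x = [1, 2, 3, 4]
print(block_matvec(A, x, p=4))
```

Because every processor exchanges only length-n/√p pieces regardless of which matrix entries are nonzero, the sketch also hints at why the stated cost does not depend on the sparsity pattern.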
Parallel consistent labeling algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Samal, A.; Henderson, T.
Mackworth and Freuder have analyzed the time complexity of several constraint satisfaction algorithms. Mohr and Henderson have given new algorithms, AC-4 and PC-3, for arc and path consistency, respectively, and have shown that the arc consistency algorithm is optimal in time complexity and of the same order space complexity as the earlier algorithms. In this paper, they give parallel algorithms for solving node and arc consistency. They show that any parallel algorithm for enforcing arc consistency in the worst case must have O(na) sequential steps, where n is the number of nodes, and a is the number of labels per node. They give several parallel algorithms to do arc consistency. It is also shown that they all have optimal time complexity. The results of running the parallel algorithms on a BBN Butterfly multiprocessor are also presented.
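For readers unfamiliar with arc consistency, the following is a minimal sequential sketch of the classic AC-3 worklist algorithm. It is a deliberately simpler stand-in: it is neither AC-4 nor any of the parallel algorithms from the paper, and the data layout (a dict of label sets plus a dict of directed-arc predicates) and the toy problem are assumptions made for illustration.

```python
from collections import deque


def ac3(domains, constraints):
    """Enforce arc consistency with the AC-3 worklist algorithm.

    domains: dict mapping node -> set of candidate labels (modified in place).
    constraints: dict mapping a directed arc (x, y) -> predicate(label_x, label_y).
    Returns False if some domain becomes empty, True otherwise.
    """
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        allowed = constraints[(x, y)]
        # Labels of x that have no supporting label left in y.
        unsupported = {a for a in domains[x]
                       if not any(allowed(a, b) for b in domains[y])}
        if unsupported:
            domains[x] -= unsupported
            if not domains[x]:
                return False
            # Arcs pointing at x (other than the one from y) must be rechecked.
            queue.extend(arc for arc in constraints
                         if arc[1] == x and arc[0] != y)
    return True


# Hypothetical toy problem: A < B < C with labels 1..3 per node.
less = lambda a, b: a < b
greater = lambda a, b: a > b
domains = {"A": {1, 2, 3}, "B": {1, 2, 3}, "C": {1, 2, 3}}
constraints = {("A", "B"): less, ("B", "A"): greater,
               ("B", "C"): less, ("C", "B"): greater}
print(ac3(domains, constraints), domains)
```

The chain of deletions in this example (C loses labels only after B does, which in turn depends on A) gives some intuition for the worst-case sequential dependency that the abstract's O(na) bound refers to.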
Suppressing correlations in massively parallel simulations of lattice models
NASA Astrophysics Data System (ADS)
Kelling, Jeffrey; Ódor, Géza; Gemming, Sibylle
2017-11-01
For lattice Monte Carlo simulations parallelization is crucial to make studies of large systems and long simulation time feasible, while sequential simulations remain the gold-standard for correlation-free dynamics. Here, various domain decomposition schemes are compared, concluding with one which delivers virtually correlation-free simulations on GPUs. Extensive simulations of the octahedron model for 2 + 1 dimensional Kardar-Parisi-Zhang surface growth, which is very sensitive to correlation in the site-selection dynamics, were performed to show self-consistency of the parallel runs and agreement with the sequential algorithm. We present a GPU implementation providing a speedup of about 30 × over a parallel CPU implementation on a single socket and at least 180 × with respect to the sequential reference.
Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2002-01-01
Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.
Spurrier, Francis R.; Pierce, Bill L.; Wright, Maynard K.
1986-01-01
A plate for a fuel cell has an arrangement of ribs defining an improved configuration of process gas channels and slots on a surface of the plate which provide a modified serpentine gas flow pattern across the plate surface. The channels are generally linear and arranged parallel to one another while the spaced slots allow cross channel flow of process gas in a staggered fashion which creates a plurality of generally mini-serpentine flow paths extending transverse to the longitudinal gas flow along the channels. Adjacent pairs of the channels are interconnected to one another in flow communication. Also, a bipolar plate has the aforementioned process gas channel configuration on one surface and another configuration on the opposite surface. In the other configuration, there are no slots and the gas flow channels have a generally serpentine configuration.
Solak, Murat; Kiliç, Mehmet; Hüseyin, Yazici; Sencan, Aziz
2009-12-15
In this study, the removal of suspended solids (SS) and turbidity from marble processing wastewaters by the electrocoagulation (EC) process was investigated using aluminium (Al) and iron (Fe) electrodes run in serial and parallel connection systems. To remove these pollutants from the marble processing wastewater, an EC reactor with monopolar electrodes (Al/Fe) in parallel and serial connection systems was utilized. The effects of operating parameters such as pH, current density, and electrolysis time on SS and turbidity removal were optimized in this way. The EC process with monopolar Al electrodes in parallel and serial connections, carried out at the optimum conditions (pH 9, current density of approximately 15 A/m², and electrolysis time of 2 min), resulted in 100% SS removal. Removal efficiencies of the EC process for SS with monopolar Fe electrodes in parallel and serial connection were found to be 99.86% and 99.94%, respectively. The optimum parameters for monopolar Fe electrodes in both connection types were a pH of 8 and an electrolysis time of 2 min. The optimum current density for Fe electrodes in serial and parallel connections was 10 and 20 A/m², respectively. Based on the results obtained, the EC process was highly effective for the removal of SS and turbidity from marble processing wastewaters with each type of electrode and connection, and operating costs with monopolar Al electrodes in parallel connection were lower than those of the serial connection and of all configurations with Fe electrodes.
Echegaray, Sebastian; Bakr, Shaimaa; Rubin, Daniel L; Napel, Sandy
2017-10-06
The aim of this study was to develop an open-source, modular, locally run or server-based system for 3D radiomics feature computation that can be used on any computer system and included in existing workflows for understanding associations and building predictive models between image features and clinical data, such as survival. The QIFE exploits various levels of parallelization for use on multiprocessor systems. It consists of a managing framework and four stages: input, pre-processing, feature computation, and output. Each stage contains one or more swappable components, allowing run-time customization. We benchmarked the engine using various levels of parallelization on a cohort of CT scans presenting 108 lung tumors. Two versions of the QIFE have been released: (1) the open-source MATLAB code posted to GitHub, (2) a compiled version loaded in a Docker container, posted to DockerHub, which can be easily deployed on any computer. The QIFE processed 108 objects (tumors) in 2:12 (h/mm) using 1 core, and 1:04 (h/mm) using four cores with object-level parallelization. We developed the Quantitative Image Feature Engine (QIFE), an open-source feature-extraction framework that focuses on modularity, standards, parallelism, provenance, and integration. Researchers can easily integrate it with their existing segmentation and imaging workflows by creating input and output components that implement their existing interfaces. Computational efficiency can be improved by parallelizing execution at the cost of memory usage. Different parallelization levels provide different trade-offs, and the optimal setting will depend on the size and composition of the dataset to be processed.
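The two benchmark timings can be turned into a speed-up and a parallel efficiency with a few lines of arithmetic. The sketch below assumes that "(h/mm)" means hours:minutes; the helper name `to_minutes` is hypothetical.

```python
def to_minutes(h_mm):
    """Convert an 'h:mm' string to whole minutes (assumed meaning of '(h/mm)')."""
    hours, minutes = h_mm.split(":")
    return int(hours) * 60 + int(minutes)


serial = to_minutes("2:12")    # 1 core, as reported in the abstract
parallel = to_minutes("1:04")  # 4 cores, object-level parallelization
cores = 4

speedup = serial / parallel      # how many times faster the parallel run is
efficiency = speedup / cores     # fraction of ideal linear scaling achieved
print(f"speed-up {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Under that reading, four cores roughly halve the run time, which is consistent with the abstract's remark that parallelization trades memory for speed and that the best setting depends on the dataset.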
A powered prosthetic ankle joint for walking and running.
Grimmer, Martin; Holgate, Matthew; Holgate, Robert; Boehler, Alexander; Ward, Jeffrey; Hollander, Kevin; Sugar, Thomas; Seyfarth, André
2016-12-19
Current prosthetic ankle joints are designed either for walking or for running. In order to mimic the capabilities of an able-bodied person, a powered prosthetic ankle for walking and running was designed. A powered system has the potential to reduce the limitations in range of motion and positive work output of passive walking and running feet. To perform the experiments, a controller capable of transitions between standing, walking, and running with speed adaptations was developed. In the first case study the system was mounted on an ankle bypass in parallel with the foot of a non-amputee subject. By this method the functionality of hardware and controller was proven. The Walk-Run ankle was capable of mimicking desired torque and angle trajectories in walking and running up to 2.6 m/s. At 4 m/s running, ankle angle could be matched while ankle torque could not. Limited ankle output power resulting from a suboptimal spring stiffness value was identified as a main reason. Further studies have to show to what extent the findings can be transferred to amputees.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gudino, N., E-mail: natalia.gudino@nih.gov; Sonmez, M.; Nielles-Vallespin, S.
2015-01-15
Purpose: To provide a rapid method to reduce the radiofrequency (RF) E-field coupling and consequent heating in long conductors in an interventional MRI (iMRI) setup. Methods: A driving function for device heating (W) was defined as the integration of the E-field along the direction of the wire and calculated through a quasistatic approximation. Based on this function, the phases of four independently controlled transmit channels were dynamically changed in a 1.5 T MRI scanner. During the different excitation configurations, the RF induced heating in a nitinol wire immersed in a saline phantom was measured by fiber-optic temperature sensing. Additionally, a minimization of W as a function of phase and amplitude values of the different channels and constrained by the homogeneity of the RF excitation field (B1) over a region of interest was proposed and its results tested on the benchtop. To analyze the validity of the proposed method, using a model of the array and phantom setup tested in the scanner, RF fields and SAR maps were calculated through finite-difference time-domain (FDTD) simulations. In addition to phantom experiments, RF induced heating of an active guidewire inserted in a swine was also evaluated. Results: In the phantom experiment, heating at the tip of the device was reduced by 92% when replacing the body coil by an optimized parallel transmit excitation with the same nominal flip angle. In the benchtop, up to 90% heating reduction was measured when implementing the constrained minimization algorithm with the additional degree of freedom given by independent amplitude control. The computation of the optimum phase and amplitude values was executed in just 12 s using a standard CPU. The results of the FDTD simulations showed a similar trend of the local SAR at the tip of the wire and measured temperature as well as to a quadratic function of W, confirming the validity of the quasistatic approach for the presented problem at 64 MHz. 
Imaging and heating reduction of the guidewire were successfully performed in vivo with the proposed hardware and phase control. Conclusions: Phantom and in vivo data demonstrated that additional degrees of freedom in a parallel transmission system can be used to control RF induced heating in long conductors. A novel constrained optimization approach to reduce device heating was also presented that can be run in just a few seconds and therefore could be added to an iMRI protocol to improve RF safety.
Anderson, J.B.
1960-01-01
A reactor is described which comprises a tank, a plurality of coaxial steel sleeves in the tank, a mass of water in the tank, and wire grids in abutting relationship within a plurality of elongated parallel channels within the steel sleeves, the wire being provided with a plurality of bends in the same plane forming adjacent parallel sections between bends, and the sections of adjacent grids being normally disposed relative to each other.
NASA Astrophysics Data System (ADS)
Biswas, Rahul; Blackburn, Lindy; Cao, Junwei; Essick, Reed; Hodge, Kari Alison; Katsavounidis, Erotokritos; Kim, Kyungmin; Kim, Young-Min; Le Bigot, Eric-Olivier; Lee, Chang-Hwan; Oh, John J.; Oh, Sang Hoon; Son, Edwin J.; Tao, Ye; Vaulin, Ruslan; Wang, Xiaoge
2013-09-01
The sensitivity of searches for astrophysical transients in data from the Laser Interferometer Gravitational-wave Observatory (LIGO) is generally limited by the presence of transient, non-Gaussian noise artifacts, which occur at a high enough rate such that accidental coincidence across multiple detectors is non-negligible. These “glitches” can easily be mistaken for transient gravitational-wave signals, and their robust identification and removal will help any search for astrophysical gravitational waves. We apply machine-learning algorithms (MLAs) to the problem, using data from auxiliary channels within the LIGO detectors that monitor degrees of freedom unaffected by astrophysical signals. Noise sources may produce artifacts in these auxiliary channels as well as the gravitational-wave channel. The number of auxiliary-channel parameters describing these disturbances may also be extremely large; high dimensionality is an area where MLAs are particularly well suited. We demonstrate the feasibility and applicability of three different MLAs: artificial neural networks, support vector machines, and random forests. These classifiers identify and remove a substantial fraction of the glitches present in two different data sets: four weeks of LIGO’s fourth science run and one week of LIGO’s sixth science run. We observe that all three algorithms agree on which events are glitches to within 10% for the sixth-science-run data, and support this by showing that the different optimization criteria used by each classifier generate the same decision surface, based on a likelihood-ratio statistic. Furthermore, we find that all classifiers obtain similar performance to the benchmark algorithm, the ordered veto list, which is optimized to detect pairwise correlations between transients in LIGO auxiliary channels and glitches in the gravitational-wave data. 
This suggests that most of the useful information currently extracted from the auxiliary channels is already described by this model. Future performance gains are thus likely to involve additional sources of information, rather than improvements in the classification algorithms themselves. We discuss several plausible sources of such new information as well as the ways of propagating it through the classifiers into gravitational-wave searches.
Argonne Simulation Framework for Intelligent Transportation Systems
DOT National Transportation Integrated Search
1996-01-01
A simulation framework has been developed which defines a high-level architecture for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System (ITS). The simulator is designed to run on parallel computers and distribu...
Exercising in a hot environment: which T-shirt to wear?
Sperlich, Billy; Born, Dennis-Peter; Lefter, Marie Denise; Holmberg, Hans-Christer
2013-09-01
The aim of this study was to investigate thermoregulatory, cardiorespiratory, metabolic, and perceptual responses while running in a hot environment (31.7° ± 1.0°C; 42% ± 3% relative humidity) and wearing T-shirts made from different fiber types. Eight well-trained men performed 4 tests wearing either a T-shirt made of 100% polyester with 4, 6, or 8 channels, or one made of 100% cotton. Each test consisted of 30 minutes running at 70% of peak oxygen uptake, followed by a ramp test to exhaustion and 15 minutes of recovery. There were no differences in skin, core, and body temperatures between fiber types during submaximal and high-intensity running (best P = .08). During recovery, body temperature and shivering/sweating sensations were lower when wearing 4- and 6-channel fibers (P ≤ .04) compared with cotton. The relative humidity at the chest and back were lower for all polyester T-shirts compared with cotton during and after submaximal and maximal running (P ≤ .007). Heart rate (best P = .10), oxygen uptake (P = .95), respiratory exchange ratio (best P = .93), ventilation (best P = .99), and blood lactate concentration (best P = .97) did not differ between the fiber types. Nor were any differences in time to exhaustion (best P = .76), ratings of perceived exertion (best P = .09), thermal sensation (best P = .07), or sensation of clothing wetness (best P = .36) discovered. Although statistical analysis revealed lower shivering/sweating sensations while wearing 4- and 6-channel fiber shirts during recovery, with an improved chest and back microenvironment for all polyester T-shirts, the question remains whether these differences are of any practical relevance because the performance of the well-trained men was unaffected. Wilderness Medical Society.
Method and apparatus for combinatorial chemistry
Foote, Robert S.
2007-02-20
A method and apparatus are provided for performing light-directed reactions in spatially addressable channels within a plurality of channels. One aspect of the invention employs photoactivatable reagents in solutions disposed into spatially addressable flow streams to control the parallel synthesis of molecules immobilized within the channels. The reagents may be photoactivated within a subset of channels at the site of immobilized substrate molecules or at a light-addressable site upstream from the substrate molecules. The method and apparatus of the invention find particular utility in the synthesis of biopolymer arrays, e.g., oligonucleotides, peptides and carbohydrates, and in the combinatorial synthesis of small molecule arrays for drug discovery.
Method and apparatus for combinatorial chemistry
Foote, Robert S [Oak Ridge, TN
2012-06-05
A method and apparatus are provided for performing light-directed reactions in spatially addressable channels within a plurality of channels. One aspect of the invention employs photoactivatable reagents in solutions disposed into spatially addressable flow streams to control the parallel synthesis of molecules immobilized within the channels. The reagents may be photoactivated within a subset of channels at the site of immobilized substrate molecules or at a light-addressable site upstream from the substrate molecules. The method and apparatus of the invention find particular utility in the synthesis of biopolymer arrays, e.g., oligonucleotides, peptides and carbohydrates, and in the combinatorial synthesis of small molecule arrays for drug discovery.
The Nano-Patch-Clamp Array: Microfabricated Glass Chips for High-Throughput Electrophysiology
NASA Astrophysics Data System (ADS)
Fertig, Niels
2003-03-01
Electrophysiology (i.e. patch clamping) remains the gold standard for pharmacological testing of putative ion channel active drugs (ICADs), but suffers from low throughput. A new ion channel screening technology based on microfabricated glass chip devices will be presented. The glass chips contain very fine apertures, which are used for whole-cell voltage clamp recordings as well as single channel recordings from mammalian cell lines. Chips containing multiple patch clamp wells will be used in a first bench-top device, which will allow perfusion and electrical readout of each well. This scalable technology will allow for automated, rapid and parallel screening on ion channel drug targets.
Coherent UDWDM PON with joint subcarrier reception at OLT.
Kottke, Christoph; Fischer, Johannes Karl; Elschner, Robert; Frey, Felix; Hilt, Jonas; Schubert, Colja; Schmidt, Daniel; Wu, Zifeng; Lankl, Berthold
2014-07-14
In this contribution, we report on the experimental investigation of an ultra-dense wavelength-division multiplexing (UDWDM) upstream link with up to 700 × 2.488 Gb/s polarization-division multiplexing differential quadrature phase-shift keying parallel upstream user channels transmitted over 80 km of standard single-mode fiber. We discuss challenges of the digital signal processing in the optical line terminal arising from the joint reception of several upstream user channels. We present solutions for resource and cost-efficient realization of the required channel separation, matched filtering, down-conversion and decimation as well as realization of the clock recovery and polarization demultiplexing for each individual channel.
First light results from the Hermes spectrograph at the AAT
NASA Astrophysics Data System (ADS)
Sheinis, Andrew; Barden, Sam; Birchall, Michael; Carollo, Daniela; Bland-Hawthorn, Joss; Brzeski, Jurek; Case, Scott; Cannon, Russell; Churilov, Vladimir; Couch, Warrick; Dean, Robert; De Silva, Gayandhi; D'Orazi, Valentina; Farrell, Tony; Fiegert, Kristin; Freeman, Kenneth; Frost, Gabriella; Gers, Luke; Goodwin, Michael; Gray, Doug; Heald, Ron; Heijmans, Jeroen; Jones, Damien; Keller, Stephan; Klauser, Urs; Kondrat, Yuriy; Lawrence, Jon; Lee, Steve; Mali, Slavko; Martell, Sarah; Mathews, Darren; Mayfield, Don; Miziarski, Stan; Muller, Rolf; Pai, Naveen; Patterson, Robert; Penny, Ed; Orr, David; Shortridge, Keith; Simpson, Jeffrey; Smedley, Scott; Smith, Greg; Stafford, Darren; Staszak, Nicholas; Vuong, Minh; Waller, Lewis; Wylie de Boer, Elizabeth; Xavier, Pascal; Zheng, Jessica; Zhelem, Ross; Zucker, Daniel
2014-07-01
The High Efficiency and Resolution Multi Element Spectrograph, HERMES, is a facility-class optical spectrograph for the AAT. It is designed primarily for Galactic Archeology [21], the first major attempt to create a detailed understanding of galaxy formation and evolution by studying the history of our own galaxy, the Milky Way. The goal of the GALAH survey is to reconstruct the mass assembly history of the Milky Way through a detailed spatially tagged abundance study of one million stars. The spectrograph is based at the Anglo Australian Telescope (AAT) and is fed by the existing 2dF robotic fiber positioning system. The spectrograph uses VPH gratings to achieve a spectral resolving power of 28,000 in standard mode and also provides a high-resolution mode ranging between 40,000 and 50,000 using a slit mask. The GALAH survey requires a SNR greater than 100 for a star brightness of V=14. The total spectral coverage of the four channels is about 100 nm between 370 and 1000 nm for up to 392 simultaneous targets within the 2 degree field of view. HERMES has been commissioned over 3 runs, during bright time in October, November and December 2013, in parallel with the beginning of the GALAH Pilot survey starting in November 2013. In this paper we present the first-light results from the commissioning run and the beginning of the GALAH Survey, including performance results such as throughput and resolution, as well as instrument reliability. We compare the abundance calculations from the pilot survey to those in the literature.
EFFECTS OF HOT HALO GAS ON STAR FORMATION AND MASS TRANSFER DURING DISTANT GALAXY–GALAXY ENCOUNTERS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hwang, Jeong-Sun; Park, Changbom, E-mail: jshwang@kias.re.kr, E-mail: cbp@kias.re.kr
2015-06-01
We use N-body/smoothed particle hydrodynamics simulations of encounters between an early-type galaxy (ETG) and a late-type galaxy (LTG) to study the effects of hot halo gas on the evolution for a case with the mass ratio of the ETG to LTG of 2:1 and the closest approach distance of ∼100 kpc. We find that the dynamics of the cold disk gas in the tidal bridge and the amount of the newly formed stars depend strongly on the existence of a gas halo. In the run of interacting galaxies not having a hot gas halo, the gas and stars accreted into the ETG do not include newly formed stars. However, in the run using the ETG with a gas halo and the LTG without a gas halo, a shock forms along the disk gas tidal bridge and induces star formation near the closest approach. The shock front is parallel to a channel along which the cold gas flows toward the center of the ETG. As a result, the ETG can accrete star-forming cold gas and newly born stars at and near its center. When both galaxies have hot gas halos, a shock is formed between the two gas halos somewhat before the closest approach. The shock hinders the growth of the cold gas bridge to the ETG and also ionizes it. Only some of the disk stars transfer through the stellar bridge. We conclude that the hot halo gas can give significant hydrodynamic effects during distant encounters.
All solid state mid-infrared dual-comb spectroscopy platform based on QCL technology
NASA Astrophysics Data System (ADS)
Hugi, Andreas; Geiser, Markus; Villares, Gustavo; Cappelli, Francesco; Blaser, Stephane; Faist, Jérôme
2015-01-01
We develop a spectroscopy platform for industrial applications based on semiconductor quantum cascade laser (QCL) frequency combs. The platform's key features will be an unmatched combination of bandwidth of 100 cm⁻¹, resolution of 100 kHz, speed of ten to hundreds of μs as well as size and robustness, opening doors to previously unreachable markets. The sensor can be built extremely compact and robust since the laser source is an all-electrically pumped semiconductor optical frequency comb and no mechanical elements are required. However, the parallel acquisition of dual-comb spectrometers comes at the price of enormous data-rates. For system scalability, robustness and optical simplicity we use free-running QCL combs. Therefore no complicated optical locking mechanisms are required. To reach high signal-to-noise ratios, we develop an algorithm which is based on a combination of coherent and non-coherent averaging. This algorithm is specifically optimized for free-running and small-footprint, and therefore high-repetition-rate, comb sources. As a consequence, our system generates data-rates of up to 3.2 GB/sec. These data-rates need to be reduced by several orders of magnitude in real-time in order to be useful for spectral fitting algorithms. We present the development of a data-treatment solution which reaches a single-channel throughput of 22% using a standard laptop computer. Using a state-of-the-art desktop computer, the throughput is increased to 43%. This is combined with a data-acquisition board into a stand-alone data processing unit, allowing real-time industrial process observation and continuous averaging to achieve highest signal fidelity.
Algal Biomass as an Indicator for Biochemical Oxygen Demand in the San Joaquin River, California.
NASA Astrophysics Data System (ADS)
Volkmar, E. C.; Dalhgren, R. A.
2005-12-01
Episodes of hypoxia (DO < 2 mg/L) occur in the lower San Joaquin River (SJR), California, and are typically most acute in the late summer and fall. The oxygen deficit can stress and kill aquatic organisms, and often inhibits the upstream migration of fall-run Chinook salmon. Hypoxia is most pronounced downstream from the Stockton Deep Water Ship Channel, which has been dredged from a depth of 2-3 m to about 11 m to allow ocean-going ships to reach the Port of Stockton. To protect aquatic organisms and facilitate the upstream migration of fall-run Chinook salmon, the minimum water quality standard for DO is 6 mg/L during September through November, and 5 mg/L for the remainder of the year. A five-year study examined components contributing to biochemical oxygen demand (BOD): ammonia, algal biomass, non-algal particulate organic matter, and dissolved organic carbon. BOD shows a significant increase in loading rates as the SJR flows downstream, which parallels the load of algal biomass due to instream growth. BOD loading from tributaries accounts for 28% in a wet year and 39% in a dry year. Regression analysis revealed that chlorophyll-a + pheophytin-a was the only significant (p<0.05) predictor for BOD (r² = 0.71). Less than 20% of the BOD was found in the dissolved fraction (<0.45 μm). The average BOD decomposition rate of the SJR and tributaries is 0.0841 d⁻¹. We conclude that algal biomass is the primary contributor to BOD loads in the San Joaquin River.
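To give the reported decomposition rate some physical meaning, the sketch below converts it into a half-life and a remaining-BOD curve. It assumes simple first-order decay, which is a common textbook model for BOD but is not stated in the abstract; the function name `bod_remaining` and the starting value of 10 mg/L are hypothetical.

```python
import math

K_PER_DAY = 0.0841  # average BOD decomposition rate reported in the abstract (1/day)


def bod_remaining(initial_bod, days, k=K_PER_DAY):
    """BOD not yet exerted after `days`, assuming first-order decay (an assumption)."""
    return initial_bod * math.exp(-k * days)


# Time for half of the oxygen demand to be exerted under the same assumption.
half_life_days = math.log(2) / K_PER_DAY

print(round(half_life_days, 2))            # half-life in days
print(round(bod_remaining(10.0, 5.0), 2))  # mg/L left after 5 days from 10 mg/L
```

Under this assumption the half-life is a little over eight days, which suggests that much of the oxygen demand generated by algal growth upstream could still be unexerted when the water reaches the ship channel.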
User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Earth Sciences Division; Zhang, Keni; Zhang, Keni
TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on the TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. 
This report provides a quick starting guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version TOUGH2 code, The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used. To familiarize users with the parallel code, illustrative sample problems are presented.« less
Guérin, Bastien; Stockmann, Jason P; Baboli, Mehran; Torrado-Carvajal, Angel; Stenger, Andrew V; Wald, Lawrence L
2016-08-01
To design parallel transmission spokes pulses with time-shifted profiles for joint mitigation of intensity variations due to B1+ effects, signal loss due to through-plane dephasing, and the specific absorption rate (SAR) at 7T. We derived a slice-averaged small tip angle (SA-STA) approximation of the magnetization signal at echo time that depends on the B1+ transmit profiles, the through-slice B0 gradient and the amplitude and time-shifts of the spoke waveforms. We minimize a magnitude least-squares objective based on this signal equation using a fast interior-point approach with analytical expressions of the Jacobian and Hessian. Our algorithm runs in less than three minutes for the design of two-spoke pulses subject to hundreds of local SAR constraints. On a B0/B1+ head phantom, joint optimization of the channel-dependent time-shifts and spoke amplitudes allowed signal recovery in high-B0 regions at no increase of SAR. Although the method creates uniform magnetization profiles (i.e., uniform intensity), the flip angle varies across the image, which makes it ill-suited to T1-weighted applications. The SA-STA approach presented in this study is best suited to T2*-weighted applications with long echo times that require signal recovery around high-B0 regions. Magn Reson Med 76:540-554, 2016. © 2015 Wiley Periodicals, Inc.
Terascale Cluster for Advanced Turbulent Combustion Simulations
2008-07-25
the system We have given the name CATS (for Combustion And Turbulence Simulator) to the terascale system that was obtained through this grant. CATS ...InfiniBand interconnect. CATS includes an interactive login node and a file server, each holding in excess of 1 terabyte of file storage. The 35 active...compute nodes of CATS enable us to run up to 140-core parallel MPI batch jobs; one node is reserved to run the scheduler. CATS is operated and
Coupled Ocean/Atmospheric Mesoscale Prediction System (COAMPS), Version 5.0 (User’s Guide)
2010-03-30
provides tools for common modeling functions, as well as regridding, data decomposition, and communication on parallel computers. NRL/MR/7320--10...specified gncomDir. If running COAMPS at the DSRC (e.g. BABBAGE, DAVINCI, or EINSTEIN), the global NCOM files will be copied to /scr/[user]/COAMPS/data...the site (DSRC or local) and the platform (BABBAGE, DAVINCI, EINSTEIN, or local machine) on which COAMPS is being run. site=navy_dsrc (for DSRC
Demonstration and Commercialization of the Sediment Ecosystem Assessment Protocol (SEAP)
2017-07-09
undergone severe erosion (Peeling 1975). Zuniga Jetty, which runs parallel to Point Loma at the bay’s inlet, was built to control erosion near the inlet...consistent conditions and level of effort required to run the tests. A per site unit cost is less amenable to a field-based deployment, given the many...support in situ testing: 1) a standard exposure of spores to a reference toxicant dilution series; and 2) exposure of sporophyll blades to a
A 32-Channel Combined RF and B0 Shim Array for 3T Brain Imaging
Stockmann, Jason P.; Witzel, Thomas; Keil, Boris; Polimeni, Jonathan R.; Mareyam, Azma; LaPierre, Cristen; Setsompop, Kawin; Wald, Lawrence L.
2016-01-01
Purpose: We add user-controllable direct currents (DC) to the individual elements of a 32-channel radio-frequency (RF) receive array to provide B0 shimming ability while preserving the array’s reception sensitivity and parallel imaging performance. Methods: Shim performance using constrained DC current (±2.5 A) is simulated for brain arrays ranging from 8 to 128 elements. A 32-channel 3-tesla brain array is realized using inductive chokes to bridge the tuning capacitors on each RF loop. The RF and B0 shimming performance is assessed in bench and imaging measurements. Results: The addition of DC currents to the 32-channel RF array is achieved with minimal disruption of the RF performance and/or negative side effects such as conductor heating or mechanical torques. The shimming results agree well with simulations and show performance superior to third-order spherical harmonic (SH) shimming. Imaging tests show the ability to reduce the standard frontal lobe susceptibility-induced fields and improve echo planar imaging geometric distortion. The simulation of 64- and 128-channel brain arrays suggests that even further shimming improvement is possible (equivalent to up to 6th-order SH shim coils). Conclusion: Including user-controlled shim currents on the loops of a conventional highly parallel brain array coil is feasible with modest current levels and produces improved B0 shimming performance over standard second-order SH shimming. PMID:25689977
Maji, Kaushik; Kouri, Donald J
2011-03-28
We have developed a new method for solving quantum dynamical scattering problems, using the time-independent Schrödinger equation (TISE), based on a novel method to generalize a "one-way" quantum mechanical wave equation, impose correct boundary conditions, and eliminate exponentially growing closed channel solutions. The approach is readily parallelized to achieve approximate N² scaling, where N is the number of coupled equations. The full two-way nature of the TISE is included while propagating the wave function in the scattering variable and the full S-matrix is obtained. The new algorithm is based on a "Modified Cayley" operator splitting approach, generalizing earlier work where the method was applied to the time-dependent Schrödinger equation. All scattering variable propagation approaches to solving the TISE involve solving a Helmholtz-type equation, and for more than one degree of freedom, these are notoriously ill-behaved, due to the unavoidable presence of exponentially growing contributions to the numerical solution. Traditionally, the method used to eliminate exponential growth has posed a major obstacle to the full parallelization of such propagation algorithms. We stabilize by using the Feshbach projection operator technique to remove all the nonphysical exponentially growing closed channels, while retaining all of the propagating open channel components, as well as exponentially decaying closed channel components.
Running Neuroimaging Applications on Amazon Web Services: How, When, and at What Cost?
Madhyastha, Tara M; Koh, Natalie; Day, Trevor K M; Hernández-Fernández, Moises; Kelley, Austin; Peterson, Daniel J; Rajan, Sabreena; Woelfer, Karl A; Wolf, Jonathan; Grabowski, Thomas J
2017-01-01
The contribution of this paper is to identify and describe current best practices for using Amazon Web Services (AWS) to execute neuroimaging workflows "in the cloud." Neuroimaging offers a vast set of techniques by which to interrogate the structure and function of the living brain. However, many of the scientists for whom neuroimaging is an extremely important tool have limited training in parallel computation. At the same time, the field is experiencing a surge in computational demands, driven by a combination of data-sharing efforts, improvements in scanner technology that allow acquisition of images with higher image resolution, and by the desire to use statistical techniques that stress processing requirements. Most neuroimaging workflows can be executed as independent parallel jobs and are therefore excellent candidates for running on AWS, but the overhead of learning to do so and determining whether it is worth the cost can be prohibitive. In this paper we describe how to identify neuroimaging workloads that are appropriate for running on AWS, how to benchmark execution time, and how to estimate cost of running on AWS. By benchmarking common neuroimaging applications, we show that cloud computing can be a viable alternative to on-premises hardware. We present guidelines that neuroimaging labs can use to provide a cluster-on-demand type of service that should be familiar to users, and scripts to estimate cost and create such a cluster.
NASA Technical Reports Server (NTRS)
Eberhardt, D. S.; Baganoff, D.; Stevens, K.
1984-01-01
Implicit approximate-factored algorithms have certain properties that are suitable for parallel processing. A particular computational fluid dynamics (CFD) code, using this algorithm, is mapped onto a multiple-instruction/multiple-data-stream (MIMD) computer architecture. An explanation of this mapping procedure is presented, as well as some of the difficulties encountered when trying to run the code concurrently. Timing results are given for runs on the Ames Research Center's MIMD test facility which consists of two VAX 11/780's with a common MA780 multi-ported memory. Speedups exceeding 1.9 for characteristic CFD runs were indicated by the timing results.
NASA Technical Reports Server (NTRS)
Davidson, J.; Ottey, H. R.; Sawitz, P.; Zusman, F. S.
1985-01-01
The appendixes of the user manual are presented. Input forms which may be used to prepare data for the SOUP5V3.4 of the R2BCSAT-83 data base are given. The IBM job control language which can be used to run the SOUP5 system from a magnetic tape is described. Copies of a run using the delivered tape and IBM OS/MVS Job Control Language card deck are illustrated. Numerical limits on scenario data requests are listed. Error handling, error messages and editing procedures are also listed. Instructions on how to enter a protection ratio template are given. The relations between the PARC parameter, channelization, channel families, and interference categories are also listed.
DNA Assembly with De Bruijn Graphs Using an FPGA Platform.
Poirier, Carl; Gosselin, Benoit; Fortier, Paul
2018-01-01
This paper presents an FPGA implementation of a DNA assembly algorithm, called Ray, initially developed to run on parallel CPUs. The OpenCL language is used and the focus is placed on modifying and optimizing the original algorithm to better suit the new parallelization tool and the radically different hardware architecture. The results show that the execution time is roughly one fourth that of the CPU, and factoring in energy consumption yields a tenfold savings.
Multiprocessor graphics computation and display using transputers
NASA Technical Reports Server (NTRS)
Ellis, Graham K.
1988-01-01
A package of two-dimensional graphics routines was developed to run on a transputer-based parallel processing system. These routines were designed to enable applications programmers to easily generate and display results from the transputer network in a graphic format. The graphics procedures were designed for the lowest possible network communication overhead for increased performance. The routines were designed for ease of use and to present an intuitive approach to generating graphics on the transputer parallel processing system.
An object-oriented approach to nested data parallelism
NASA Technical Reports Server (NTRS)
Sheffler, Thomas J.; Chatterjee, Siddhartha
1994-01-01
This paper describes an implementation technique for integrating nested data parallelism into an object-oriented language. Data-parallel programming employs sets of data called 'collections' and expresses parallelism as operations performed over the elements of a collection. When the elements of a collection are also collections, then there is the possibility for 'nested data parallelism.' Few current programming languages support nested data parallelism however. In an object-oriented framework, a collection is a single object. Its type defines the parallel operations that may be applied to it. Our goal is to design and build an object-oriented data-parallel programming environment supporting nested data parallelism. Our initial approach is built upon three fundamental additions to C++. We add new parallel base types by implementing them as classes, and add a new parallel collection type called a 'vector' that is implemented as a template. Only one new language feature is introduced: the 'foreach' construct, which is the basis for exploiting elementwise parallelism over collections. The strength of the method lies in the compilation strategy, which translates nested data-parallel C++ into ordinary C++. Extracting the potential parallelism in nested 'foreach' constructs is called 'flattening' nested parallelism. We show how to flatten 'foreach' constructs using a simple program transformation. Our prototype system produces vector code which has been successfully run on workstations, a CM-2, and a CM-5.
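The idea of 'flattening' described above can be illustrated with a minimal sketch. This is not the paper's C++ transformation or its 'foreach' construct; it is a Python illustration of the general technique, in which a nested collection is split into one flat data array plus per-segment lengths so that a single flat elementwise pass replaces the nested loops. All function names are hypothetical.

```python
from itertools import accumulate


def flatten(nested):
    """Split a nested collection into flat data plus the length of each segment."""
    flat = [x for inner in nested for x in inner]
    lengths = [len(inner) for inner in nested]
    return flat, lengths


def unflatten(flat, lengths):
    """Rebuild the nested shape from flat data and segment lengths."""
    offsets = [0] + list(accumulate(lengths))
    return [flat[offsets[i]:offsets[i + 1]] for i in range(len(lengths))]


def nested_foreach(nested, fn):
    """Apply fn to every inner element using one flat pass instead of nested loops."""
    flat, lengths = flatten(nested)
    mapped = [fn(x) for x in flat]  # the single flat, data-parallel step
    return unflatten(mapped, lengths)


if __name__ == "__main__":
    print(nested_foreach([[1, 2], [], [3, 4, 5]], lambda x: x * x))
```

The flat pass is the part a vector machine can execute in parallel regardless of how uneven the inner collections are; the segment lengths carry the nesting structure separately.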
HYDROGEN ELECTROLYZER FLOW DISTRIBUTOR MODEL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shadday, M
2006-09-28
The hybrid sulfur process (HyS) hydrogen electrolyzer consists of a proton exchange membrane (PEM) sandwiched between two porous graphite layers. An aqueous solution of sulfuric acid with dissolved SO2 gas flows parallel to the PEM through the porous graphite layer on the anode side of the electrolyzer. A flow distributor, consisting of a number of parallel channels acting as headers, promotes uniform flow of the anolyte fluid through the porous graphite layer. A numerical model of the hydraulic behavior of the flow distributor is herein described. This model was developed to be a tool to aid the design of flow distributors. The primary design objective is to minimize spatial variations in the flow through the porous graphite layer. The hydraulic data from electrolyzer tests consist of overall flowrate and pressure drop. Internal pressure and flow distributions are not measured, but these details are provided by the model. The model has been benchmarked against data from tests of the current electrolyzer. The model reasonably predicts the viscosity effect of changing the fluid from water to an aqueous solution of 30% sulfuric acid. The permeability of the graphite layer was the independent variable used to fit the model to the test data, and the required permeability for a good fit is within the range of literature values for carbon paper. The model predicts that reducing the number of parallel channels by 50% will substantially improve the uniformity of the flow in the porous graphite layer, while maintaining an acceptable pressure drop across the electrolyzer. When the size of the electrolyzer is doubled from 2.75 inches square to 5.5 inches square, the same number of channels as in the current design will be adequate, but it is advisable to increase the channel cross-sectional flow area. This is due to the increased length of the channels.
PEM Water Electrolysis: Preliminary Investigations Using Neutron Radiography
NASA Astrophysics Data System (ADS)
de Beer, Frikkie; van der Merwe, Jan-Hendrik; Bessarabov, Dmitri
The quasi-dynamic water distribution and performance of a proton exchange membrane (PEM) electrolyzer at both a small fuel cell's anode and cathode was observed and quantitatively measured in the in-plane imaging geometry direction (neutron beam parallel to membrane and with channels parallel to the beam) by applying the neutron radiography principle at the neutron imaging facility (NIF) of NIST, Gaithersburg, USA. The test section had 6 parallel channels with an active area of 5 cm² and in-situ neutron radiography observation entails the liquid water content along the total length of each of the channels. The acquisition was made with a neutron CMOS-camera system with performance of 10 sec per frame to achieve a relatively good pixel dynamic range and at a pixel resolution of 10 × 10 μm². A relatively high S/N ratio was achieved in the radiographs to observe in quasi real time the water management as well as quantification of water / gas within the channels. The water management has been observed at increased steps (0.2 A/cm²) of current densities until 2 V potential has been achieved. These observations were made at 2 different water flow rates, at 3 temperatures for each flow rate and repeated for both the vertical and horizontal electrolyzer orientation geometries. It is observed that there is water crossover from the anode through the membrane to the cathode. A first order quantification (neutron scattering correction not included) shows that the physical vertical and horizontal orientation of the fuel cell as well as the temperature of the system up to 80 °C has no significant influence on the percentage water (∼18%) that crossed over into the cathode. Additionally, a higher water content was observed in the Gas Diffusion Layer at the position of the channels with respect to the lands.
An 8-channel skin impedance measurement system for acupuncture research.
Thong, Tran; Colbert, Agatha P; Larsen, Adrian P
2009-01-01
An 8-channel skin impedance measurement system for acupuncture research has been developed. The underlying model of the skin used is a parallel R & C network. Pulses are used to measure the R and C values. The measurement circuit is time multiplexed across the 8 channels at the rate of 2 measurements per second, leading to a complete set of measurements every 4 seconds. In static tests, the system has been operational for over 2 days of continuous measurements. In preliminary human tests, measurements over 2 hours have been collected per subject.
A mechanism for efficient debugging of parallel programs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miller, B.P.; Choi, J.D.
1988-01-01
This paper addresses the design and implementation of an integrated debugging system for parallel programs running on shared memory multi-processors (SMMP). The authors describe the use of flowback analysis to provide information on causal relationships between events in a program's execution without re-executing the program for debugging. The authors introduce a mechanism called incremental tracing that, by using semantic analyses of the debugged program, makes the flowback analysis practical with only a small amount of trace generated during execution. The authors extend flowback analysis to apply to parallel programs and describe a method to detect race conditions in the interactions of the co-operating processes.
A Framework for Load Balancing of Tensor Contraction Expressions via Dynamic Task Partitioning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lai, Pai-Wei; Stock, Kevin; Rajbhandari, Samyam
In this paper, we introduce the Dynamic Load-balanced Tensor Contractions (DLTC), a domain-specific library for efficient task parallel execution of tensor contraction expressions, a class of computation encountered in quantum chemistry and physics. Our framework decomposes each contraction into smaller units of tasks, represented by an abstraction referred to as iterators. We exploit an extra level of parallelism by having tasks across independent contractions executed concurrently through a dynamic load balancing runtime. We demonstrate the improved performance, scalability, and flexibility for the computation of tensor contraction expressions on parallel computers using examples from coupled cluster methods.
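As a rough illustration of decomposing a contraction into tasks and balancing them dynamically, the sketch below splits a matrix product (the simplest tensor contraction) into one task per output row and lets idle workers pull the next task from a shared queue. It does not use or reproduce the DLTC library or its iterator abstraction; every name here is hypothetical, and Python threads are used only to show the scheduling pattern, not to gain speed.

```python
import queue
import threading


def run_dynamic(tasks, num_workers=4):
    """Run zero-argument callables from a shared queue; idle workers take the next one."""
    work = queue.Queue()
    for index, task in enumerate(tasks):
        work.put((index, task))
    results = [None] * len(tasks)

    def worker():
        while True:
            try:
                index, task = work.get_nowait()
            except queue.Empty:
                return  # no tasks left for this worker
            results[index] = task()  # each index is written by exactly one worker

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def contract_row(a_row, b):
    """One task: C[i][j] = sum_k A[i][k] * B[k][j] for a fixed row i."""
    return [sum(a_row[k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]


def contract(a, b, num_workers=4):
    """Contract A and B over their shared index, one dynamically scheduled task per row."""
    tasks = [lambda row=row: contract_row(row, b) for row in a]
    return run_dynamic(tasks, num_workers)


if __name__ == "__main__":
    print(contract([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Because workers pull tasks on demand rather than receiving a fixed share up front, a worker that draws cheap tasks simply processes more of them, which is the basic appeal of dynamic over static partitioning.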
Kawai, Takayuki; Sueyoshi, Kenji; Kitagawa, Fumihiko; Otsuka, Koji
2010-08-01
The applicability of an online preconcentration technique, large-volume sample stacking with an electroosmotic flow pump (LVSEP), to microchip zone electrophoresis (MCZE) for the analysis of oligosaccharides was investigated. Since the sample stacking and separation proceeded continuously without polarity switching in LVSEP, a single "straight" channel microchip could be employed. In the MCZE analysis of oligosaccharides, sample adsorption onto the channel surface should be suppressed, so the straight microchannel was modified with poly(vinyl alcohol) (PVA). So far, the mechanism of LVSEP in the polymer-coated capillary or microchannel has not been reported, and thus, the LVSEP process in the PVA-coated channel was investigated by fluorescence imaging. Although it is well-known that the PVA coating can suppress the electroosmotic flow (EOF), an enhanced EOF with a mobility of 4.4 × 10⁻⁴ cm²/(V·s) was observed in a low ionic strength sample solution. It was revealed that such temporarily enhanced EOF in the sample zone worked as the driving force to remove the sample matrix in LVSEP. To evaluate the analytical performance of LVSEP-MCZE, oligosaccharides were analyzed in the PVA-coated straight channel. As a result, both the glucose ladder and oligosaccharides obtained from bovine ribonuclease B were well enriched and separated with up to 2200-2900-fold sensitivity enhancement compared to those in a conventional MCZE analysis. The run-to-run repeatabilities of the migration time and peak height were good with relative standard deviations of 1.1% and 7.2%, respectively, which were better than those of normal MCZE. By applying the LVSEP technique to MCZE, a complicated voltage program for fluidic control could be simplified from four channels for two steps to two channels for one step.
Liter-scale production of uniform gas bubbles via parallelization of flow-focusing generators.
Jeong, Heon-Ho; Yadavali, Sagar; Issadore, David; Lee, Daeyeon
2017-07-25
Microscale gas bubbles have demonstrated enormous utility as versatile templates for the synthesis of functional materials in medicine, ultra-lightweight materials and acoustic metamaterials. In many of these applications, high uniformity of the size of the gas bubbles is critical to achieve the desired properties and functionality. While microfluidics have been used with success to create gas bubbles that have a uniformity not achievable using conventional methods, the inherently low volumetric flow rate of microfluidics has limited its use in most applications. Parallelization of liquid droplet generators, in which many droplet generators are incorporated onto a single chip, has shown great promise for the large scale production of monodisperse liquid emulsion droplets. However, the scale-up of monodisperse gas bubbles using such an approach has remained a challenge because of possible coupling between parallel bubble generators and feedback effects from the downstream channels. In this report, we systematically investigate the effect of factors such as viscosity of the continuous phase, capillary number, and gas pressure as well as the channel uniformity on the size distribution of gas bubbles in a parallelized microfluidic device. We show that, by optimizing the flow conditions, a device with 400 parallel flow focusing generators (FFGs) on a footprint of 5 × 5 cm² can be used to generate gas bubbles with a coefficient of variation of less than 5% at a production rate of approximately 1 L h⁻¹. Our results suggest that the optimization of flow conditions using a device with a small number (e.g., 8) of parallel FFGs can facilitate large-scale bubble production.
Simulation of ozone production in a complex circulation region using nested grids
NASA Astrophysics Data System (ADS)
Taghavi, M.; Cautenet, S.; Foret, G.
2003-07-01
During the ESCOMPTE precampaign (15 June to 10 July 2000), three days of intensive pollution (IOP0) were observed and simulated. The comprehensive RAMS model, version 4.3, coupled online with a chemical module including 29 species, has been used to follow the chemistry of the polluted zone over southern France. This online method can be used because the code is parallelized and the SGI 3800 computer is very powerful. Two runs have been performed: run1 with one grid and run2 with two nested grids. The redistribution of simulated chemical species (ozone, carbon monoxide, sulphur dioxide and nitrogen oxides) was compared to aircraft measurements and surface stations. The 2-grid run has given substantially better results than the one-grid run because the former takes the outer pollutants into account. This online method helps to explain dynamics and to retrieve the chemical species redistribution with good agreement.
Simulation of ozone production in a complex circulation region using nested grids
NASA Astrophysics Data System (ADS)
Taghavi, M.; Cautenet, S.; Foret, G.
2004-06-01
During the ESCOMPTE precampaign (summer 2000, over Southern France), a 3-day period of intensive observation (IOP0), associated with ozone peaks, has been simulated. The comprehensive RAMS model, version 4.3, coupled on-line with a chemical module including 29 species, is used to follow the chemistry of the polluted zone. This efficient but time consuming method can be used because the code is installed on a parallel computer, the SGI 3800. Two runs are performed: run 1 with a single grid and run 2 with two nested grids. The simulated fields of ozone, carbon monoxide, nitrogen oxides and sulfur dioxide are compared with aircraft and surface station measurements. The 2-grid run looks substantially better than the run with one grid because the former takes the outer pollutants into account. This on-line method helps to satisfactorily retrieve the chemical species redistribution and to explain the impact of dynamics on this redistribution.
NASA Astrophysics Data System (ADS)
Enomoto, Ayano; Hirata, Hiroshi
2014-02-01
This article describes a feasibility study of parallel image-acquisition using a two-channel surface coil array in continuous-wave electron paramagnetic resonance (CW-EPR) imaging. Parallel EPR imaging was performed by multiplexing of EPR detection in the frequency domain. The parallel acquisition system consists of two surface coil resonators and radiofrequency (RF) bridges for EPR detection. To demonstrate the feasibility of this method of parallel image-acquisition with a surface coil array, three-dimensional EPR imaging was carried out using a tube phantom. Technical issues in the multiplexing method of EPR detection were also clarified. We found that degradation in the signal-to-noise ratio due to the interference of RF carriers is a key problem to be solved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tang, Guoping; D'Azevedo, Ed F; Zhang, Fan
2010-01-01
Calibration of groundwater models involves hundreds to thousands of forward solutions, each of which may solve many transient coupled nonlinear partial differential equations, resulting in a computationally intensive problem. We describe a hybrid MPI/OpenMP approach to exploit two levels of parallelism in software and hardware to reduce calibration time on multi-core computers. HydroGeoChem 5.0 (HGC5) is parallelized using OpenMP for direct solutions for a reactive transport model application, and a field-scale coupled flow and transport model application. In the reactive transport model, a single parallelizable loop is identified to account for over 97% of the total computational time using GPROF. Addition of a few lines of OpenMP compiler directives to the loop yields a speedup of about 10 on a 16-core compute node. For the field-scale model, parallelizable loops in 14 of 174 HGC5 subroutines that require 99% of the execution time are identified. As these loops are parallelized incrementally, the scalability is found to be limited by a loop where Cray PAT detects cache miss rates of over 90%. With this loop rewritten, a speedup similar to that of the first application is achieved. The OpenMP-parallelized code can be run efficiently on multiple workstations in a network or multiple compute nodes on a cluster as slaves using parallel PEST to speed up model calibration. To run calibration on clusters as a single task, the Levenberg-Marquardt algorithm is added to HGC5 with the Jacobian calculation and lambda search parallelized using MPI. With this hybrid approach, 100-200 compute cores are used to reduce the calibration time from weeks to a few hours for these two applications. This approach is applicable to most of the existing groundwater model codes for many applications.
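The reported figures for the reactive transport case (a loop taking over 97% of run time, a speedup of about 10 on 16 cores) can be checked against Amdahl's law. The sketch below is an illustrative calculation, not part of HGC5; it treats exactly 97% as the parallel fraction, which is an assumption since the abstract says only "over 97%", and the function name is hypothetical.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only `parallel_fraction` of the run time scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # 0.97 is ASSUMED from the abstract's "over 97%"; 16 cores is the reported node size.
    print(round(amdahl_speedup(0.97, 16), 2))
    # Limit as the core count grows without bound: 1 / serial fraction.
    print(round(1.0 / (1.0 - 0.97), 1))
```

With these inputs the bound is about 11 on 16 cores, so the reported speedup of about 10 is close to what the loop's share of run time allows; the same formula shows why the remaining 3% caps the speedup near 33 no matter how many cores are added.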
Colas, Fanny; Archaimbault, Virginie; Devin, Simon
2011-03-01
Due to their nutrient recycling function and their importance in food-webs, macroinvertebrates are essential for the functioning of aquatic ecosystems. These organisms also constitute an important component of biodiversity. Sediment evaluation and monitoring is an essential aspect of ecosystem monitoring since sediments represent an important component of aquatic habitats and are also a potential source of contamination. In this study, we focused on macroinvertebrate communities within run-of-river dams, which are prime areas for sediment and pollutant accumulation. Little is known about littoral macroinvertebrate communities within run-of-river dams or their response to sediment levels and pollution. We therefore aimed to evaluate the following aspects: the functional and structural composition of macroinvertebrate communities in run-of-river dams; the impact of pollutant accumulation on such communities; and the most efficient scales and tools needed for the biomonitoring of contaminated sediments in such environments. Two run-of-river dams located in the French alpine area were selected and three spatial scales were examined: transversal (banks and channel), transversal x longitudinal (banks/channel x tail/middle/dam) and patch scale (erosion, sedimentation and vegetation habitats). At the patch scale, we noted that the heterogeneity of littoral habitats provided many available niches that allow for the development of diversified macroinvertebrate communities. This implies highly variable responses to contamination. Once combined on a global 'banks' spatial scale, littoral habitats can highlight the effects of toxic disturbances. Copyright © 2011 Elsevier B.V. All rights reserved.
Irausquin, Roelof A.; Scavelli, Thomas D.; Corti, Lisa; Stefanacci, Joseph D.; DeMarco, Joann; Flood, Shannon; Rohrbach, Barton W.
2008-01-01
Evaluation of dogs with splenic masses to better educate owners as to the extent of the disease is a goal of many research studies. We compared the use of ultrasonography (US) and contrast-enhanced computed tomography (CT) to evaluate the accuracy of detecting hepatic neoplasia in dogs with splenic masses, independently, in series, or in parallel. No significant difference was found between US and CT. If the presence or absence of ascites, as detected with US, was used as a pretest probability of disease in our population, the positive predictive value increased to 94% if the tests were run in series, and the negative predictive value increased to 95% if the tests were run in parallel. The study showed that CT combined with US could be a valuable tool in evaluation of dogs with splenic masses. PMID:18320977
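The pattern reported above, with series testing raising the positive predictive value and parallel testing raising the negative predictive value, follows from how two tests combine. The sketch below shows the standard textbook formulas under the assumption that the two tests are conditionally independent; the sensitivity, specificity, and prevalence values are hypothetical and are not taken from the study.

```python
def combine_series(sens1, spec1, sens2, spec2):
    """Call the result positive only if BOTH tests are positive (assumes independence)."""
    return sens1 * sens2, 1 - (1 - spec1) * (1 - spec2)


def combine_parallel(sens1, spec1, sens2, spec2):
    """Call the result positive if EITHER test is positive (assumes independence)."""
    return 1 - (1 - sens1) * (1 - sens2), spec1 * spec2


def predictive_values(sens, spec, prevalence):
    """Return (PPV, NPV) for a test applied at the given pretest probability."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    true_neg = spec * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    # HYPOTHETICAL values for illustration only; not the US/CT figures from the study.
    sens, spec, prevalence = 0.8, 0.9, 0.5
    print("single:  ", predictive_values(sens, spec, prevalence))
    print("series:  ", predictive_values(*combine_series(sens, spec, sens, spec), prevalence))
    print("parallel:", predictive_values(*combine_parallel(sens, spec, sens, spec), prevalence))
```

Series testing trades sensitivity for specificity, so a positive result becomes more trustworthy; parallel testing does the reverse, so a negative result becomes more trustworthy.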