Science.gov

Sample records for parallel blade-vortex interaction

  1. Vortex dynamics during blade-vortex interactions

    NASA Astrophysics Data System (ADS)

    Peng, Di; Gregory, James W.

    2015-05-01

    Vortex dynamics during parallel blade-vortex interactions (BVIs) were investigated in a subsonic wind tunnel using particle image velocimetry (PIV). Vortices were generated by applying a rapid pitch-up motion to an airfoil through a pneumatic system, and the subsequent interactions with a downstream, unloaded target airfoil were studied. The blade-vortex interactions may be classified into three categories in terms of vortex behavior: close interaction, very close interaction, and collision. For each type of interaction, the vortex trajectory and strength variation were obtained from phase-averaged PIV data. The PIV results revealed the mechanisms of vortex decay and the effects of several key parameters on vortex dynamics, including separation distance (h/c), Reynolds number, and vortex sense. Generally, BVI has two main stages: interaction between vortex and leading edge (vortex-LE interaction) and interaction between vortex and boundary layer (vortex-BL interaction). Vortex-LE interaction, with its small separation distance, is dominated by inviscid decay of vortex strength due to pressure gradients near the leading edge. Therefore, the decay rate is determined by separation distance and vortex strength, but it is relatively insensitive to Reynolds number. Vortex-LE interaction will become a viscous-type interaction if there is enough separation distance. Vortex-BL interaction is inherently dominated by viscous effects, so the decay rate is dependent on Reynolds number. Vortex sense also has great impact on vortex-BL interaction because it changes the velocity field and shear stress near the surface.
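
    The abstract does not give the post-processing details; for reference, vortex strength is commonly estimated from phase-averaged PIV fields by integrating the out-of-plane vorticity over the core region. A minimal Python sketch of that standard step is shown below (the array layout, names, and fixed integration radius are illustrative assumptions, not details taken from the paper):

```python
import numpy as np

# Illustrative sketch only (not the authors' code): estimate vortex circulation
# from a phase-averaged 2-D PIV velocity field on a uniform meshgrid.
# X, Y, U, V are 2-D arrays; (x0, y0) is the detected vortex center.
def circulation(X, Y, U, V, x0, y0, radius):
    dx = X[0, 1] - X[0, 0]
    dy = Y[1, 0] - Y[0, 0]
    # out-of-plane vorticity: omega_z = dV/dx - dU/dy
    omega = np.gradient(V, dx, axis=1) - np.gradient(U, dy, axis=0)
    # sum the vorticity over a circular region about the vortex center
    mask = (X - x0) ** 2 + (Y - y0) ** 2 <= radius ** 2
    return np.sum(omega[mask]) * dx * dy
```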

  2. Rotorcraft Blade-Vortex Interaction Controller

    NASA Technical Reports Server (NTRS)

    Schmitz, Fredric H. (Inventor)

    1995-01-01

    Blade-vortex interaction noises, sometimes referred to as 'blade slap', are avoided by increasing the absolute value of inflow to the rotor system of a rotorcraft. This is accomplished by creating a drag force which causes the angle of the tip-path plane of the rotor system to become more negative or more positive.

  3. An analysis of blade vortex interaction aerodynamics and acoustics

    NASA Technical Reports Server (NTRS)

    Lee, D. J.

    1985-01-01

    The impulsive noise associated with helicopter flight due to Blade-Vortex Interaction, sometimes called blade slap, is analyzed, especially for the case of a close encounter of the blade-tip vortex with a following blade. Three parts of the phenomenon are considered: the tip-vortex structure generated by the rotating blade, the unsteady pressure produced on the following blade during the interaction, and the acoustic radiation due to the unsteady pressure field. To simplify the problem, the analysis was confined to the situation where the vortex is aligned parallel to the blade span, in which case the maximum acoustic pressure results. Acoustic radiation due to the interaction is analyzed in space-fixed coordinates and in the time domain, with the unsteady pressure on the blade surface as the source of chordwise-compact but spanwise non-compact radiation. The maximum acoustic pressure is related to the vortex core size and Reynolds number, which are in turn functions of the blade-tip aerodynamic parameters. Finally, noise reduction and performance are considered.

  4. Rotating hot-wire investigation of the vortex responsible for blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Fontana, Richard Remo

    1988-01-01

    The distribution of the circumferential velocity of the vortex responsible for blade-vortex interaction noise was measured using a rotating hot-wire rake synchronously meshed with a model helicopter rotor at the blade passage frequency. Simultaneous far-field acoustic data and blade differential pressure measurements were obtained. Results show that the shape of the measured far-field acoustic blade-vortex interaction signature depends on the blade-vortex interaction geometry. The experimental results are compared with the Widnall-Wolf model for blade-vortex interaction noise.

  5. Rotor blade system with reduced blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Leishman, John G. (Inventor); Han, Yong Oun (Inventor)

    2005-01-01

    A rotor blade system with reduced blade-vortex interaction noise includes a plurality of tube members embedded in proximity to a tip of each rotor blade. The inlets of the tube members are arrayed at the leading edge of the blade slightly above the chord plane, while the outlets are arrayed at the blade tip face. Such a design rapidly diffuses the vorticity contained within the concentrated tip vortex because of enhanced flow mixing in the inner core, which prevents the development of a laminar core region.

  6. Flow visualizations of perpendicular blade vortex interactions

    NASA Technical Reports Server (NTRS)

    Rife, Michael C.; Devenport, William J.

    1992-01-01

    Helium bubble flow visualizations have been performed to study the perpendicular interaction of a turbulent trailing vortex and a rectangular wing in the Virginia Tech Stability Tunnel. Many combinations of vortex strength, vortex-blade separation (Z(sub s)) and blade angle of attack were studied. Photographs of representative cases are presented. A range of phenomena were observed. For Z(sub s) greater than a few percent chord the vortex is deflected as it passes the blade under the influence of the local streamline curvature and its image in the blade. Initially the interaction appears to have no influence on the core. Downstream, however, the vortex core begins to diffuse and grow, presumably as a consequence of its interaction with the blade wake. The magnitude of these effects increases with reduction in Z(sub s). For Z(sub s) near zero the form of the interaction changes and becomes dependent on the vortex strength. For lower strengths the vortex appears to split into two filaments on the leading edge of the blade, one passing on the pressure and one passing on the suction side. At higher strengths the vortex bursts in the vicinity of the leading edge. In either case the core or its remnants then rapidly diffuse with distance downstream. Increase in Reynolds number did not qualitatively affect the flow apart from decreasing the amplitude of the small low-frequency wandering motions of the vortex. Changes in wing tip geometry and boundary layer trip had very little effect.

  7. A Novel Method for Reducing Rotor Blade-Vortex Interaction

    NASA Technical Reports Server (NTRS)

    Glinka, A. T.

    2000-01-01

    One of the major hindrances to expansion of the rotorcraft market is the high-amplitude noise rotorcraft produce, especially during low-speed descent, where blade-vortex interactions frequently occur. In an attempt to reduce the noise levels caused by blade-vortex interactions, the flip-tip rotor blade concept was devised. The flip-tip rotor increases the miss distance between the shed vortices and the rotor blades, reducing BVI noise. The distance is increased by rotating an outboard portion of the rotor tip either up or down depending on the flight condition. The proposed plan for the grant consisted of a computational simulation of the rotor aerodynamics and its wake geometry to determine the effectiveness of the concept, coupled with a series of wind tunnel experiments exploring the value of the device and validating the computer model. The computational model did in fact show that the miss distance could be increased, giving a measure of the effectiveness of the flip-tip rotor. However, the wind tunnel experiments could not be conducted. Increased outside demand for the 7- by 10-Foot Wind Tunnel at NASA Ames and the low priority given to this project at Ames forced numerous postponements of the tests, eventually pushing them beyond the life of the grant. A design for the rotor blades to be tested in the wind tunnel was completed, and an analysis of the strength of the model blades based on predicted loads, including dynamic forces, was performed.

  8. Recent studies of rotorcraft blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Preisser, J. S.; Brooks, T. F.; Martin, R. M.

    1994-01-01

    Recent results are presented from several research efforts aimed at the understanding of rotorcraft blade-vortex interaction (BVI) in terms of the noise generation, directivity, and control. The results are based on work performed by NASA Langley Research Center researchers, both alone and in collaboration with other research organizations. Based on analysis of a simplified physical model, the critical parameters controlling BVI noise generation have been identified. The detailed mapping of the acoustic radiation field of a model rotor in a wind tunnel has revealed the extreme sensitivity of directivity to rotor advance ratio and disk attitude. The control and reduction of BVI noise through the use of higher harmonic pitch control is discussed.

  9. A parametric study of transonic blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Lyrintzis, A. S.

    1991-01-01

    Several parameters of transonic blade-vortex interactions (BVI) are studied, and some ideas for noise reduction are introduced and tested using numerical simulation. The model used is the two-dimensional high frequency transonic small disturbance equation with regions of distributed vorticity (VTRAN2 code). The far-field noise signals are obtained by using the Kirchhoff method, which extends the numerical 2-D near-field aerodynamic results to the linear acoustic 3-D far-field. The BVI noise mechanisms are explained, and the effects of vortex type and strength, and angle of attack are studied. In particular, airfoil shape modifications which lead to noise reduction are investigated. The results presented are expected to be helpful for a better understanding of the nature of BVI noise and for better blade design.

  10. Helicopter tail rotor blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    George, Albert R.; Chou, S.-T.

    1987-01-01

    A study is made of helicopter tail rotor noise, particularly that due to interactions with main rotor tip vortices. Summarized here are the present analysis, the computer codes, and the results of several test cases. Amiet's unsteady thin airfoil theory is used to calculate the acoustics of blade-vortex interaction. The noise source is modelled as a force dipole resulting from an airfoil of infinite span chopping through a skewed line vortex. To analyze the interactions between the helicopter tail rotor and main rotor tip vortices, we developed a two-step approach: (1) the main rotor tip vortex system is obtained through a free wake geometry calculation of the main rotor using the CAMRAD code; (2) the acoustic analysis takes the results from the aerodynamic interaction analysis and calculates the far-field pressure signatures for the interactions. It is found that under a wide range of helicopter flight conditions, acoustic pressure fluctuations of significant magnitude can be generated by tail rotors due to a series of interactions with main rotor tip vortices. This noise mechanism depends strongly on the helicopter flight conditions and the relative location and phasing of the main and tail rotors.
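
    For reference, in the simplest limit of the force-dipole model described above (a compact, stationary point force F_i(t); Amiet's theory generalizes this to a non-compact, convecting load), the far-field acoustic pressure takes the standard form

        p'(\mathbf{x}, t) \approx \frac{x_i}{4\pi c_0 r^2}\,\frac{\partial F_i}{\partial t}\!\left(t - \frac{r}{c_0}\right), \qquad r = |\mathbf{x}|,

    so the radiated sound scales with the time rate of change of the unsteady blade force and carries the cosine directivity of a dipole aligned with that force.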

  11. Transonic blade-vortex interactions noise: A parametric study

    NASA Technical Reports Server (NTRS)

    Lyrintzis, A. S.; Xue, Y.

    1990-01-01

    Transonic Blade-Vortex Interactions (BVI) are simulated numerically and the noise mechanisms are investigated. The 2-D high frequency transonic small disturbance equation is solved numerically (VTRAN2 code). An Alternating Direction Implicit (ADI) scheme with monotone switches is used; viscous effects are included on the boundary and the vortex is simulated by the cloud-in-cell method. The Kirchhoff method is used for the extension of the numerical 2-D near-field aerodynamic results to the linear acoustic 3-D far field. The viscous effect (shock/boundary layer interaction) on BVI is investigated. The different types of shock motion are identified and compared. Two important disturbances with different directivity exist in the pressure signal and are believed to be related to the fluctuating lift and drag forces. Noise directivity for different cases is shown. The maximum radiation occurs at an angle between 60 and 90 deg below the horizontal for an airfoil-fixed coordinate system and depends on the details of the airfoil shape. Different airfoil shapes are studied and classified according to the BVI noise produced.

  12. HART-II: Prediction of Blade-Vortex Interaction Loading

    NASA Technical Reports Server (NTRS)

    Lim, Joon W.; Tung, Chee; Yu, Yung H.; Burley, Casey L.; Brooks, Thomas; Boyd, Doug; vanderWall, Berend; Schneider, Oliver; Richard, Hugues; Raffel, Markus

    2003-01-01

    During the HART-I data analysis, the need for comprehensive wake data became apparent, including vortex creation and aging and the wake's re-development after blade-vortex interaction. In October 2001, the US Army AFDD, NASA Langley, the German DLR, the French ONERA and the Dutch DNW performed the HART-II test as an international joint effort. The main objective was to focus on rotor wake measurement using a PIV technique, along with comprehensive data on blade deflections, airloads, and acoustics. Three prediction teams made preliminary correlation efforts with the HART-II data: a joint US team of the US Army AFDD and NASA Langley, the German DLR, and the French ONERA. The predicted results showed significant improvements over the HART-I predictions computed several years ago, indicating improved understanding of the complicated wake modeling in comprehensive rotorcraft analysis. All three teams demonstrated satisfactory prediction capabilities in general, though prediction accuracy varied slightly across the various disciplines.

  13. Neural control of helicopter blade-vortex interaction noise

    NASA Astrophysics Data System (ADS)

    Glaessel, Holger; Kloeppel, Valentin; Rudolph, Stephan

    2001-06-01

    Significant reduction of helicopter blade-vortex interaction (BVI) noise is currently one of the most advanced research topics in the helicopter industry. This is due to the complex flow, the close aerodynamic and structural coupling, and the interaction of the blades with the trailing edge vortices. Analytical and numerical modeling techniques are therefore currently still far from a sufficient degree of accuracy to obtain satisfactory results using classical model-based control concepts. Neural networks, with a proven potential to learn nonlinear relationships implicitly encoded in a training data set, are therefore an appropriate and complementary technique for the alternative design of a nonlinear controller for BVI noise reduction. For nonlinear and adaptive control, different neural control strategies have been proposed. Two possible approaches, a direct and an indirect neural controller, are described. In indirect neural control, the plant has to be identified first by training a network with measured data. The plant network is then used to train the controller network. On the other hand, the direct control approach does not rely on an explicit plant model; instead, a specific training algorithm (such as reinforcement learning) uses the information gathered from interactions with the environment. In the investigation of the BVI noise phenomena, helicopter developers have undertaken substantial efforts in full-scale flight tests and wind tunnel experiments. Data obtained in these experiments have been adequately preprocessed using wavelet analysis and filtering techniques and are then used in the design of a neural controller. Neural open-loop control and neural closed-loop control concepts for the BVI noise reduction problem are conceived, simulated and compared against each other in this work within the above-mentioned framework.
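
    As a rough illustration of the indirect strategy described above, one common realization is to fit the plant network to measured input/output data and then train the controller by backpropagating through the frozen plant model. The sketch below is a deliberately simplified example, not the controller used in the paper: the scalar signals, the surrogate plant data, and the use of PyTorch are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

# Sketch of indirect neural control: (1) identify a plant model from data,
# (2) freeze it and train a controller through it to reduce the noise metric.
plant = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))
controller = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))

# Step 1: plant identification from (control input u, measured noise metric y).
u = torch.linspace(-1.0, 1.0, 200).unsqueeze(1)
y = torch.sin(2.0 * u) + 0.1 * torch.randn_like(u)   # surrogate measurements
opt = torch.optim.Adam(plant.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = ((plant(u) - y) ** 2).mean()
    loss.backward()
    opt.step()

# Step 2: freeze the plant model and train the controller (flight state x ->
# control u) through it, driving the predicted noise metric toward zero.
for p in plant.parameters():
    p.requires_grad_(False)
x = torch.linspace(-1.0, 1.0, 200).unsqueeze(1)       # surrogate flight states
opt = torch.optim.Adam(controller.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = plant(controller(x)).pow(2).mean()
    loss.backward()
    opt.step()
```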

  14. Euler solutions for self-generated rotor blade-vortex interactions

    NASA Technical Reports Server (NTRS)

    Hassan, A. A.; Tung, C.; Sankar, L. N.

    1990-01-01

    A finite-difference procedure was developed, on the basis of the conservation form of the unsteady three-dimensional Euler equations, for the prediction of rotor blade-vortex interactions (BVIs). Numerical solution procedures were obtained for the analysis of the model parallel BVIs and the more realistic helicopter self-generated-rotor BVIs. It was found that, for self-generated subcritical interactions, the accuracy of the predicted leading edge pressures relied heavily on the user-specified vortex core radius and on the CAMRAD-code-predicted geometry of the interaction vortex elements and their relative orientation with respect to the blade. It was also found that the free-wake model used in CAMRAD to predict the tip vortex trajectory for use in the Euler solution yields lower streamwise and higher axial wake convective velocities than those inferred from the experimental data.

  15. The effect of tip vortex structure on helicopter noise due to blade/vortex interaction

    NASA Technical Reports Server (NTRS)

    Wolf, T. L.; Widnall, S. E.

    1978-01-01

    A potential cause of helicopter impulsive noise, commonly called blade slap, is the unsteady lift fluctuation on a rotor blade due to interaction with the vortex trailed from another blade. The relationship between vortex structure and the intensity of the acoustic signal is investigated. The analysis is based on a theoretical model for blade/vortex interaction. Unsteady lift on the blades due to blade/vortex interaction is calculated using linear unsteady aerodynamic theory, and expressions are derived for the directivity, frequency spectrum, and transient signal of the radiated noise. An inviscid rollup model is used to calculate the velocity profile in the trailing vortex from the spanwise distribution of blade tip loading. A few cases of tip loading are investigated, and numerical results are presented for the unsteady lift and acoustic signal due to blade/vortex interaction. The intensity of the acoustic signal is shown to be quite sensitive to changes in tip vortex structure.

  16. Reduction of Helicopter Blade-Vortex Interaction Noise by Active Rotor Control Technology

    NASA Technical Reports Server (NTRS)

    Yu, Yung H.; Gmelin, Bernd; Splettstoesser, Wolf; Brooks, Thomas F.; Philippe, Jean J.; Prieur, Jean

    1997-01-01

    Helicopter blade-vortex interaction noise is one of the most severe noise sources and is very important for both community annoyance and military detection. Research over the decades has substantially improved basic physical understanding of the mechanisms generating rotor blade-vortex interaction noise and of techniques for controlling it, particularly using active rotor control technology. This paper reviews active rotor control techniques currently available for rotor blade-vortex interaction noise reduction, including higher harmonic pitch control, individual blade control, and on-blade control technologies. The basic physical mechanisms of each active control technique are reviewed in terms of the noise reduction mechanism and the controlling aerodynamic or structural parameters of a blade. Active rotor control techniques using smart structures/materials are discussed, including distributed smart actuators to induce local torsional or flapping deformations.

  17. Rotorcraft blade/vortex interaction noise - Its generation, radiation, and control

    NASA Technical Reports Server (NTRS)

    Preisser, J. S.; Brooks, T. F.; Martin, R. M.

    1990-01-01

    Recent results are presented from several research efforts aimed at the understanding of rotorcraft blade-vortex interaction noise generation, directivity, and control. The results are based on work performed by researchers at the NASA Langley Research Center, both alone and in collaboration with other research organizations. Based on analysis of a simplified physical model, the critical parameters controlling the noise generation are identified. Detailed mapping of the acoustic radiation field reveals the extreme sensitivity of directivity to rotor advance ratio and disk attitude. A means of controlling blade-vortex interaction noise by higher harmonic pitch control is discussed.

  18. A comparison of model helicopter rotor Primary and Secondary blade/vortex interaction blade slap

    NASA Technical Reports Server (NTRS)

    Hubbard, J. E., Jr.; Leighton, K. P.

    1983-01-01

    A study of the relative importance of blade/vortex interactions which occur on the retreating side of a model helicopter rotor disk is described. Some of the salient characteristics of this phenomenon are presented and discussed. It is shown that the resulting Secondary blade slap may be of equal or greater intensity than the advancing side (Primary) blade slap. Instrumented model helicopter rotor data is presented which reveals the nature of the retreating blade/vortex interaction. The importance of Secondary blade slap as it applies to predictive techniques or approaches is discussed. When Secondary blade slap occurs it acts to enlarge the window of operating conditions for which blade slap exists.

  19. Helicopter Blade-Vortex Interaction Noise with Comparisons to CFD Calculations

    NASA Technical Reports Server (NTRS)

    McCluer, Megan S.

    1996-01-01

    A comparison of experimental acoustics data and computational predictions was performed for a helicopter rotor blade interacting with a parallel vortex. The experiment was designed to examine the aerodynamics and acoustics of parallel Blade-Vortex Interaction (BVI) and was performed in the Ames Research Center (ARC) 80- by 120-Foot Subsonic Wind Tunnel. An independently generated vortex interacted with a small-scale, nonlifting helicopter rotor at the 180 deg azimuth angle to create the interaction in a controlled environment. Computational Fluid Dynamics (CFD) was used to calculate near-field pressure time histories. The CFD code, called Transonic Unsteady Rotor Navier-Stokes (TURNS), was used to make comparisons with the acoustic pressure measurement at two microphone locations and several test conditions. The test conditions examined included hover tip Mach numbers of 0.6 and 0.7, advance ratio of 0.2, positive and negative vortex rotation, and the vortex passing above and below the rotor blade by 0.25 rotor chords. The results show that the CFD qualitatively predicts the acoustic characteristics very well, but quantitatively overpredicts the peak-to-peak sound pressure level by 15 percent in most cases. There also exists a discrepancy in the phasing (about 4 deg) of the BVI event in some cases. Additional calculations were performed to examine the effects of vortex strength, thickness, time accuracy, and directionality. This study validates the TURNS code for prediction of near-field acoustic pressures of controlled parallel BVI.

  20. Full-Potential Modeling of Blade-Vortex Interactions. Degree awarded by George Washington Univ., Feb. 1987

    NASA Technical Reports Server (NTRS)

    Jones, Henry E.

    1997-01-01

    A study of the full-potential modeling of a blade-vortex interaction was made. A primary goal of this study was to investigate the effectiveness of the various methods of modeling the vortex. The model problem restricts the interaction to that of an infinite wing with an infinite line vortex moving parallel to its leading edge. This problem provides a convenient testing ground for the various methods of modeling the vortex while retaining the essential physics of the full three-dimensional interaction. A full-potential algorithm specifically tailored to solve the blade-vortex interaction (BVI) was developed to solve this problem. The basic algorithm was modified to include the effect of a vortex passing near the airfoil. Four different methods of modeling the vortex were used: (1) the angle-of-attack method, (2) the lifting-surface method, (3) the branch-cut method, and (4) the split-potential method. A side-by-side comparison of the four models was conducted. These comparisons included comparing generated velocity fields, a subcritical interaction, and a critical interaction. The subcritical and critical interactions are compared with experimentally generated results. The split-potential model was used to make a survey of some of the more critical parameters which affect the BVI.

  1. A mechanism for mitigation of blade-vortex interaction using leading edge blowing flow control

    NASA Astrophysics Data System (ADS)

    Weiland, Chris; Vlachos, Pavlos P.

    2009-09-01

    The interaction of a vortical unsteady flow with structures is often encountered in engineering applications. Such flow-structure interactions (FSI) can be responsible for generating significant loads and can have many detrimental structural and acoustic side effects, such as structural fatigue, radiated noise and even catastrophic consequences. Amongst the different types of FSI, the parallel blade-vortex interaction (BVI) is the most common, often encountered in helicopters and propulsors. In this work, we report on the implementation of leading edge blowing (LEB) active flow control for successfully minimizing the parallel BVI. Our results show a reduction of the airfoil vibrations of up to 38% based on the root-mean-square of the vibration velocity amplitude. This technique is based on displacing an incident vortex using a jet issued from the leading edge of a sharp airfoil, effectively increasing the stand-off distance of the vortex from the body. The effectiveness of the method was experimentally analyzed using time-resolved digital particle image velocimetry (TRDPIV) recorded at an 800 Hz rate, which is sufficient to resolve the spatio-temporal dynamics of the flow field, combined with simultaneous accelerometer measurements of the airfoil, which was free to oscillate in a direction perpendicular to the freestream. Analysis of the flow field spectra and a Proper Orthogonal Decomposition (POD) of the temporally resolved planar TRDPIV flow fields indicate that the LEB effectively modified the flow field surrounding the airfoil and increased the convecting vortices' stand-off distance for over half of the airfoil chord length. It is shown that LEB also causes a redistribution of the flow field spectral energy over a larger range of frequencies.
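
    The POD step mentioned above is typically carried out with the snapshot method; a minimal Python sketch of that standard decomposition follows (the array layout and names are illustrative assumptions, not details from the paper):

```python
import numpy as np

# Illustrative snapshot-POD sketch (not the authors' code). `snapshots` has
# shape (n_snapshots, n_points): each row is a flattened [u, v] field from
# one TRDPIV frame.
def pod(snapshots):
    mean = snapshots.mean(axis=0)
    fluct = snapshots - mean                      # remove the temporal mean
    # economy-size SVD: rows of Vt are the spatial POD modes,
    # and U * S are the corresponding temporal coefficients
    U, S, Vt = np.linalg.svd(fluct, full_matrices=False)
    energy = S**2 / np.sum(S**2)                  # relative energy per mode
    return mean, Vt, U * S, energy
```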

  2. Experimental blade vortex interaction noise characteristics of a utility helicopter at 1/4 scale

    NASA Technical Reports Server (NTRS)

    Conner, D. A.; Hoad, D. R.

    1984-01-01

    Models of both the advanced main rotor system and the standard or "baseline" UH-1 main rotor system were tested at one-quarter scale in the Langley 4- by 7-Meter (V/STOL) Tunnel using the general rotor model system. Tests were conducted over a range of descent angles which bracketed the blade-vortex interaction phenomenon for a range of simulated forward speeds. The tunnel was operated in the open-throat configuration with acoustic treatment to improve the semi-anechoic characteristics of the test chamber. Acoustical data obtained for these two rotor systems operating at similar flight conditions are presented without analysis or discussion.

  3. An Euler code calculation of blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Hardin, J. C.; Lamkin, S. L.

    1987-01-01

    An Euler code has been developed for calculation of noise radiation due to the interaction of a distributed vortex with a Joukowski airfoil. The time-dependent incompressible flow field is first determined and then integrated to yield the resulting sound production through use of the elegant low-frequency Green's function approach. This code has several interesting numerical features involved in the vortex motion and in continuous satisfaction of the Kutta condition. In addition, it removes the limitations on Reynolds number and is much more efficient than an earlier Navier-Stokes code. Results indicate that the noise production is due to the deceleration and subsequent acceleration of the vortex as it approaches and passes the airfoil. Predicted acoustic levels and frequencies agree with measured data, although a precise comparison would require the strength, size, and position of the incoming vortex to be known.

  4. Perspective: Numerical simulation of wakes and blade-vortex interaction

    SciTech Connect

    Dong, B. (Dept. of Engineering Science and Mechanics); Mook, D. T.

    1994-03-01

    A method for simulating incompressible flows past airfoils and their wakes is described. Vorticity panels are used to represent the body, and vortex blobs (vortex points with their singularities removed) are used to represent the wake. The procedure can be applied to the simulation of completely attached flow past an oscillating airfoil. The rate at which vorticity is shed from the trailing edge of the airfoil into the wake is determined by simultaneously requiring the pressure along the upper and lower surface streamlines to approach the same value at the trailing edge and the circulation around both the airfoil and its wake to remain constant. The motion of the airfoils is discretized, and a vortex is shed from the trailing edge at each time step. The vortices are convected at the local velocity of fluid particles, a procedure that renders the pressure continuous in an inviscid fluid. When the vortices in the wake begin to separate they are split into more vortices, and when they begin to collect they are combined. The numerical simulation reveals that the wake, which is originally smooth, eventually coils, or wraps, around itself, primarily under the influence of the velocity it induces on itself, and forms regions of relatively concentrated vorticity. Although discrete vortices are used to represent the wake, the spatial density of the vortices is so high that the computed velocity profiles across a typical region of concentrated vorticity are quite smooth. Although the computed wake evolves in an entirely inviscid model of the flowfield, these profiles appear to have a viscous core. As an application, a simulation of the interaction between vorticity in the oncoming stream and a stationary airfoil is also discussed.
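
    A minimal sketch of the wake representation described above, reduced to its core ingredient, is given below: two-dimensional vortex blobs convected at the velocity they induce on one another through a desingularized Biot-Savart kernel. The Gaussian core size, the complex-variable formulation, and the forward-Euler step are choices made for the example, not details taken from the paper.

```python
import numpy as np

# Illustrative sketch (not the authors' code): free 2-D vortex blobs convected
# at the locally induced velocity. Positions are complex numbers z = x + i*y.
def induced_velocity(z, positions, strengths, core=0.05):
    """Velocity (as u + i*v) at points z induced by Gaussian-regularized vortices."""
    vel = np.zeros_like(z, dtype=complex)
    for zk, gamma in zip(positions, strengths):
        dz = z - zk
        r2 = np.abs(dz) ** 2 + 1e-12
        # the regularization removes the point-vortex singularity at the blob center
        vel += (1.0 - np.exp(-r2 / core**2)) * 1j * gamma * dz / (2.0 * np.pi * r2)
    return vel

def convect(positions, strengths, dt, nsteps):
    for _ in range(nsteps):
        vel = induced_velocity(positions, positions, strengths)
        positions = positions + vel * dt          # forward-Euler convection step
    return positions
```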

  5. A study of blade-vortex interaction sound generation and directionality

    NASA Technical Reports Server (NTRS)

    Ringler, Todd D.; George, Albert R.; Steele, James B.

    1991-01-01

    The directionality and strength of blade-vortex interactions (BVI) is explained through the radiation cone concept. BVI acoustic radiation is primarily the result of two sound mechanisms: the tip effect, and the radiation cone effect. The radiation cone effect is a highly directional mechanism which results when a lift distribution moves supersonically with respect to the fluid. After a physical explanation of the BVI mechanisms, sample cases using translating and rotating blades interacting with a straight line vortex are shown. The radiation cone concept is then applied to specific rotorcraft cases where it helps to explain zones of intense sound pressure level found in experimental results for the XV-15 tiltrotor and for a BO-105 helicopter scale model.

  6. Blade-vortex interaction noise predictions using measured blade surface pressures

    NASA Technical Reports Server (NTRS)

    Ziegenbein, Perry R.; Oh, Byung K.

    1987-01-01

    The generation of helicopter noise by blade-vortex interactions during descent under impulsive conditions is investigated analytically. A noise-prediction technique is developed on the basis of the dipole source term of the Ffowcs-Williams/Hawkings equation and applied to data from simultaneous blade-pressure and acoustic measurements obtained by Cowan et al. (1986) on a 10-ft-diameter 4-blade rotor model in a wind tunnel. Preliminary results show that input-blade-airload azimuth resolution of 1 deg or better and computational azimuth step size of 2 deg or less are required to achieve good agreement between predicted and recorded acoustic time histories. The need for more sophisticated methods to model chordwise input data and for a more extensive experimental data base is indicated.

  7. Reduction of Blade-Vortex Interaction (BVI) noise through X-force control

    NASA Technical Reports Server (NTRS)

    Schmitz, Fredric H.

    1995-01-01

    Momentum theory and the longitudinal force balance equations of a single rotor helicopter are used to develop simple expressions to describe tip-path-plane tilt and uniform inflow to the rotor. The uniform inflow is adjusted to represent the inflow at certain azimuthal locations where strong Blade-Vortex Interaction (BVI) is likely to occur. This theoretical model is then used to describe the flight conditions where BVI is likely to occur and to explore those flight variables that can be used to minimize BVI noise radiation. A new X-force control is introduced to help minimize BVI noise. Several methods of generating the X-force are presented that can be used to alter the inflow to the rotor and thus increase the likelihood of avoiding BVI during approaches to a landing.
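
    The uniform-inflow expression referred to above is not reproduced in the abstract; its standard momentum-theory (Glauert) form is

        \lambda = \mu \tan\alpha_{TPP} + \frac{C_T}{2\sqrt{\mu^2 + \lambda^2}},

    where \mu is the advance ratio, \alpha_{TPP} the tip-path-plane angle of attack, and C_T the thrust coefficient. Because \lambda appears on both sides, the relation is usually solved by fixed-point iteration, \lambda_{n+1} = \mu\tan\alpha_{TPP} + C_T/(2\sqrt{\mu^2 + \lambda_n^2}); an added X-force changes the tip-path-plane tilt required for longitudinal force balance and hence the inflow through the rotor.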

  8. Helicopter Model Rotor-Blade Vortex Interaction Impulsive Noise: Scalability and Parametric Variations

    NASA Technical Reports Server (NTRS)

    Boxwell, D. A.; Schmitz, F. H.; Splettstoesser, W. R.; Schultz, K. J.

    1987-01-01

    Acoustic data taken in the anechoic Deutsch-Niederlaendischer Windkanal (DNW) have documented the blade-vortex interaction (BVI) impulsive noise radiated from a 1/7-scale model main rotor of the AH-1 series helicopter. Averaged model-scale data were compared with averaged full-scale, in-flight acoustic data under similar non-dimensional test conditions using an improved data analysis technique. At low advance ratios (mu = 0.164 - 0.194), the BVI impulsive noise data scale remarkably well in level, waveform, and directivity patterns. At moderate advance ratios (mu = 0.224 - 0.270), the scaling deteriorates, suggesting that the model-scale rotor is not adequately simulating the full-scale BVI noise. Presently, no proved explanation of this discrepancy exists. Measured BVI noise radiation is highly sensitive to all of the four governing nondimensional parameters--hover tip Mach number, advance ratio, local inflow ratio, and thrust coefficient.

  9. Mach number scaling of helicopter rotor blade/vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Leighton, Kenneth P.; Harris, Wesley L.

    1985-01-01

    A parametric study of model helicopter rotor blade slap due to blade vortex interaction (BVI) was conducted in a 5 by 7.5-foot anechoic wind tunnel using model helicopter rotors with two, three, and four blades. The results were compared with a previously developed Mach number scaling theory. Three- and four-bladed rotor configurations were found to show very good agreement with the Mach number to the sixth power law for all conditions tested. A reduction of conditions for which BVI blade slap is detected was observed for three-bladed rotors when compared to the two-bladed baseline. The advance ratio boundaries of the four-bladed rotor exhibited an angular dependence not present for the two-bladed configuration. The upper limits for the advance ratio boundaries of the four-bladed rotors increased with increasing rotational speed.
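
    For reference, a sixth-power law for the peak acoustic intensity (the usual reading of the Mach number scaling cited above) implies a level change of \Delta SPL = 10\log_{10}\big((M_2/M_1)^6\big) = 60\log_{10}(M_2/M_1); for example, raising the tip Mach number from 0.40 to 0.50 within the tested range corresponds to roughly 60\log_{10}(1.25) \approx 5.8 dB.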

  10. Correlation of helicopter impulsive noise from blade-vortex interaction with rotor mean inflow

    NASA Technical Reports Server (NTRS)

    Connor, Andrew B.; Martin, R. M.

    1987-01-01

    Data from a test made in the Langley 4- by 7-Meter Tunnel were parametrically studied with respect to the occurrence of blade-vortex interaction (BVI) as a function of tunnel speed and rotor angle of attack. Three microphones on the tunnel centerline forward of the model and one microphone forward and 45 degrees to the right provided the data. The rotor model was tested with a set of high-twist blades (-10 degrees) and a set of low-twist blades (-5 degrees) over the midspeed range (50 to 80 knots) at angles of attack ranging from -6 degrees (shallow climb) to 10 degrees (steep descent). The data from all four microphones indicated that the most probable time of occurrence of BVI is when the rotor descent velocity is approximately equal to the rotor mean inflow velocity. However, some of the data showed no conclusive relationship to the mean inflow velocity.

  11. Blade-Vortex Interaction (BVI) Noise and Airload Prediction Using Loose Aerodynamic/Structural Coupling

    NASA Technical Reports Server (NTRS)

    Sim, B. W.; Lim, J. W.

    2007-01-01

    Predictions of blade-vortex interaction (BVI) noise, using blade airloads obtained from a coupled aerodynamic and structural methodology, are presented. This methodology uses an iterative, loosely-coupled trim strategy to cycle information between the OVERFLOW-2 (CFD) and CAMRAD-II (CSD) codes. Results are compared to the HART-II baseline, minimum noise and minimum vibration conditions. It is shown that this CFD/CSD state-of-the-art approach is able to capture blade airload and noise radiation characteristics associated with BVI. With the exception of the HART-II minimum noise condition, predicted advancing and retreating side BVI for the baseline and minimum vibration conditions agrees favorably with measured data. Although the BVI airloads and noise amplitudes are generally under-predicted, this CFD/CSD methodology provides an overall noteworthy improvement over the lifting line aerodynamics and free-wake models typically used in CSD comprehensive analysis codes.
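
    The loose-coupling trim strategy mentioned above is commonly implemented as a "delta airloads" exchange between the structural/comprehensive and CFD codes; the sketch below shows that generic loop in Python-style pseudocode. The function names are hypothetical placeholders, not the actual OVERFLOW-2 or CAMRAD-II interfaces.

```python
# Generic delta-airloads loose-coupling loop (illustrative pseudocode only;
# run_csd_trim and run_cfd are hypothetical wrappers, not real code interfaces).
def loose_coupling(n_iterations):
    delta_airloads = 0.0
    for _ in range(n_iterations):
        # The CSD/comprehensive code trims the rotor using its internal
        # lifting-line airloads corrected by the latest CFD increment.
        controls, blade_motions, ll_airloads = run_csd_trim(delta_airloads)
        # The CFD code recomputes airloads for the trimmed controls and motions.
        cfd_airloads = run_cfd(controls, blade_motions)
        # The increment fed back to the CSD code on the next pass.
        delta_airloads = cfd_airloads - ll_airloads
    return controls, blade_motions, cfd_airloads
```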

  12. Studies of blade-vortex interaction noise reduction by rotor blade modification

    NASA Technical Reports Server (NTRS)

    Brooks, Thomas F.

    1993-01-01

    Blade-vortex interaction (BVI) noise is one of the most objectionable types of helicopter noise. This impulsive blade-slap noise can be particularly intense during low-speed landing approach and maneuvers. Over the years, a number of flight and model rotor tests have examined blade tip modification and other blade design changes to reduce this noise. Many times these tests have produced conflicting results. In the present paper, a number of these studies are reviewed in light of the current understanding of the BVI noise problem. Results from one study in particular are used to help establish the noise reduction potential and to shed light on the role of blade design. Current blade studies and some new concepts under development are also described.

  13. Acoustic measurements from a rotor blade-vortex interaction noise experiment in the German-Dutch Wind Tunnel (DNW)

    NASA Technical Reports Server (NTRS)

    Martin, Ruth M.; Splettstoesser, W. R.; Elliott, J. W.; Schultz, K.-J.

    1988-01-01

    Acoustic data are presented from a 40 percent scale model of the 4-bladed BO-105 helicopter main rotor, measured in the large European aeroacoustic wind tunnel, the DNW. Rotor blade-vortex interaction (BVI) noise data in the low speed flight range were acquired using a traversing in-flow microphone array. The experimental apparatus, testing procedures, calibration results, and experimental objectives are fully described. A large representative set of averaged acoustic signals is presented.

  14. A study of the noise mechanisms of transonic blade-vortex interactions

    NASA Technical Reports Server (NTRS)

    Lyrintzis, Anastasios S.; Xue, Y.

    1990-01-01

    Transonic blade-vortex interactions (BVI) are simulated numerically and the noise mechanisms are investigated. The two-dimensional high frequency transonic small disturbance equation is solved numerically (VTRAN2 code). An ADI scheme with monotone switches is used; viscous effects are included on the boundary, and the vortex is simulated by the cloud-in-cell method. The Kirchhoff method is used for the extension of the numerical two-dimensional near-field aerodynamic results to the linear acoustic three-dimensional far field. The viscous effects (shock/boundary layer interactions) on BVI are investigated. The different types of shock motion are identified and compared. Two important disturbances with different directivity exist in the pressure signal and are believed to be related to the fluctuating lift and drag forces. Noise directivity for different cases is shown. The maximum radiation occurs at an angle between 60 and 90 degrees below the horizontal for an airfoil-fixed coordinate system and depends on the details of the airfoil shape. Different airfoil shapes are studied and classified according to the BVI noise produced.

  15. Reduction of blade-vortex interaction noise using higher harmonic pitch control

    NASA Technical Reports Server (NTRS)

    Brooks, Thomas F.; Booth, Earl R., Jr.; Jolly, J. Ralph, Jr.; Yeager, William T., Jr.; Wilbur, Matthew L.

    1989-01-01

    An acoustics test using an aeroelastically scaled rotor was conducted to examine the effectiveness of higher harmonic blade pitch control for the reduction of impulsive blade-vortex interaction (BVI) noise. A four-bladed, 110 in. diameter, articulated rotor model was tested in a heavy gas (Freon-12) medium in Langley's Transonic Dynamics Tunnel. Noise and vibration measurements were made for a range of matched flight conditions, where prescribed (open-loop) higher harmonic pitch was superimposed on the normal (baseline) collective and cyclic trim pitch. For the inflow-microphone noise measurements, advantage was taken of the reverberance in the hard walled tunnel by using a sound power determination approach. Initial findings from on-line data processing for three of the test microphones are reported for a 4/rev (4P) collective pitch control for a range of input amplitudes and phases. By comparing these results to corresponding baseline (no control) conditions, significant noise reductions (4 to 5 dB) were found for low-speed descent conditions, where helicopter BVI noise is most intense. For other rotor flight conditions, the overall noise was found to increase. All cases show increased vibration levels.
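
    The abstract does not spell out the control law; a standard form consistent with the description is a 4/rev term superimposed on the trim pitch,

        \theta(\psi) = \theta_0 + \theta_{1c}\cos\psi + \theta_{1s}\sin\psi + A_4\cos(4\psi + \phi_4),

    where \psi is the blade azimuth, the first three terms are the baseline collective and cyclic trim inputs, and the amplitude A_4 and phase \phi_4 are the prescribed open-loop quantities swept during the test.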

  16. Helicopter model rotor-blade vortex interaction impulsive noise: Scalability and parametric variations

    NASA Technical Reports Server (NTRS)

    Splettstoesser, W. R.; Schultz, K. J.; Boxwell, D. A.; Schmitz, F. H.

    1984-01-01

    Acoustic data taken in the anechoic Deutsch-Niederlaendischer Windkanal (DNW) have documented the blade vortex interaction (BVI) impulsive noise radiated from a 1/7-scale model main rotor of the AH-1 series helicopter. Averaged model scale data were compared with averaged full scale, inflight acoustic data under similar nondimensional test conditions. At low advance ratios (mu = 0.164 to 0.194), the data scale remarkably well in level and waveform shape, and also duplicate the directivity pattern of BVI impulsive noise. At moderate advance ratios (mu = 0.224 to 0.270), the scaling deteriorates, suggesting that the model scale rotor is not adequately simulating the full scale BVI noise; presently, no proved explanation of this discrepancy exists. Carefully performed parametric variations over a complete matrix of testing conditions have shown that BVI noise radiation is highly sensitive to all four governing nondimensional parameters: hover tip Mach number, advance ratio, local inflow ratio, and thrust coefficient.

  17. Rotorcraft acoustic radiation prediction based on a refined blade-vortex interaction model

    NASA Astrophysics Data System (ADS)

    Rule, John Allen

    1997-08-01

    The analysis of rotorcraft aerodynamics and acoustics is a challenging problem, primarily due to the fact that a rotorcraft continually flies through its own wake. The generation mechanism for a rotorcraft wake, which is dominated by strong, concentrated blade-tip trailing vortices, is similar to that in fixed wing aerodynamics. However, following blades encounter shed vortices from previous blades before they are swept downstream, resulting in sharp, impulsive loading on the blades. The blade/wake encounter, known as Blade-Vortex Interaction, or BVI, is responsible for a significant amount of vibratory loading and the characteristic rotorcraft acoustic signature in certain flight regimes. The present work addressed three different aspects of this interaction at a fundamental level. First, an analytical model for the prediction of trailing vortex structure is discussed. The model as presented is the culmination of a lengthy research effort to isolate the key physical mechanisms which govern vortex sheet rollup. Based on the Betz model, properties of the flow such as mass flux, axial momentum flux, and axial flux of angular momentum are conserved on either a differential or integral basis during the rollup process. The formation of a viscous central core was facilitated by the assumption of a turbulent mixing process with final vortex velocity profiles chosen to be consistent with a rotational flow mixing model and experimental observation. A general derivation of the method is outlined, followed by a comparison of model predictions with experimental vortex measurements, and finally a viscous blade drag model to account for additional effects of aerodynamic drag on vortex structure. The second phase of this program involved the development of a new formulation of lifting surface theory with the ultimate goal of an accurate, reduced order hybrid analytical/numerical model for fast rotorcraft load calculations. Currently, accurate rotorcraft airload analyses are limited by the massive computational power required to capture the small time scale events associated with BVI. This problem has two primary facets: accurate knowledge of the wake geometry, and accurate resolution of the impulsive loading imposed by a tip vortex on a blade. The present work addressed the second facet, providing a mathematical framework for solving the impulsive loading problem analytically, then asymptotically matching this solution to a low-resolution numerical calculation. A method was developed which uses continuous sheets of integrated boundary elements to model the lifting surface and wake. Special elements were developed to capture local behavior in high-gradient regions of the flow, thereby reducing the burden placed on the surrounding numerical method. Unsteady calculations for several classical cases were made in both frequency and time domain to demonstrate the performance of the method. Finally, a new unsteady, compressible boundary element method was applied to the problem of BVI acoustic radiation prediction. This numerical method, combined with the viscous core trailing vortex model, was used to duplicate the geometry and flight configuration of a detailed experimental BVI study carried out at NASA Ames Research Center. Blade surface pressure and near- and far-field acoustic radiation calculations were made. All calculations were shown to compare favorably with experimentally measured values. 
The linear boundary element method with non-linear corrections proved sufficient over most of the rotor azimuth, and particularly in the region of the blade-vortex interaction, suggesting that full non-linear CFD schemes are not necessary for rotorcraft noise prediction.

  18. Signal Analysis of Helicopter Blade-Vortex-Interaction Acoustic Noise Data

    NASA Technical Reports Server (NTRS)

    Rogers, James C.; Dai, Renshou

    1998-01-01

    Blade-Vortex-Interaction (BVI) produces annoying high-intensity impulsive noise. NASA Ames collected several sets of BVI noise data during in-flight and wind tunnel tests. The goal of this work is to extract the essential features of the BVI signals from the in-flight data and examine the feasibility of extracting those features from BVI noise recorded inside a large wind tunnel. BVI noise generating mechanisms and BVI radiation patterns are considered, and a simple mathematical-physical model is presented. It allows the construction of simple synthetic BVI events that are comparable to free flight data. The boundary effects of the wind tunnel floor and ceiling are identified and more complex synthetic BVI events are constructed to account for features observed in the wind tunnel data. It is demonstrated that improved recording of BVI events can be attained by changing the geometry of the rotor hub, floor, ceiling and microphone. The Euclidean distance measure is used to align BVI events from each blade, and improved BVI signals are obtained by time-domain averaging of the aligned data. The differences between BVI events for individual blades are then apparent. Removal of wind tunnel background noise by optimal Wiener filtering is shown to be effective provided representative noise-only data have been recorded. Elimination of wind tunnel reflections by cepstral and optimal-filtering deconvolution is examined. It is seen that the cepstral method is not applicable but that a pragmatic optimal filtering approach gives encouraging results. Recommendations for further work include: altering the measurement geometry, real-time data observation and evaluation, examining reflection signals (particularly those from the ceiling), and performing further analysis of expected BVI signals for flight conditions of interest so that microphone placement can be optimized for each condition.
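
    A minimal Python sketch of the alignment-and-averaging step described above follows (illustrative only; the shift search range, the circular shifting, and the array names are assumptions, not details from the report):

```python
import numpy as np

# Illustrative sketch (not the report's code): align single-blade BVI pulses to
# a reference pulse by minimizing the Euclidean distance over candidate shifts,
# then suppress uncorrelated noise by time-domain (coherent) averaging.
def align_and_average(events, reference, max_shift):
    shifts = np.arange(-max_shift, max_shift + 1)
    aligned = []
    for x in events:
        dists = [np.linalg.norm(np.roll(x, s) - reference) for s in shifts]
        aligned.append(np.roll(x, shifts[int(np.argmin(dists))]))
    return np.mean(aligned, axis=0)
```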

  19. New techniques for experimental generation of two-dimensional blade-vortex interaction at low Reynolds numbers

    NASA Technical Reports Server (NTRS)

    Booth, E., Jr.; Yu, J. C.

    1986-01-01

    An experimental investigation of two-dimensional blade-vortex interaction was conducted at NASA Langley Research Center. The first phase was a flow visualization study to document the approach process of a two-dimensional vortex as it encountered a loaded blade model. To accomplish the flow visualization study, a method for generating two-dimensional vortex filaments was required. The numerical study used to define a new vortex generation process and the use of this process in the flow visualization study are documented. Additionally, the photographic techniques and data analysis methods used in the flow visualization study are examined.

  20. Parametric Investigation of the Effect of Hub Pitching Moment on Blade Vortex Interaction (BVI) Noise of an Isolated Rotor

    NASA Technical Reports Server (NTRS)

    Malpica, Carlos; Greenwood, Eric; Sim, Ben

    2016-01-01

    At the most fundamental level, main rotor loading noise is caused by the harmonically-varying aerodynamic loads (acoustic pressures) exerted by the rotating blades on the air. Rotorcraft main rotor noise is therefore, in principle, a function of rotor control inputs, and thus of the forces and moments required to achieve steady, or "trim", flight equilibrium. In certain flight conditions, the ensuing aerodynamic loading on the rotor(s) can result in highly obtrusive harmonic noise. The effect of the propulsive force, or X-force, on Blade-Vortex Interaction (BVI) noise is well documented. This paper presents an acoustic parametric sensitivity analysis of the effect of varying rotor aerodynamic pitch hub trim moments on BVI noise radiated by an S-70 helicopter main rotor. Results show that changing the hub pitching moment for an isolated rotor, trimmed in nominal 80-knot, 6- and 12-deg descent flight conditions, alters the miss distance between the blades and the vortex in ways that have varied and noticeable effects on the BVI radiated-noise directionality. The peak BVI noise level, however, is not significantly altered. The application of hub pitching moment allows the attitude of the fuselage to be controlled; for example, to compensate for the uncomfortable change in fuselage pitch attitude introduced by a fuselage-mounted X-force controller.

  1. Comparison of Full-Scale XV-15 Wind Tunnel and In-Flight Blade-Vortex Interaction Noise

    NASA Technical Reports Server (NTRS)

    Kitaplioglu, Cahit; McCluer, M.; Acree, C. W., Jr.; Warmbrodt, William (Technical Monitor)

    1997-01-01

    An isolated full-scale XV-15 rotor was tested in helicopter mode in the NASA Ames 80 by 120-Foot Wind Tunnel. Extensive acoustic data were obtained to define the rotor operating condition for maximum blade-vortex interaction (BVI) noise. Additional data were obtained at operating conditions simulating flight up to 80 knots. An XV-15 aircraft was also tested under operating conditions corresponding to landing approaches for which BVI is expected to be a maximum. In-flight acoustic data were obtained using the YO-3A acoustic research aircraft. An attempt was made to closely match wind tunnel and flight test operating conditions. Details of the two tests are described and some representative acoustic results are presented. Comparisons are shown between the wind tunnel data and corresponding flight test data. Preliminary results indicate very good correlation of the BVI-related features. However, some differences between flight test and wind tunnel results exist away from the BVI event, thought to arise from differences in the two flow environments.

  2. Effects of a trailing edge flap on the aerodynamics and acoustics of rotor blade-vortex interactions

    NASA Technical Reports Server (NTRS)

    Charles, B. D.; Tadghighi, H.; Hassan, A. A.

    1992-01-01

    The use of a trailing edge flap on a helicopter rotor has been numerically simulated to determine if such a device can mitigate the acoustics of blade vortex interactions (BVI). The numerical procedure employs CAMRAD/JA, a lifting-line helicopter rotor trim code, in conjunction with RFS2, an unsteady transonic full-potential flow solver, and WOPWOP, an acoustic model based on Farassat's formulation 1A. The codes were modified to simulate trailing edge flap effects. The CAMRAD/JA code was used to compute the far wake inflow effects and the vortex wake trajectories and strengths which are utilized by RFS2 to predict the blade surface pressure variations. These pressures were then analyzed using WOPWOP to determine the high frequency acoustic response at several fixed observer locations below the rotor disk. Comparisons were made with different flap deflection amplitudes and rates to assess flap effects on BVI. Numerical experiments were carried out using a one-seventh scale AH-1G rotor system for flight conditions simulating BVI encountered during low speed descending flight with and without flaps. Predicted blade surface pressures and acoustic sound pressure levels obtained have shown good agreement with the baseline no-flap test data obtained in the DNW wind tunnel. Numerical results indicate that the use of flaps is beneficial in reducing BVI noise.

  3. Flow structure generated by perpendicular blade-vortex interaction and implications for helicopter noise prediction. Volume 1: Measurements

    NASA Technical Reports Server (NTRS)

    Wittmer, Kenneth S.; Devenport, William J.

    1996-01-01

    The perpendicular interaction of a streamwise vortex with an infinite span helicopter blade was modeled experimentally in incompressible flow. Three-component velocity and turbulence measurements were made using a sub-miniature four sensor hot-wire probe. Vortex core parameters (radius, peak tangential velocity, circulation, and centerline axial velocity deficit) were determined as functions of blade-vortex separation, streamwise position, blade angle of attack, vortex strength, and vortex size. The downstream development of the flow shows that the interaction of the vortex with the blade wake is the primary cause of the changes in the core parameters. The blade sheds negative vorticity into its wake as a result of the induced angle of attack generated by the passing vortex. Instability in the vortex core due to its interaction with this negative vorticity region appears to be the catalyst for the magnification of the size and intensity of the turbulent flowfield downstream of the interaction. In general, the core radius increases while peak tangential velocity decreases with the effect being greater for smaller separations. These effects are largely independent of blade angle of attack; and if these parameters are normalized on their undisturbed values, then the effects of the vortex strength appear much weaker. Two theoretical models were developed to aid in extending the results to other flow conditions. An empirical model was developed for core parameter prediction which has some rudimentary physical basis, implying usefulness beyond a simple curve fit. An inviscid flow model was also created to estimate the vorticity shed by the interaction blade, and to predict the early stages of its incorporation into the interacting vortex.

  4. Perpendicular blade vortex interaction and its implications for helicopter noise prediction: Wave-number frequency spectra in a trailing vortex for BWI noise prediction

    NASA Technical Reports Server (NTRS)

    Devenport, William J.; Glegg, Stewart A. L.

    1993-01-01

    Perpendicular blade vortex interactions are a common occurrence in helicopter rotor flows. Under certain conditions they produce a substantial proportion of the acoustic noise. However, the mechanism of noise generation is not well understood. Specifically, turbulence associated with the trailing vortices shed from the blade tips appears insufficient to account for the noise generated. The hypothesis that the first perpendicular interaction experienced by a trailing vortex alters its turbulence structure in such a way as to increase the acoustic noise generated by subsequent interactions is examined. To investigate this hypothesis, a two-part investigation was carried out. In the first part, experiments were performed to examine the behavior of a streamwise vortex as it passed over and downstream of a spanwise blade in incompressible flow. Blade vortex separations between +/- one eighth chord were studied at a chord Reynolds number of 200,000. Three-component velocity and turbulence measurements were made in the flow from 4 chord lengths upstream to 15 chord lengths downstream of the blade using miniature 4-sensor hot wire probes. These measurements show that the interaction of the vortex with the blade and its wake causes the vortex core to lose circulation and diffuse much more rapidly than it otherwise would. Core radius increases and peak tangential velocity decreases with distance downstream of the blade. True turbulence levels within the core are much larger downstream than upstream of the blade. The net result is a much larger and more intense region of turbulent flow than that presented by the original vortex and thus, by implication, a greater potential for generating acoustic noise. In the second part, the turbulence measurements described above were used to derive the necessary inputs to a Blade Wake Interaction (BWI) noise prediction scheme. This resulted in significantly improved agreement between measurements and calculations of the BWI noise spectrum, especially for the spectral peak at low frequencies, which previously was poorly predicted.

  5. A parametric study of blade vortex interaction noise for two, three, and four-bladed model rotors at moderate tip speeds: Theory and experiment

    NASA Technical Reports Server (NTRS)

    Leighton, K. P.; Harris, W. L.

    1984-01-01

    An investigation of blade slap due to blade vortex interaction (BVI) has been conducted. This investigation consisted of an examination of BVI blade slap for two, three, and four-bladed model rotors at tip Mach numbers ranging from 0.20 to 0.50. Blade slap contours have been obtained for each configuration tested. Differences in blade slap contours, peak sound pressure level, and directivity for each configuration tested are noted. Additional fundamental differences, such as multiple interaction BVI, are observed and occur for only specific rotor blade configurations. The effect of increasing the Mach number on the BVI blade slap for various rotor blade combinations has been quantified. A peak blade slap Mach number scaling law is proposed. Comparison of measured BVI blade slap with theory is made.

  6. Numerical simulation and validation of helicopter blade-vortex interaction using coupled CFD/CSD and three levels of aerodynamic modeling

    NASA Astrophysics Data System (ADS)

    Amiraux, Mathieu

    Rotorcraft Blade-Vortex Interaction (BVI) remains one of the most challenging flow phenomena to simulate numerically. Over the past decade, the HART-II rotor test and its extensive experimental dataset have been a major database for validation of CFD codes. Its strong BVI signature, with high levels of intrusive noise and vibrations, makes it a difficult test for computational methods. The main challenge is to accurately capture and preserve the vortices which interact with the rotor, while predicting correct blade deformations and loading. This doctoral dissertation presents the application of a coupled CFD/CSD methodology to the problem of helicopter BVI and compares three levels of fidelity for aerodynamic modeling: a hybrid lifting-line/free-wake (wake coupling) method, with a modified compressible unsteady model; a hybrid URANS/free-wake method; and a URANS-based wake capturing method, using multiple overset meshes to capture the entire flow field. To further increase numerical correlation, three helicopter fuselage models are implemented in the framework. The first is a high resolution 3D GPU panel code; the second is an immersed boundary based method, with 3D elliptic grid adaption; the last one uses a body-fitted, curvilinear fuselage mesh. The main contribution of this work is the implementation and systematic comparison of multiple numerical methods to perform BVI modeling. The trade-offs between solution accuracy and computational cost are highlighted for the different approaches. Various improvements have been made to each code to enhance physical fidelity, while advanced technologies, such as GPU computing, have been employed to increase efficiency. The resulting numerical setup covers all aspects of the simulation, creating a truly multi-fidelity and multi-physics framework. Overall, the wake capturing approach showed the best BVI phasing correlation and good blade deflection predictions, with slightly under-predicted aerodynamic loading magnitudes. However, it proved to be much more expensive than the other two methods. Wake coupling with the RANS solver had very good loading magnitude predictions, and therefore good acoustic intensities, with acceptable computational cost. The lifting-line based technique often had over-predicted aerodynamic levels, due to the degree of empiricism of the model, but its very short run-times, thanks to GPU technology, make it a very attractive approach.

  7. Effect of higher harmonic control on helicopter rotor blade-vortex interaction noise: Prediction and initial validation

    NASA Technical Reports Server (NTRS)

    Beaumier, P.; Prieur, J.; Rahier, G.; Spiegel, P.; Demargne, A.; Tung, C.; Gallman, J. M.; Yu, Y. H.; Kube, R.; Vanderwall, B. G.

    1995-01-01

    The paper presents the status of theoretical tools of AFDD, DLR, NASA and ONERA for prediction of the effect of HHC on helicopter main rotor BVI noise. Aeroacoustic predictions from the four research centers, concerning a wind tunnel simulation of a typical descent flight case without and with HHC, are presented and compared. The results include blade deformation, geometry of interacting vortices, sectional loads and noise. Acoustic predictions are compared to experimental data. An analysis of the results provides a first insight into the mechanisms by which HHC may affect BVI noise.

  8. Analysis of helicopter blade vortex structure by laser velocimetry

    NASA Astrophysics Data System (ADS)

    Boutier, A.; Lefèvre, J.; Micheli, F.

    1996-05-01

    In descent flight, helicopter external noise is mainly generated by the Blade Vortex Interaction (BVI). To understand the dynamics of this phenomenon, the vortex must be characterized before its interaction with the blade, which means that its viscous core radius, its strength and its distance to the blade have to be determined by non-intrusive measurement techniques. As part of the HART program (Higher Harmonic Control Aeroacoustic Rotor Test, jointly conducted by US Army, NASA, DLR, DNW and ONERA), a series of tests was made in the German Dutch Wind Tunnel (DNW) on a helicopter rotor with 2 m long blades, rotating at 1040 rpm; several flight configurations, with an advance ratio of 0.15 and a shaft angle of 5.3°, have been studied with different higher harmonic blade pitch angles superposed on the conventional one (corresponding to the baseline case). The flow on the retreating side has been analyzed with a specially designed 3D laser velocimeter, and, simultaneously, the blade tip attitude has been determined in order to get the blade-vortex miss distance, which is a crucial parameter in the noise reduction. A 3D laser velocimeter, in backscatter mode with a working distance of 5 m, was installed on a platform 9 m high, and flow seeding with submicron incense smoke was achieved in the settling chamber using a remotely controlled displacement device. Acquisition of instantaneous velocity vectors by an IFA 750 yielded mean velocity and turbulence maps across the vortex as well as the vortex position, intensity and viscous radius. The blade tip attitude (altitude, jitter, angle of incidence) was recorded by the TART method (Target Attitude in Real Time) which makes use of a CCD camera on which is formed the image of two retroreflecting targets attached to the blade tip and lighted by a flash lamp. In addition to the mean values of the aforementioned quantities, spectra of their fluctuations have been established up to 8 Hz.

  9. Noise Generation of BLADE-VORTEX Resonance

    NASA Astrophysics Data System (ADS)

    LEUNG, R. C. K.; SO, R. M. C.

    2001-08-01

    A numerical study of the aerodynamic noise generated when an airfoil/blade in a uniform flow is excited by an oncoming vortical flow is reported. The vortical flow is modelled by a series of flow-convected discrete vortices representative of a Karman vortex street. Such noise generation problems due to fluid-blade interaction occur in helicopter rotor and turbomachinery blades. Interactions with both rigid and elastic airfoil/blade are considered. Under a vortical excitation, aerodynamic resonance of the airfoil/blade at certain excitation frequencies is found to occur and loading noise is generated due to the fluctuations of the aerodynamic loading on the airfoil/blade. For an elastic blade, due to the occurrence of structural resonance incited by the flow-induced vibration of the airfoil/blade, a stronger loading noise is generated. The associated thickness effect due to the airfoil/blade vibration is extremely weak. The magnitude of the noise was found to depend on the frequency of the oncoming vortical flow and the geometry and rigidity of the blade.

  10. An Eulerian/Lagrangian method for computing blade/vortex impingement

    NASA Technical Reports Server (NTRS)

    Steinhoff, John; Senge, Heinrich; Yonghu, Wenren

    1991-01-01

    A combined Eulerian/Lagrangian approach to calculating helicopter rotor flows with concentrated vortices is described. The method computes a general evolving vorticity distribution without any significant numerical diffusion. Concentrated vortices can be accurately propagated over long distances on relatively coarse grids with cores only several grid cells wide. The method is demonstrated for a blade/vortex impingement case in 2D and 3D where a vortex is cut by a rotor blade, and the results are compared to previous 2D calculations involving a fifth-order Navier-Stokes solver on a finer grid.

  11. Interactive parallel visualization framework for distributed data

    NASA Astrophysics Data System (ADS)

    Perrine, Kenneth A.; Jones, Donald R.; Hochschild, Peter; Swetz, Richard A.

    2002-03-01

    A framework for parallel visualization at Pacific Northwest National Laboratory (PNNL) is being developed that utilizes the IBM Scaleable Graphics Engine (SGE) and IBM SP parallel computers. Parallel visualization resources are discussed, including display technologies, data handling, rendering, and interactivity. Several of these resources have been developed, while others are under development. These framework resources will be utilized by programmers in custom parallel visualization applications.

  12. Rotor system having alternating length rotor blades for reducing blade-vortex interaction (BVI) noise

    NASA Technical Reports Server (NTRS)

    Moffitt, Robert C. (Inventor); Visintainer, Joseph A. (Inventor)

    1997-01-01

    A rotor system (4) having odd and even blade assemblies (O_b, E_b) mounting to and rotating with a rotor hub assembly (6), wherein the odd blade assemblies (O_b) define a radial length R_O, the even blade assemblies (E_b) define a radial length R_E, and the radial length R_E is between about 70% and about 95% of the radial length R_O. Other embodiments of the invention are directed to a Variable Diameter Rotor system (4) which may be configured for operating in various operating modes for optimizing aerodynamic and acoustic performance. The Variable Diameter Rotor system (4) includes odd and even blade assemblies (O_b, E_b) having inboard and outboard blade sections (10, 12) wherein the outboard blade sections (12) telescopically mount to the inboard blade sections (10). The outboard blade sections (12) are positioned with respect to the inboard blade sections (10) such that the radial length R_E of the even blade assemblies (E_b) is equal to the radial length R_O of the odd blade assemblies (O_b) in a first operating mode, and such that the radial length R_E is between about 70% and about 95% of the length R_O in a second operating mode.

  13. Interactions between flames on parallel solid surfaces

    NASA Technical Reports Server (NTRS)

    Urban, David L.

    1995-01-01

    The interactions between flames spreading over parallel solid sheets of paper are being studied in normal gravity and in microgravity. This geometry is of practical importance since in most heterogeneous combustion systems, the condensed phase is non-continuous and spatially distributed. This spatial distribution can strongly affect burning and/or spread rate. This is due to radiant and diffusive interactions between the surface and the flames above the surfaces. Tests were conducted over a variety of pressures and separation distances to expose the influence of the parallel sheets on oxidizer transport and on radiative feedback.

  14. Parallel Vegetation Stripe Formation Through Hydrologic Interactions

    NASA Astrophysics Data System (ADS)

    Cheng, Yiwei; Stieglitz, Marc; Turk, Greg; Engel, Victor

    2010-05-01

    It has long been a challenge to theoretical ecologists to describe vegetation pattern formations such as the "tiger bush" stripes and "leopard bush" spots in Niger, and the regular maze patterns often observed in bogs in North America and Eurasia. To date, most simulation models focus on reproducing the spot and labyrinthine patterns, and on the vegetation bands which form perpendicular to surface and groundwater flow directions. Various hypotheses have been invoked to explain the formation of vegetation patterns: selective grazing by herbivores, fire, and anisotropic environmental conditions such as slope. Recently, short distance facilitation and long distance competition between vegetation (a.k.a. scale-dependent feedback) has been proposed as a generic mechanism for vegetation pattern formation. In this paper, we test the generality of this mechanism by employing an existing, spatially explicit, advection-reaction-diffusion type model to describe the formation of regularly spaced vegetation bands, including those that are parallel to flow direction. Such vegetation patterns are, for example, characteristic of the ridge and slough habitat in the Florida Everglades and are thought to have formed parallel to the prevailing surface water flow direction. To our knowledge, this is the first time that a simple model encompassing a nutrient accumulation mechanism along with biomass development and flow is used to demonstrate the formation of parallel stripes. We also explore the interactive effects of plant transpiration, slope and anisotropic hydraulic conductivity on the resulting vegetation pattern. Our results highlight the ability of the short distance facilitation and long distance competition mechanism to explain the formation of the different vegetation patterns beyond semi-arid regions. Therefore, we propose that the parallel stripes, like the other periodic patterns observed in both isotropic and anisotropic environments, are self-organized and form as a result of scale-dependent feedback. Results from this study improve upon the current understanding of the formation of parallel stripes and provide a more general theoretical framework for future empirical and modeling efforts.
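
    The abstract does not give the model equations. As a rough illustration of the scale-dependent-feedback, advection-reaction-diffusion class of model it refers to, the sketch below implements a generic Klausmeier-type water/biomass system; all parameter names and values are illustrative assumptions rather than the authors' model, and the classic form sketched here is known for bands transverse to flow rather than the parallel stripes and nutrient mechanism studied in the paper.

    ```python
    import numpy as np

    # Minimal Klausmeier-type water/biomass model on a periodic 2D grid.
    # All parameters are illustrative; the published model differs.
    N, dx, dt = 128, 1.0, 0.05
    a, m, v = 1.8, 0.45, 10.0      # rainfall, plant mortality, water advection speed
    Dw, Db = 1.0, 0.05             # water and biomass diffusion coefficients

    rng = np.random.default_rng(0)
    B = 3.0 + 0.3 * rng.random((N, N))   # biomass near the vegetated state, plus noise
    W = a / (1.0 + B**2)                 # surface water near its local balance

    def lap(f):
        """Five-point Laplacian with periodic boundaries."""
        return (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
                np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f) / dx**2

    for step in range(20000):
        uptake = W * B**2                       # scale-dependent feedback term
        adv = v * (np.roll(W, 1, 0) - W) / dx   # upwind advection of water downslope
        W += dt * (a - W - uptake + adv + Dw * lap(W))
        B += dt * (uptake - m * B + Db * lap(B))

    # Banded or patchy states can emerge for suitable parameters and run lengths.
    print("biomass range:", B.min(), B.max())
    ```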

  15. Parallel Mean Shift for Interactive Volume Segmentation

    NASA Astrophysics Data System (ADS)

    Zhou, Fangfang; Zhao, Ying; Ma, Kwan-Liu

    In this paper we present a parallel dynamic mean shift algorithm based on path transmission for medical volume data segmentation. The algorithm first translates the volume data into a joint position-color feature space subdivided uniformly by bandwidths, and then clusters the points in feature space in parallel by iteratively finding each point's peak. Over iterations it improves the convergence rate by dynamically updating data points via path transmission and reduces the amount of data points by collapsing overlapping points into one point. The GPU implementation of the algorithm can segment a 256x256x256 volume in 6 seconds using an NVIDIA GeForce 8800 GTX card for interactive processing, which is hundreds of times faster than its CPU implementation. We also introduce an interactive interface to segment volume data based on this GPU implementation. This interface not only provides the user with the capability to specify segmentation resolution, but also allows the user to operate on the segmented tissues and create desired visualization results.
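
    As a point of reference for the clustering step, here is a minimal serial sketch of the core mean-shift iteration in a joint feature space; the paper's path-transmission acceleration, point collapsing, and GPU kernels are not reproduced, and the kernel, bandwidth, and names used are illustrative assumptions.

    ```python
    import numpy as np

    def mean_shift_modes(points, bandwidth=0.2, iters=30, tol=1e-4):
        """Shift every feature-space point toward its local density peak.

        points: (n, d) array, e.g. d = 4 for (x, y, z, intensity) voxels.
        Each point is processed independently, which is what makes the
        method easy to parallelize over voxels (threads, GPU, processes).
        """
        modes = points.astype(float)
        for _ in range(iters):
            shifted = np.empty_like(modes)
            for i, p in enumerate(modes):
                # Flat kernel: average all samples within one bandwidth of p.
                d2 = np.sum((points - p) ** 2, axis=1)
                shifted[i] = points[d2 <= bandwidth ** 2].mean(axis=0)
            converged = np.max(np.abs(shifted - modes)) < tol
            modes = shifted
            if converged:
                break
        return modes  # points converging to the same mode form one segment

    # Toy usage: two blobs in a 2-D feature space.
    rng = np.random.default_rng(1)
    data = np.vstack([rng.normal(0.0, 0.05, (100, 2)),
                      rng.normal(1.0, 0.05, (100, 2))])
    print(np.unique(np.round(mean_shift_modes(data, bandwidth=0.3), 1), axis=0))
    ```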

  16. Parallelized Stochastic Cutoff Method for Long-Range Interacting Systems

    NASA Astrophysics Data System (ADS)

    Endo, Eishin; Toga, Yuta; Sasaki, Munetaka

    2015-07-01

    We present a method of parallelizing the stochastic cutoff (SCO) method, which is a Monte-Carlo method for long-range interacting systems. After interactions are eliminated by the SCO method, we subdivide a lattice into noninteracting interpenetrating sublattices. This subdivision enables us to parallelize the Monte-Carlo calculation in the SCO method. Such subdivision is found by numerically solving the vertex coloring of a graph created by the SCO method. We use an algorithm proposed by Kuhn and Wattenhofer to solve the vertex coloring by parallel computation. This method was applied to a two-dimensional magnetic dipolar system on an L × L square lattice to examine its parallelization efficiency. The result showed that, in the case of L = 2304, the speed of computation increased about 102 times by parallel computation with 288 processors.
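
    The following is a schematic sketch of the parallelization idea only, under stated assumptions: a random sparse graph stands in for the post-SCO interaction graph, NetworkX's greedy coloring replaces the Kuhn-Wattenhofer distributed algorithm, the spin update is a placeholder, and Python threads are purely illustrative (a real implementation would use MPI or OpenMP).

    ```python
    import networkx as nx
    from concurrent.futures import ThreadPoolExecutor

    # Toy interaction graph standing in for the bonds that survive the SCO step.
    G = nx.gnm_random_graph(200, 600, seed=0)

    # Greedy vertex coloring: vertices of the same color share no edge, so the
    # spins they carry can be updated concurrently without conflicts.
    coloring = nx.greedy_color(G, strategy="largest_first")
    classes = {}
    for node, color in coloring.items():
        classes.setdefault(color, []).append(node)

    def update_spin(node):
        # Placeholder for one local Monte-Carlo update (e.g. Metropolis).
        return node

    with ThreadPoolExecutor(max_workers=8) as pool:
        for color, nodes in sorted(classes.items()):
            # One parallel sweep per color class; classes are processed in turn.
            list(pool.map(update_spin, nodes))
    print(f"{len(classes)} independent sublattices (color classes)")
    ```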

  17. Interactive Imaging Science on Parallel Computers: Getting Immediate Results

    SciTech Connect

    Perrine, Kenneth A.; Jones, Donald R.

    2003-04-01

    Gigapixel-size images are used in calculations on parallel machines using the Pacific Northwest National Laboratory (PNNL) Parallel Computational Environment for Imaging Science (PiCEIS). The PiCEIS image browser allows the user to view real-time images as calculations are performed. The user can interact with the images, assign regions of interest to accelerate feedback, and alter algorithm parameters. The images may be displayed on an X11 terminal or parallel compositing hardware. The fast feedback and interactive features available within the image browser component of PiCEIS are valuable tools for imaging science.

  18. An Interactive Parallel Visualization Framework for Distributed Data

    SciTech Connect

    Perrine, Kenneth A.; Jones, Donald R.; Hochschild, Peter; Swetz, Richard A.

    2002-01-20

    A framework for parallel visualization at Pacific Northwest National Laboratory (PNNL) is being developed that utilizes the IBM Scaleable Graphics Engine (SGE) and IBM SP parallel computers. The SGE allows disjoint regions of pixel data to be transferred simultaneously from multiple compute nodes into a unified frame buffer. The joined graphics data is displayed on monitors attached to the SGE. Three parallel applications have been developed that write pixel data directly to local buffers and transfer the buffers to the SGE. A library is being developed to allow OpenGL applications to run in parallel and utilize the SGE. The library and SGE hardware will be an interactive framework for parallel visualization applications.

  19. Nicotine and cannabinoids: parallels, contrasts and interactions.

    PubMed

    Viveros, Maria-Paz; Marco, Eva M; File, Sandra E

    2006-01-01

    After a brief outline of the nicotinic and cannabinoid systems, we review the interactions between the pharmacological effects of nicotine and cannabis, two of the most widely used drugs of dependence. These drugs are increasingly taken in combination, particularly among adolescents and young adults. The review focuses on addiction-related processes, gateway and reverse gateway theories of addiction and therapeutic implications. It then reviews studies on the important period of adolescence, an area that is in urgent need of further investigation and in which the importance of sex differences is emerging. Three other areas of research, which might be particularly relevant to the onset and/or maintenance of dependence, are then reviewed. Firstly, the effects of the two drugs on anxiety-related behaviours are discussed and then their effects on food intake and cognition, two areas in which they have contrasting effects. Certain animal studies suggest that reinforcing effects are likely to be enhanced by joint consumption of nicotine and cannabis, as also may be anxiolytic effects. If this were the case in humans, the latter might be viewed as an advantage particularly by adolescent girls, although the increased weight gain associated with cannabis would be a disadvantage. The two drugs also have opposite effects on cognition and the possibility of long-lasting cognitive impairments resulting from adolescent consumption of cannabis is of particular concern. PMID:17049986

  20. An interactive parallel programming environment applied in atmospheric science

    NASA Technical Reports Server (NTRS)

    vonLaszewski, G.

    1996-01-01

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  1. An interactive parallel programming environment applied in atmospheric science

    SciTech Connect

    Laszewski, G. von

    1996-12-31

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  2. Protein interaction discovery using parallel analysis of translated ORFs (PLATO)

    PubMed Central

    Gao, Geng; Somwar, Romel; Zhang, Zijuan; Laserson, Uri; Ciccia, Alberto; Pavlova, Natalya; Church, George; Zhang, Wei; Kesari, Santosh; Elledge, Stephen J.

    2014-01-01

    Identifying physical interactions between proteins and other molecules is a critical aspect of biological analysis. Here we describe PLATO, an in vitro method for mapping such interactions by affinity enrichment of a library of full-length open reading frames displayed on ribosomes, followed by massively parallel analysis using DNA sequencing. We demonstrate the broad utility of the method for human proteins by identifying known and previously unidentified interacting partners of LYN kinase, patient autoantibodies, and the small molecules gefitinib and dasatinib. PMID:23503679

  3. Parallel Graphics and Interactivity with the Scaleable Graphics Engine

    SciTech Connect

    Perrine, Kenneth A.; Jones, Donald R.

    2001-11-10

    A parallel rendering environment is being developed to utilize the IBM Scaleable Graphics Engine (SGE), a hardware frame buffer for parallel computers. Goals of this software development effort include finding efficient ways of producing and displaying graphics generated on SP nodes and of assisting programmers in adapting or creating scientific simulation applications to use the SGE. Four software development phases that utilize the SGE are discussed: tunneling, SMP rendering, graphics API development using an OpenGL API implementation which utilizes the SGE in the parallel environment, and additions to the SGE-enabled OpenGL API implementation that use threads. The SGE's ability to accept pixel data from multiple nodes simultaneously makes it a viable tool for parallel rendering. With the performance observed in the test applications and the performance optimizations gained, programmers writing applications for IBM SPs and Linux clusters will be able to support high-speed output of graphics and to interact with data.

  4. IPython: components for interactive and parallel computing across disciplines. (Invited)

    NASA Astrophysics Data System (ADS)

    Perez, F.; Bussonnier, M.; Frederic, J. D.; Froehle, B. M.; Granger, B. E.; Ivanov, P.; Kluyver, T.; Patterson, E.; Ragan-Kelley, B.; Sailer, Z.

    2013-12-01

    Scientific computing is an inherently exploratory activity that requires constantly cycling between code, data and results, each time adjusting the computations as new insights and questions arise. To support such a workflow, good interactive environments are critical. The IPython project (http://ipython.org) provides a rich architecture for interactive computing with: 1. Terminal-based and graphical interactive consoles. 2. A web-based Notebook system with support for code, text, mathematical expressions, inline plots and other rich media. 3. Easy to use, high performance tools for parallel computing. Despite its roots in Python, the IPython architecture is designed in a language-agnostic way to facilitate interactive computing in any language. This allows users to mix Python with Julia, R, Octave, Ruby, Perl, Bash and more, as well as to develop native clients in other languages that reuse the IPython clients. In this talk, I will show how IPython supports all stages in the lifecycle of a scientific idea: 1. Individual exploration. 2. Collaborative development. 3. Production runs with parallel resources. 4. Publication. 5. Education. In particular, the IPython Notebook provides an environment for "literate computing" with a tight integration of narrative and computation (including parallel computing). These Notebooks are stored in a JSON-based document format that provides an "executable paper": notebooks can be version controlled, exported to HTML or PDF for publication, and used for teaching.
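
    A minimal sketch of the parallel-computing workflow described above; it assumes the present-day ipyparallel package (earlier releases shipped this functionality as IPython.parallel) and a locally started cluster, e.g. via `ipcluster start -n 4`.

    ```python
    # Requires: pip install ipyparallel, plus a running cluster
    # (e.g. `ipcluster start -n 4` in another terminal).
    import ipyparallel as ipp

    rc = ipp.Client()      # connect to the running controller
    dview = rc[:]          # DirectView spanning all engines
    dview.block = True     # make execute/scatter/pull synchronous

    def square(x):
        return x * x

    print(dview.map_sync(square, range(16)))   # farm out the calls, gather results

    # Scatter raw data, compute per-engine partial sums, pull them back.
    dview.scatter("chunk", list(range(1000)))
    dview.execute("partial = sum(chunk)")
    print(sum(dview.pull("partial")))
    ```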

  5. Investigation of helicopter rotor blade/wake interactive impulsive noise

    NASA Technical Reports Server (NTRS)

    Miley, S. J.; Hall, G. F.; Vonlavante, E.

    1987-01-01

    An analysis of the Tip Aerodynamic/Aeroacoustic Test (TAAT) data was performed to identify possible aerodynamic sources of blade/vortex interaction (BVI) impulsive noise. The identification is based on correlation of measured blade pressure time histories with predicted blade/vortex intersections for the flight condition(s) where impulsive noise was detected. Due to the location of the recording microphones, only noise signatures associated with the advancing blade were available, and the analysis was accordingly restricted to the first and second azimuthal quadrants. The results show that the blade tip region is operating transonically in the azimuthal range where previous BVI experiments indicated the impulsive noise to be. No individual blade/vortex encounter is identifiable in the pressure data; however, there is indication of multiple intersections in the roll-up region which could be the origin of the noise. Discrete blade/vortex encounters are indicated in the second quadrant; however, if impulsive noise were produced here, the directivity pattern would be such that it was not recorded by the microphones. It is demonstrated that the TAAT data base is a valuable resource in the investigation of rotor aerodynamic/aeroacoustic behavior.

  6. Framework for Interactive Parallel Dataset Analysis on the Grid

    SciTech Connect

    Alexander, David A.; Ananthan, Balamurali; Johnson, Tony; Serbo, Victor (SLAC)

    2007-01-10

    We present a framework for use at a typical Grid site to facilitate custom interactive parallel dataset analysis targeting terabyte-scale datasets of the type typically produced by large multi-institutional science experiments. We summarize the needs for interactive analysis and show a prototype solution that satisfies those needs. The solution consists of a desktop client tool and a set of Web Services that allow scientists to sign onto a Grid site, compose analysis script code to carry out physics analysis on datasets, distribute the code and datasets to worker nodes, collect the results back to the client, and construct professional-quality visualizations of the results.

  7. Parallel algorithms for interactive manipulation of digital terrain models

    NASA Technical Reports Server (NTRS)

    Davis, E. W.; Mcallister, D. F.; Nagaraj, V.

    1988-01-01

    Interactive three-dimensional graphics applications, such as terrain data representation and manipulation, require extensive arithmetic processing. Massively parallel machines are attractive for this application since they offer high computational rates, and grid connected architectures provide a natural mapping for grid based terrain models. Presented here are algorithms for data movement on the Massively Parallel Processor (MPP) in support of pan and zoom functions over large data grids. This work is an extension of earlier work that demonstrated real-time performance of graphics functions on grids that were equal in size to the physical dimensions of the MPP. When the dimensions of a data grid exceed the processing array size, data is packed in the array memory. Windows of the total data grid are interactively selected for processing. Movement of packed data is needed to distribute items across the array for efficient parallel processing. Execution time for data movement was found to exceed that for arithmetic aspects of graphics functions. Performance figures are given for routines written in MPP Pascal.

  8. Long-range interactions and parallel scalability in molecular simulations

    NASA Astrophysics Data System (ADS)

    Patra, Michael; Hyvönen, Marja T.; Falck, Emma; Sabouri-Ghomi, Mohsen; Vattulainen, Ilpo; Karttunen, Mikko

    2007-01-01

    Typical biomolecular systems such as cellular membranes, DNA, and protein complexes are highly charged. Thus, efficient and accurate treatment of electrostatic interactions is of great importance in computational modeling of such systems. We have employed the GROMACS simulation package to perform extensive benchmarking of different commonly used electrostatic schemes on a range of computer architectures (Pentium-4, IBM Power 4, and Apple/IBM G5) for single-processor and parallel performance up to 8 nodes. We have also tested the scalability on four different networks: Infiniband, GigaBit Ethernet, Fast Ethernet, and a nearly uniform memory architecture in which communication between CPUs is possible by directly reading from or writing to other CPUs' local memory. It turns out that the particle-mesh Ewald method (PME) performs surprisingly well and offers competitive performance unless parallel runs on PC hardware with older network infrastructure are needed. Lipid bilayers of sizes 128, 512 and 2048 lipid molecules were used as the test systems representing typical cases encountered in biomolecular simulations. Our results enable an accurate prediction of computational speed on most current computing systems, both for serial and parallel runs. These results should be helpful in, for example, choosing the most suitable configuration for a small departmental computer cluster.

  9. A multimodal parallel architecture: A cognitive framework for multimodal interactions.

    PubMed

    Cohn, Neil

    2016-01-01

    Human communication is naturally multimodal, and substantial focus has examined the semantic correspondences in speech-gesture and text-image relationships. However, visual narratives, like those in comics, provide an interesting challenge to multimodal communication because the words and/or images can guide the overall meaning, and both modalities can appear in complicated "grammatical" sequences: sentences use a syntactic structure and sequential images use a narrative structure. These dual structures create complexity beyond those typically addressed by theories of multimodality where only a single form uses combinatorial structure, and also poses challenges for models of the linguistic system that focus on single modalities. This paper outlines a broad theoretical framework for multimodal interactions by expanding on Jackendoff's (2002) parallel architecture for language. Multimodal interactions are characterized in terms of their component cognitive structures: whether a particular modality (verbal, bodily, visual) is present, whether it uses a grammatical structure (syntax, narrative), and whether it "dominates" the semantics of the overall expression. Altogether, this approach integrates multimodal interactions into an existing framework of language and cognition, and characterizes interactions between varying complexity in the verbal, bodily, and graphic domains. The resulting theoretical model presents an expanded consideration of the boundaries of the "linguistic" system and its involvement in multimodal interactions, with a framework that can benefit research on corpus analyses, experimentation, and the educational benefits of multimodality. PMID:26491835

  10. Parallel Force Assay for Protein-Protein Interactions

    PubMed Central

    Aschenbrenner, Daniela; Pippig, Diana A.; Klamecka, Kamila; Limmer, Katja; Leonhardt, Heinrich; Gaub, Hermann E.

    2014-01-01

    Quantitative proteome research is greatly promoted by high-resolution parallel format assays. A characterization of protein complexes based on binding forces offers an unparalleled dynamic range and allows for the effective discrimination of non-specific interactions. Here we present a DNA-based Molecular Force Assay to quantify protein-protein interactions, namely the bond between different variants of GFP and GFP-binding nanobodies. We present different strategies to adjust the maximum sensitivity window of the assay by influencing the binding strength of the DNA reference duplexes. The binding of the nanobody Enhancer to the different GFP constructs is compared at high sensitivity of the assay. Whereas the binding strength to wild type and enhanced GFP are equal within experimental error, stronger binding to superfolder GFP is observed. This difference in binding strength is attributed to alterations in the amino acids that form contacts according to the crystal structure of the initial wild type GFP-Enhancer complex. Moreover, we outline the potential for large-scale parallelization of the assay. PMID:25546146

  11. Highly parallel characterization of IgG Fc binding interactions.

    PubMed

    Boesch, Austin W; Brown, Eric P; Cheng, Hao D; Ofori, Maame Ofua; Normandin, Erica; Nigrovic, Peter A; Alter, Galit; Ackerman, Margaret E

    2014-01-01

    Because the variable ability of the antibody constant (Fc) domain to recruit innate immune effector cells and complement is a major factor in antibody activity in vivo, convenient means of assessing these binding interactions is of high relevance to the development of enhanced antibody therapeutics, and to understanding the protective or pathogenic antibody response to infection, vaccination, and self. Here, we describe a highly parallel microsphere assay to rapidly assess the ability of antibodies to bind to a suite of antibody receptors. Fc and glycan binding proteins such as FcγR and lectins were conjugated to coded microspheres and the ability of antibodies to interact with these receptors was quantified. We demonstrate qualitative and quantitative assessment of binding preferences and affinities across IgG subclasses, Fc domain point mutants, and antibodies with variant glycosylation. This method can serve as a rapid proxy for biophysical methods that require substantial sample quantities, high-end instrumentation, and serial analysis across multiple binding interactions, thereby offering a useful means to characterize monoclonal antibodies, clinical antibody samples, and antibody mimics, or alternatively, to investigate the binding preferences of candidate Fc receptors. PMID:24927273

  12. Interaction of a turbulent vortex with a lifting surface

    NASA Technical Reports Server (NTRS)

    Lee, D. J.; Roberts, L.

    1985-01-01

    The impulsive noise due to blade-vortex interaction is analyzed in the time domain for the extreme case when the blade cuts through the center of the vortex core, with the assumptions of no distortion of the vortex path or of the vortex core. An analytical turbulent vortex core model, described in terms of the tip aerodynamic parameters, is used and its effects on the unsteady loading and maximum acoustic pressure during the interaction are determined.

  13. Mutual interaction between parallel Gaussian electromagnetic beams in plasmas

    SciTech Connect

    Sodha, Mahendra Singh; Agarwal, Sujeet Kumar; Sharma, Ashutosh

    2006-10-15

    In this paper, the interaction between two Gaussian electromagnetic beams in a plasma has been investigated, when the axes of the two beams are initially (z=0) parallel along the z axis in the x-z plane; the beams are initially propagating in the z direction. For the three types of nonlinearities (viz., collisional, ponderomotive, and relativistic) the dielectric function has been expressed as a function of the irradiances of the two beams; this expression for the dielectric function has been substituted in the wave equation and a solution of the resulting nonlinear equation obtained in the paraxial approximation. The paraxial approximation is justified since the phenomena of interest occur when the beams are initially close (√2 x₀ ≤ r₀). Further, the absorption of the beam in the plasma has been neglected, which is justified when the electron collision frequency is much less than the frequencies of the beams. Second-order coupled ordinary differential equations have been obtained for the distance between the centers of the beams and the beam widths in the x and y directions as a function of the distance of propagation along the z axis. The equations have been solved numerically for a range of parameters and a discussion of the results is presented.

  14. Interactive animation of fault-tolerant parallel algorithms

    SciTech Connect

    Apgar, S.W.

    1992-02-01

    Animation of algorithms makes understanding them intuitively easier. This paper describes the software tool Raft (Robust Animator of Fault Tolerant Algorithms). The Raft system allows the user to animate a number of parallel algorithms which achieve fault tolerant execution. In particular, we use it to illustrate the key Write-All problem. It has an extensive user-interface which allows a choice of the number of processors, the number of elements in the Write-All array, and the adversary to control the processor failures. The novelty of the system is that the interface allows the user to create new on-line adversaries as the algorithm executes.

  15. Bayesian seismic tomography by parallel interacting Markov chains

    NASA Astrophysics Data System (ADS)

    Gesret, Alexandrine; Bottero, Alexis; Romary, Thomas; Noble, Mark; Desassis, Nicolas

    2014-05-01

    The velocity field estimated by first arrival traveltime tomography is commonly used as a starting point for further seismological, mineralogical, tectonic or similar analysis. In order to interpret the results quantitatively, the tomography uncertainty values as well as their spatial distribution are required. The estimated velocity model is obtained through inverse modeling by minimizing an objective function that compares observed and computed traveltimes. This step is often performed by gradient-based optimization algorithms. The major drawback of such local optimization schemes, beyond the possibility of being trapped in a local minimum, is that they do not account for the multiple possible solutions of the inverse problem. They are therefore unable to assess the uncertainties linked to the solution. Within a Bayesian (probabilistic) framework, solving the tomography inverse problem aims at estimating the posterior probability density function of the velocity model using a global sampling algorithm. Markov chain Monte Carlo (MCMC) methods are known to produce samples of virtually any distribution. In such a Bayesian inversion, the total number of simulations we can afford is highly related to the computational cost of the forward model. Although fast algorithms have been recently developed for computing first arrival traveltimes of seismic waves, a complete exploration of the posterior distribution of the velocity model is rarely achievable, especially when it is high dimensional and/or multimodal. In the latter case, the chain may even stay stuck in one of the modes. In order to improve the mixing properties of classical single-chain MCMC, we propose to make several Markov chains at different temperatures interact. This method can make efficient use of large CPU clusters, without increasing the global computational cost with respect to classical MCMC, and is therefore particularly suited for Bayesian inversion. The exchanges between the chains allow a precise sampling of the high probability zones of the model space while preventing the chains from getting stuck in a probability maximum. This approach thus supplies a robust way to analyze the tomography imaging uncertainties. The interacting MCMC approach is illustrated on two synthetic examples of tomography of calibration shots such as those encountered in induced microseismic studies. In the second application, a wavelet-based model parameterization is presented that allows the dimension of the problem to be significantly reduced, thus making the algorithm efficient even for a complex velocity model.
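
    A compact sketch of the interacting-chains (parallel tempering) idea on a toy one-dimensional bimodal target; the seismic forward model, wavelet parameterization, and cluster-level parallelism of the paper are not reproduced, and all tuning values are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    def log_post(m):
        """Toy bimodal posterior standing in for the tomography misfit."""
        return np.logaddexp(-0.5 * ((m - 2.0) / 0.3) ** 2,
                            -0.5 * ((m + 2.0) / 0.3) ** 2)

    temps = [1.0, 2.0, 4.0, 8.0]            # one chain per temperature
    states = [0.0] * len(temps)
    samples = []

    for it in range(20000):
        # Independent Metropolis updates (these run in parallel on a cluster).
        for k, T in enumerate(temps):
            prop = states[k] + rng.normal(0.0, 0.5)
            if np.log(rng.random()) < (log_post(prop) - log_post(states[k])) / T:
                states[k] = prop
        # Exchange step between a random pair of neighbouring temperatures.
        k = rng.integers(len(temps) - 1)
        a = (1.0 / temps[k] - 1.0 / temps[k + 1]) * (log_post(states[k + 1]) - log_post(states[k]))
        if np.log(rng.random()) < a:
            states[k], states[k + 1] = states[k + 1], states[k]
        samples.append(states[0])           # keep only the T = 1 chain

    samples = np.array(samples[2000:])      # discard burn-in
    print("fraction of samples in each mode:", np.mean(samples > 0), np.mean(samples < 0))
    ```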

  16. Highly parallel measurements of interaction kinetic constants with a microfabricated optomechanical device

    NASA Astrophysics Data System (ADS)

    Bates, Steven R.; Quake, Stephen R.

    2009-08-01

    We used mechanical trapping of molecular interactions to demonstrate a highly parallel approach to measure the kinetics of biomolecular interactions. This approach consumes 25 fmol of material per measurement and permits 320 measurements in a single experiment. We measured association and dissociation curves for the interactions of 6-His and T7 epitope tags with their antibodies, from which we determined the off rates, on rates, and dissociation constants.
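
    For reference, the constants mentioned above are conventionally extracted by fitting association and dissociation curves to a 1:1 (pseudo-first-order) binding model of the following textbook form; the paper may use a variant, and S(t) denotes the measured signal while C is the analyte concentration.

    ```latex
    % Standard 1:1 binding kinetics used to fit association and
    % dissociation curves; S(t) is the signal, C the analyte concentration.
    \begin{align}
      \text{association:} \quad
        S(t) &= S_{\mathrm{eq}}\left(1 - e^{-(k_{\mathrm{on}} C + k_{\mathrm{off}})\,t}\right), \\
      \text{dissociation:} \quad
        S(t) &= S_{0}\, e^{-k_{\mathrm{off}}\, t}, \\
      \text{dissociation constant:} \quad
        K_D &= \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}}.
    \end{align}
    ```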

  17. Cell interaction with graphene microsheets: near-orthogonal cutting versus parallel attachment

    NASA Astrophysics Data System (ADS)

    Yi, Xin; Gao, Huajian

    2015-03-01

    Recent experiments indicate that graphene microsheets can either undergo a near-orthogonal cutting or a parallel attachment mode of interaction with cell membranes. Here we perform a theoretical analysis to characterize the deformed membrane microstructure and investigate how these two interaction modes are influenced by the splay, tilt, compression, tension, bending and adhesion energies of the membrane. Our analysis indicates that, driven by the membrane splay and tension energies, a two-dimensional microsheet such as graphene would adopt a near-perpendicular configuration with respect to the membrane in the transmembrane penetration mode, whereas the membrane bending and tension energies would lead to parallel attachment in the absence of cross membrane penetration. These interaction modes may have broad implications in applications involving drug delivery, cell encapsulation and protection, and the measurement of the dynamic cell response.

  18. Towards a high performance parallel library to compute fluid and flexible structures interactions

    NASA Astrophysics Data System (ADS)

    Nagar, Prateek

    The LBM-IB method is a useful and popular simulation technique adopted ubiquitously to solve fluid-structure interaction problems in computational fluid dynamics. These problems are known for utilizing computing resources intensively while solving the mathematical equations involved in the simulations. Problems involving such interactions are omnipresent; therefore, a fast and accurate algorithm is essential for solving these equations and reproducing a real-life model of such complex analytical problems in a shorter time period. LBM-IB, being inherently parallel, proves to be an ideal candidate for developing parallel software. This research focuses on developing a parallel software library, LBM-IB, based on the algorithm proposed by [1], which is the first of its kind to utilize the high-performance computing abilities of supercomputers procurable today. An initial sequential version of LBM-IB was developed and used as a benchmark for correctness and performance evaluation of the shared-memory parallel versions. Two shared-memory parallel versions of LBM-IB have been developed using OpenMP and the Pthread library, respectively. The OpenMP version scales well, achieving as much as 83% speedup on multicore machines with 8 cores. Based on the profiling and instrumentation of this version, a Pthread-based data-centric version was developed to improve data locality and increase the degree of parallelism; it outperforms the OpenMP version by 53% on manycore machines. A distributed version using MPI interfaces on top of the cube-based Pthread version has also been designed for use by extreme-scale distributed-memory manycore systems.
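
    The library itself is written in C with OpenMP/Pthreads and MPI. Purely as a language-neutral illustration of the strip-style domain decomposition with ghost rows that such shared-memory versions rely on, here is a toy Python multiprocessing sketch in which a simple Jacobi stencil stands in for the lattice update; none of the names correspond to the LBM-IB code.

    ```python
    import numpy as np
    from multiprocessing import Pool

    def relax_strip(strip):
        """Jacobi-style update of one horizontal strip (first/last rows are ghosts)."""
        up, mid, down = strip[:-2], strip[1:-1], strip[2:]
        left, right = np.roll(mid, 1, axis=1), np.roll(mid, -1, axis=1)
        return 0.25 * (up + down + left + right)   # interior rows only

    def parallel_step(grid, nworkers=4):
        n = grid.shape[0]
        bounds = np.linspace(0, n, nworkers + 1, dtype=int)
        # Each task gets its strip plus one ghost row above and below (periodic).
        tasks = [np.take(grid, range(lo - 1, hi + 1), axis=0, mode="wrap")
                 for lo, hi in zip(bounds[:-1], bounds[1:])]
        with Pool(nworkers) as pool:
            new_strips = pool.map(relax_strip, tasks)
        return np.vstack(new_strips)

    if __name__ == "__main__":
        g = np.random.default_rng(0).random((256, 256))
        for _ in range(10):
            g = parallel_step(g)
        print(g.mean())
    ```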

  19. Interactive Parallel Data Analysis within Data-Centric Cluster Facilities using the IPython Notebook

    NASA Astrophysics Data System (ADS)

    Pascoe, S.; Lansdowne, J.; Iwi, A.; Stephens, A.; Kershaw, P.

    2012-12-01

    The data deluge is making traditional analysis workflows for many researchers obsolete. Support for parallelism within popular tools such as matlab, IDL and NCO is not well developed and rarely used. However, parallelism is necessary for processing modern data volumes on a timescale conducive to curiosity-driven analysis. Furthermore, for peta-scale datasets such as the CMIP5 archive, it is no longer practical to bring an entire dataset to a researcher's workstation for analysis, or even to their institutional cluster. Therefore, there is an increasing need to develop new analysis platforms which both enable processing at the point of data storage and provide parallelism. Such an environment should, where possible, maintain the convenience and familiarity of our current analysis environments to encourage curiosity-driven research. We describe how we are combining the interactive python shell (IPython) with our JASMIN data-cluster infrastructure. IPython has been specifically designed to bridge the gap between the HPC-style parallel workflows and the opportunistic curiosity-driven analysis usually carried out using domain specific languages and scriptable tools. IPython offers a web-based interactive environment, the IPython notebook, and a cluster engine for parallelism, all underpinned by the well-respected Python/Scipy scientific programming stack. JASMIN is designed to support the data analysis requirements of the UK and European climate and earth system modeling community. JASMIN, with its sister facility CEMS focused on the earth observation community, has 4.5 PB of fast parallel disk storage alongside over 370 computing cores that provide local computation. Through the IPython interface to JASMIN, users can make efficient use of JASMIN's multi-core virtual machines to perform interactive analysis on all cores simultaneously or can configure IPython clusters across multiple VMs. Larger-scale clusters can be provisioned through JASMIN's batch scheduling system. Outputs can be summarised and visualised using the full power of Python's many scientific tools, including Scipy, Matplotlib, Pandas and CDAT. This rich user experience is delivered through the user's web browser, maintaining the interactive feel of a workstation-based environment with the parallel power of a remote data-centric processing facility.

  20. Parallel implementation of molecular dynamics simulation for short-ranged interaction

    NASA Astrophysics Data System (ADS)

    Wu, Jong-Shinn; Hsu, Yu-Lin; Lee, Yun-Min

    2005-08-01

    A parallel molecular dynamics simulation method, designed for large-scale problems, employing dynamic spatial domain decomposition for short-ranged molecular interactions is proposed. In this parallel cellular molecular dynamics (PCMD) simulation method, the link-cell data structure is used to reduce the searching time required for forming the cut-off neighbor list as well as for domain decomposition, which utilizes the multi-level graph-partitioning technique. A simple threshold scheme (STS), in which workload imbalance is monitored and compared with some threshold value during the runtime, is proposed to decide the proper time for repartitioning the domain. The simulation code is implemented and tested on a distributed-memory parallel machine, e.g., a PC-cluster system. Parallel performance is studied using approximately one million L-J atoms in the condensed, vaporized and supercritical states. Results show that fairly good parallel efficiency at 49 processors can be obtained for the condensed and supercritical states (˜60%), while it is comparatively lower for the vaporized state (˜40%).
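
    A serial sketch of the link-cell neighbour search mentioned above, under simplifying assumptions: a cubic periodic box, at least three cells per dimension, and no domain decomposition or STS load balancing; all names are illustrative.

    ```python
    import numpy as np
    from collections import defaultdict
    from itertools import product

    def linkcell_pairs(pos, box, rcut):
        """Return index pairs (i, j), i < j, with |ri - rj| < rcut in a periodic box."""
        ncell = int(box // rcut)          # cells at least rcut wide; assume ncell >= 3
        cell_size = box / ncell
        cells = defaultdict(list)
        for i, p in enumerate(pos):
            cells[tuple((p // cell_size).astype(int) % ncell)].append(i)

        pairs = []
        for key, members in cells.items():
            # Search this cell and its 26 neighbours (with periodic wrap).
            for off in product((-1, 0, 1), repeat=3):
                nkey = tuple((k + o) % ncell for k, o in zip(key, off))
                for i in members:
                    for j in cells.get(nkey, ()):
                        if j <= i:
                            continue
                        d = pos[i] - pos[j]
                        d -= box * np.round(d / box)     # minimum-image convention
                        if np.dot(d, d) < rcut * rcut:
                            pairs.append((i, j))
        return pairs

    rng = np.random.default_rng(0)
    box, rcut = 10.0, 2.5
    pos = rng.random((500, 3)) * box
    print(len(linkcell_pairs(pos, box, rcut)), "pairs within cutoff")
    ```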

  1. A Theory of Interactive Parallel Processing: New Capacity Measures and Predictions for a Response Time Inequality Series

    ERIC Educational Resources Information Center

    Townsend, James T.; Wenger, Michael J.

    2004-01-01

    The authors present a theory of stochastic interactive parallel processing with special emphasis on channel interactions and their relation to system capacity. The approach is based both on linear systems theory augmented with stochastic elements and decisional operators and on a metatheory of parallel channels' dependencies that incorporates…

  2. Interactions between glide dislocations and parallel interfacial dislocations in nanoscale strained layers

    SciTech Connect

    Akasheh, F.; Zbib, H. M.; Hirth, J. P.; Hoagland, R. G.; Misra, A.

    2007-08-01

    Plastic deformation in nanoscale multilayered structures is thought to proceed by the successive propagation of single dislocation loops at the interfaces. Based on this view, we simulate the effect of predeposited interfacial dislocation on the stress (channeling stress) needed to propagate a new loop parallel to existing loops. Single interfacial dislocations as well as finite parallel arrays are considered in the computation. When the gliding dislocation and the predeposited interfacial array have collinear Burgers vectors, the channeling stress increases monotonically as the density of dislocations in the array increases. In the case when their Burgers vectors are inclined at 60 deg. , a regime of perfect plasticity is observed which can be traced back to an instability in the flow stress arising from the interaction between the glide dislocation and a single interfacial dislocation dipole. This interaction leads to a tendency for dislocations of alternating Burgers vectors to propagate during deformation leading to nonuniform arrays. Inclusion of these parallel interactions in the analysis improves the strength predictions as compared with the measured strength of a Cu-Ni multilayered system in the regime where isolated glide dislocation motion controls flow, but does not help to explain the observed strength saturation when the individual layer thickness is in the few nanometer range.

  3. Parallel implementation of three-dimensional molecular dynamic simulation for laser-cluster interaction

    SciTech Connect

    Holkundkar, Amol R.

    2013-11-15

    The objective of this article is to report the parallel implementation of the 3D molecular dynamic simulation code for laser-cluster interactions. The benchmarking of the code has been done by comparing the simulation results with some of the experiments reported in the literature. Scaling laws for the computational time are established by varying the number of processor cores and the number of macroparticles used. The capabilities of the code are highlighted by implementing various diagnostic tools. The executable version of the code is available from the author for studying the dynamics of laser-cluster interactions.

  4. Propeller tip vortex interactions

    NASA Technical Reports Server (NTRS)

    Johnston, Robert T.; Sullivan, John P.

    1990-01-01

    Propeller wakes interacting with aircraft aerodynamic surfaces are a source of noise and vibration. For this reason, flow visualization work on the motion of the helical tip vortex over a wing and through the second stage of a counterrotation propeller (CRP) has been pursued. Initially, work was done on the motion of a propeller helix as it passes over the center of a 9.0 aspect ratio wing. The propeller tip vortex experiences significant spanwise displacements when passing across a lifting wing. A stationary propeller blade or stator was installed behind the rotating propeller to model the blade vortex interaction in a CRP. The resulting vortex interaction was found to depend on the relative vortex strengths and vortex sign.

  5. Engineering of parallel plasmonic-photonic interactions for on-chip refractive index sensors

    NASA Astrophysics Data System (ADS)

    Lin, Linhan; Zheng, Yuebing

    2015-07-01

    Ultra-narrow linewidth in the extinction spectrum of noble metal nanoparticle arrays induced by the lattice plasmon resonances (LPRs) is of great significance for applications in plasmonic lasers and plasmonic sensors. However, the challenge of sustaining LPRs in an asymmetric environment greatly restricts their practical applications, especially for high-performance on-chip plasmonic sensors. Herein, we fully study the parallel plasmonic-photonic interactions in both the Au nanodisk arrays (NDAs) and the core/shell SiO2/Au nanocylinder arrays (NCAs). Different from the dipolar interactions in the conventionally studied orthogonal coupling, the horizontal propagating electric field introduces the out-of-plane ``hot spots'' and results in electric field delocalization. Through controlling the aspect ratio to manipulate the ``hot spot'' distributions of the localized surface plasmon resonances (LSPRs) in the NCAs, we demonstrate a high-performance refractive index sensor with a wide dynamic range of refractive indexes ranging from 1.0 to 1.5. Both high figure of merit (FOM) and high signal-to-noise ratio (SNR) can be maintained under these detectable refractive indices. Furthermore, the electromagnetic field distributions confirm that the high FOM in the wide dynamic range is attributed to the parallel coupling between the superstrate diffraction orders and the height-induced LSPR modes. Our study on the near-field ``hot-spot'' engineering and far-field parallel coupling paves the way towards improved understanding of the parallel LPRs and the design of high-performance on-chip refractive index sensors.

  6. Formation of electron kappa distributions due to interactions with parallel propagating whistler waves

    SciTech Connect

    Tao, X.; Lu, Q.; Mengcheng National Geophysical Observatory, School of Earth and Space Sciences, University of Science and Technology of China, Hefei, Anhui 230026

    2014-02-15

    In space plasmas, charged particles are frequently observed to possess a high-energy tail, which is often modeled by a kappa-type distribution function. In this work, the formation of the electron kappa distribution during the generation of parallel propagating whistler waves is investigated using fully nonlinear particle-in-cell (PIC) simulations. Previous research concluded that the bi-Maxwellian character of electron distributions is preserved in PIC simulations. We now demonstrate that for interactions between electrons and parallel propagating whistler waves, a non-Maxwellian high-energy tail can be formed, and a kappa distribution can be used to fit the electron distribution in the time-asymptotic limit. The κ-parameter is found to decrease with increasing initial temperature anisotropy or decreasing ratio of electron plasma frequency to cyclotron frequency. The results might be helpful for understanding the origin of electron kappa distributions observed in space plasmas.
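
    For context, a commonly used isotropic form of the kappa distribution that such fits employ is given below (θ is an effective thermal speed; this particular normalization is one standard convention and is not necessarily the exact form used in the paper). As κ → ∞ it reduces to a Maxwellian, so a smaller κ corresponds to a harder suprathermal tail, consistent with the reported trend:

      f_\kappa(v) = \frac{n}{(\pi \kappa \theta^2)^{3/2}} \, \frac{\Gamma(\kappa+1)}{\Gamma(\kappa-1/2)} \left( 1 + \frac{v^2}{\kappa \theta^2} \right)^{-(\kappa+1)}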

  7. Parallel algorithms and applications of configuration-interaction shell-model code BIGSTICK

    NASA Astrophysics Data System (ADS)

    Krastev, Plamen; Johnson, Calvin; Ormand, Erich

    2010-11-01

    The nuclear shell model, together with two- and three-body interactions, is a powerful tool for gaining insight into the properties of light nuclei. Advanced computer resources are of major importance in such calculations. We report on the latest developments and applications of the configuration-interaction shell-model code BIGSTICK -- an efficient parallel on-the-fly code which solves the nuclear many-body problem with both two- and three-body interactions. The US Department of Energy supported this investigation through Contract Nos. DE-FG02-96ER40985 and DE-FC02-09ER41587 and through Subcontract No. B576152 of the Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344.

  8. Orbital-based insights into parallel-displaced and twisted conformations in π-π interactions.

    PubMed

    Lutz, Patricia B; Bayse, Craig A

    2013-06-21

    Dispersion and electrostatics are known to stabilize π-π interactions, but the preference for parallel-displaced (PD) and/or twisted (TW) over sandwiched (S) conformations is not well understood. Orbital interactions are generally believed to play little to no role in π-stacking. However, orbital analysis of the dimers of benzene, pyridine, cytosine and several polyaromatic hydrocarbons demonstrates that PD and/or TW structures convert one or more π-type dimer MOs with out-of-phase or antibonding inter-ring character at the S stack to in-phase or bonding in the PD/TW stack. This change in dimer MO character can be described in terms of a qualitative stack bond order (SBO) defined as the difference between the number of occupied in-phase/bonding and out-of-phase/antibonding inter-ring π-type MOs. The concept of an SBO is introduced here in analogy to the bond order in molecular orbital theory. Thus, whereas the SBO of the S structure is zero, parallel displacement or twisting of the stack results in a non-zero SBO and overall bonding character. The shift in bonding/antibonding character found at optimal PD/TW structures maximizes the inter-ring density, as measured by intermolecular Wiberg bond indices (WBIs). Values of WBIs calculated as a function of the parallel displacement are found to correlate with the dispersion and other contributions to the π-π interaction energy determined by the highly accurate density-fitting DFT symmetry adapted perturbation theory (DF-DFT-SAPT) method. These DF-DFT-SAPT calculations also suggest that the dispersion and other contributions are maximized at the PD conformation rather than the S when conducted on a potential energy curve where the inter-ring distance is optimized at fixed slip distances. From the results of this study, we conclude that descriptions of the qualitative manner in which orbitals interact within π-stacking interactions can supplement high-level calculations of the interaction energy and provide an intuitive tool for applications to crystal design, molecular recognition and other fields where non-covalent interactions are important. PMID:23665910

  9. Bioactive Hydrogel Substrates: Probing Leukocyte Receptor–Ligand Interactions in Parallel Plate Flow Chamber Studies

    PubMed Central

    Taite, Lakeshia J.; Rowland, Maude L.; Ruffino, Katie A.; Smith, Bryan R. E.; Lawrence, Michael B.

    2006-01-01

    The binding of activated integrins on the surface of leukocytes facilitates the adhesion of leukocytes to vascular endothelium during inflammation. Interactions between selectins and their ligands mediate rolling, and are believed to play an important role in leukocyte adhesion, though the minimal recognition motif required for physiologic interactions is not known. We have developed a novel system using poly(ethylene glycol) (PEG) hydrogels modified with either integrin-binding peptide sequences or the selectin ligand sialyl Lewis X (SLeX) within a parallel plate flow chamber to examine the dynamics of leukocyte adhesion to specific ligands. The adhesive peptide sequences arginine–glycine–aspartic acid–serine (RGDS) and leucine–aspartic acid–valine (LDV) as well as sialyl Lewis X were bound to the surface of photopolymerized PEG diacrylate hydrogels. Leukocytes perfused over these gels in a parallel plate flow chamber at physiological shear rates demonstrate both rolling and firm adhesion, depending on the identity and concentration of ligand bound to the hydrogel substrate. This new system provides a unique polymer-based model for the study of interactions between leukocytes and endothelium as well as a platform to develop improved scaffolds for cardiovascular tissue engineering. PMID:17031598
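
    As a point of reference for the "physiological shear rates" mentioned above, the wall shear stress in a parallel plate flow chamber is usually estimated from the plane-Poiseuille result for a chamber of width w and gap height h carrying a volumetric flow Q of a fluid with viscosity μ (a standard approximation, not a detail reported in this abstract):

      \tau_w = \frac{6 \mu Q}{w h^2}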

  10. Parallel PIC Simulations of Short-Pulse High Intensity Laser Plasma Interactions.

    NASA Astrophysics Data System (ADS)

    Lasinski, B. F.; Still, C. H.; Langdon, A. B.

    2001-10-01

    We extend our previous simulations of high intensity short pulse laser plasma interactions [B. F. Lasinski, A. B. Langdon, S. P. Hatchett, M. H. Key, and M. Tabak, Phys. Plasmas 6, 2041 (1999); S. C. Wilks and W. L. Kruer, IEEE Journal of Quantum Electronics 11, 1954 (1997)] to 3D and to much larger systems in 2D using our new, modern, 3D, electromagnetic, fully relativistic, massively parallel PIC code. We study the generation of hot electrons and energetic ions and the associated complex phenomena. Laser light filamentation and the formation of high static magnetic fields are described.

  11. Dynamical interaction effects on an electric dipole moving parallel to a flat solid surface

    SciTech Connect

    Villo-Perez, Isidro; Abril, Isabel; Garcia-Molina, Rafael; Arista, Nestor R.

    2005-05-15

    The interaction experienced by a fast electric dipole moving parallel and close to a flat solid surface is studied using the dielectric formalism. Analytical expressions for the force acting on the dipole, for random and for particular orientations, are obtained. Several features related to the dynamical effects on the induced forces are discussed, and numerical values are obtained for the different cases. The calculated energy loss of the electric dipole provides useful estimations which could be of interest for small-angle scattering experiments using polar molecules.

  12. Fluid/Structure Interaction Studies of Aircraft Using High Fidelity Equations on Parallel Computers

    NASA Technical Reports Server (NTRS)

    Guruswamy, Guru; VanDalsem, William (Technical Monitor)

    1994-01-01

    Aeroelasticity, which involves strong coupling of fluids, structures and controls, is an important element in designing an aircraft. Computational aeroelasticity using low fidelity methods such as the linear aerodynamic flow equations coupled with the modal structural equations is well advanced. Though these low fidelity approaches are computationally less intensive, they are not adequate for the analysis of modern aircraft such as the High Speed Civil Transport (HSCT) and Advanced Subsonic Transport (AST), which can experience complex flow/structure interactions. HSCT can experience vortex induced aeroelastic oscillations whereas AST can experience transonic buffet associated structural oscillations. Both aircraft may experience a dip in the flutter speed at the transonic regime. For accurate aeroelastic computations at these complex fluid/structure interaction situations, high fidelity equations such as the Navier-Stokes for fluids and the finite-elements for structures are needed. Computations using these high fidelity equations require large computational resources both in memory and speed. Conventional supercomputers have reached their limitations both in memory and speed. As a result, parallel computers have evolved to overcome the limitations of conventional computers. This paper will address the transition that is taking place in computational aeroelasticity from conventional computers to parallel computers. The paper will address special techniques needed to take advantage of the architecture of new parallel computers. Results will be illustrated from computations made on the iPSC/860 and IBM SP2 computers using the ENSAERO code, which directly couples the Euler/Navier-Stokes flow equations with high resolution finite-element structural equations.

  13. A Parallel Monolithic Approach for Fluid-Structure Interaction in a Cerebral Aneurysm

    NASA Astrophysics Data System (ADS)

    Sahin, Mehmet; Eken, Ali

    2014-11-01

    A parallel fully-coupled approach has been developed for the fluid-structure interaction problem in a cerebral artery with aneurysm. An Arbitrary Lagrangian-Eulerian formulation based on the side-centered unstructured finite volume method is employed for the governing incompressible Navier-Stokes equations and the classical Galerkin finite element formulation is used to discretize the constitutive law for the Saint Venant-Kirchhoff material in a Lagrangian frame for the solid domain. The time integration method for the structure domain is based on the energy conserving mid-point method while the second-order backward difference is used within the fluid domain. The resulting large-scale algebraic linear equations are solved using a one-level restricted additive Schwarz preconditioner with a block-incomplete factorization within each partitioned sub-domain. The parallel implementation of the present fully coupled unstructured fluid-structure solver is based on the PETSc library. The proposed numerical algorithm is initially validated for several classical benchmark problems and then applied to a more complicated problem involving unsteady pulsatile blood flow in a cerebral artery with aneurysm as a realistic fluid-structure interaction problem encountered in biomechanics. The authors acknowledge financial support from the Turkish National Scientific and Technical Research Council through Project Number 112M107.

  14. Parallel changes of taxonomic interaction networks in lacustrine bacterial communities induced by a polymetallic perturbation

    PubMed Central

    Laplante, Karine; Boutin, Sébastien; Derome, Nicolas

    2013-01-01

    Heavy metals released by anthropogenic activities such as mining trigger profound changes to bacterial communities. In this study we used 16S SSU rRNA gene high-throughput sequencing to characterize the impact of a polymetallic perturbation and other environmental parameters on taxonomic networks within five lacustrine bacterial communities from sites located near Rouyn-Noranda, Quebec, Canada. The results showed that community equilibrium was disturbed in terms of both diversity and structure. Moreover, heavy metals, especially cadmium combined with water acidity, induced parallel changes among sites via the selection of resistant OTUs (Operational Taxonomic Units) and taxonomic dominance perturbations favoring the Alphaproteobacteria. Furthermore, under a similar selective pressure, covariation trends between phyla revealed conservation and parallelism within interphylum interactions. Our study sheds light on the importance of analyzing communities not only from a phylogenetic perspective but also including a quantitative approach to provide significant insights into the evolutionary forces that shape the dynamics of the taxonomic interaction networks in bacterial communities. PMID:23789031

  15. Electromagnetic semitransparent δ-function plate: Casimir interaction energy between parallel infinitesimally thin plates

    NASA Astrophysics Data System (ADS)

    Parashar, Prachi; Milton, Kimball A.; Shajesh, K. V.; Schaden, M.

    2012-10-01

    We derive boundary conditions for electromagnetic fields on a δ-function plate. The optical properties of such a plate are shown to necessarily be anisotropic in that they only depend on the transverse properties of the plate. We unambiguously obtain the boundary conditions for a perfectly conducting δ-function plate in the limit of infinite dielectric response. We show that a material does not “optically vanish” in the thin-plate limit. The thin-plate limit of a plasma slab of thickness d with plasma frequency ω_p^2 = ζ_p/d reduces to a δ-function plate for frequencies (ω = iζ) satisfying ζd ≪ ζ_p d ≪ 1. We show that the Casimir interaction energy between two parallel perfectly conducting δ-function plates is the same as that for parallel perfectly conducting slabs. Similarly, we show that the interaction energy between an atom and a perfect electrically conducting δ-function plate is the usual Casimir-Polder energy, which is verified by considering the thin-plate limit of dielectric slabs. The “thick” and “thin” boundary conditions considered by Bordag are found to be identical in the sense that they lead to the same electromagnetic fields.
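
    For comparison, the benchmark that the δ-function-plate result is said to reproduce is the textbook Casimir energy per unit area between two parallel, perfectly conducting plates separated by a distance d:

      \frac{E(d)}{A} = -\frac{\pi^2 \hbar c}{720\, d^3}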

  16. A Force-Based, Parallel Assay for the Quantification of Protein-DNA Interactions

    PubMed Central

    Limmer, Katja; Pippig, Diana A.; Aschenbrenner, Daniela; Gaub, Hermann E.

    2014-01-01

    Analysis of transcription factor binding to DNA sequences is of utmost importance to understand the intricate regulatory mechanisms that underlie gene expression. Several techniques exist that quantify DNA-protein affinity, but they are either very time-consuming or suffer from possible misinterpretation due to complicated algorithms or approximations like many high-throughput techniques. We present a more direct method to quantify DNA-protein interaction in a force-based assay. In contrast to single-molecule force spectroscopy, our technique, the Molecular Force Assay (MFA), parallelizes force measurements so that it can test one or multiple proteins against several DNA sequences in a single experiment. The interaction strength is quantified by comparison to the well-defined rupture stability of different DNA duplexes. As a proof-of-principle, we measured the interaction of the zinc finger construct Zif268/NRE against six different DNA constructs. We could show the specificity of our approach and quantify the strength of the protein-DNA interaction. PMID:24586920

  17. Structure and effective interactions in parallel monolayers of charged spherical colloids.

    PubMed

    Contreras-Aburto, C; Méndez-Alcaraz, J M; Castañeda-Priego, R

    2010-05-01

    We study the microstructure and the effective interactions of model suspensions consisting of Yukawa-like colloidal particles homogeneously distributed in equally spaced parallel planar monolayers. All the particles interact with each other, but particle transfer between monolayers is not allowed. The spacing between the layers defines the effective system dimensionality. When the layer spacing is comparable to the particle size, the system shows quasi-three-dimensional behavior, whereas for large distances the layers behave as effective two-dimensional systems. We find that effective attractions between like-charged particles can be triggered by adjusting the interlayer spacing, showing that the distance between adjacent layers is an excellent control parameter for the effective interparticle interactions. Our study is based on Brownian dynamics simulations and the integral equations theory of liquids. The effective potentials are accounted for by exploiting the invariance of the Ornstein-Zernike matrix equation under contractions of the description, and on assuming that the difference between bare and effective bridge functions can be neglected. We find that the hypernetted chain approximation does not account properly for the effective interactions in layered systems. PMID:20459160
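
    The contraction-of-the-description argument above operates on the Ornstein-Zernike relation; for a homogeneous one-component fluid it reads as follows (the layered system in the paper uses the corresponding matrix generalization):

      h(r_{12}) = c(r_{12}) + \rho \int c(r_{13})\, h(r_{32})\, \mathrm{d}\mathbf{r}_3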

  18. Porting the ion-solid interaction code TRIM-RC to a parallel computer

    NASA Astrophysics Data System (ADS)

    Knapp, J. A.; Brice, D. K.; Doyle, B. L.

    1999-10-01

    TRIM-RC is a Monte Carlo ion-solid interactions code, widely used for calculations of ion implantation profiles and in ion beam analysis problems. It has heretofore been used on single-processor machines. A new ion beam analysis project at Sandia involves measuring hydrogen profiles in neutron tube components using forward recoil scattering. To calibrate the analyses, TRIM calculations with large numbers of ion trajectories were needed, well beyond what was practical with a single processor. To meet this requirement, TRIM-RC was re-written to run on the Teraflop parallel computer at Sandia. Many of the considerations which were involved in the port, including random number generation and load leveling, will be described and are common to any Monte Carlo calculation being transferred to a parallel computer. The increase in speed we achieved on the Teraflop is over three orders of magnitude relative to a single-processor workstation. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under Contract DE-AC04-94AL85000.
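
    Two of the porting issues mentioned above, per-processor random number generation and load leveling, are commonly handled by giving each rank an independent random stream and a strided share of the trajectories. The sketch below is a minimal Python/NumPy illustration of that generic pattern, not the TRIM-RC implementation:

      import numpy as np

      def rng_for_rank(global_seed, rank, nranks):
          # Every rank deterministically spawns the same list of child seed
          # sequences and keeps the one matching its rank, so streams do not overlap.
          child = np.random.SeedSequence(global_seed).spawn(nranks)[rank]
          return np.random.default_rng(child)

      def my_trajectories(n_total, rank, nranks):
          # Simple static load leveling: rank r simulates trajectories r, r+nranks, ...
          return range(rank, n_total, nranks)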

  19. Determination of interaction forces between parallel dislocations by the evaluation of J integrals of plane elasticity

    NASA Astrophysics Data System (ADS)

    Lubarda, Vlado A.

    2016-03-01

    The Peach-Koehler expressions for the glide and climb components of the force exerted on a straight dislocation in an infinite isotropic medium by another straight dislocation are derived by evaluating the plane and antiplane strain versions of J integrals around the center of the dislocation. After expressing the elastic fields as the sums of elastic fields of each dislocation, the energy momentum tensor is decomposed into three parts. It is shown that only one part, involving mixed products from the two dislocation fields, makes a nonvanishing contribution to J integrals and the corresponding dislocation forces. Three examples are considered, with dislocations on parallel or intersecting slip planes. For two edge dislocations on orthogonal slip planes, there are two equilibrium configurations in which the glide and climb components of the dislocation force simultaneously vanish. The interactions between two different types of screw dislocations and a nearby circular void, as well as between parallel line forces in an infinite or semi-infinite medium, are then evaluated.
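
    For orientation, two of the classical Peach-Koehler results that such J-integral evaluations recover are the radial force per unit length between parallel screw dislocations a distance r apart and the glide force between parallel edge dislocations with Burgers vectors b_1, b_2 along x (G is the shear modulus, ν the Poisson ratio); these are standard textbook forms quoted here only for reference:

      F^{\mathrm{screw}} = \frac{G b_1 b_2}{2\pi r}, \qquad F_x^{\mathrm{edge}} = \frac{G b_1 b_2}{2\pi(1-\nu)} \, \frac{x(x^2 - y^2)}{(x^2 + y^2)^2}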

  20. Use of Hilbert Curves in Parallelized CUDA code: Interaction of Interstellar Atoms with the Heliosphere

    NASA Astrophysics Data System (ADS)

    Destefano, Anthony; Heerikhuisen, Jacob

    2015-04-01

    Fully 3D particle simulations can be a computationally and memory expensive task, especially when high resolution grid cells are required. The problem becomes further complicated when parallelization is needed. In this work we focus on computational methods to solve these difficulties. Hilbert curves are used to map the 3D particle space to the 1D contiguous memory space. This method of organization minimizes cache misses on the GPU and yields a sorted structure equivalent to an octree. This type of sorted structure is attractive for use in adaptive mesh implementations due to the logarithmic search time. Implementations using the Message Passing Interface (MPI) library and NVIDIA's parallel computing platform CUDA will be compared, as MPI is commonly used on server nodes with many CPUs. We will also compare static grid structures with adaptive mesh structures. The physical test bed will be heavy interstellar atoms interacting with a background plasma, the heliosphere, simulated with a fully consistent coupled MHD/kinetic particle code. It is known that charge exchange is an important factor in space plasmas; specifically, it modifies the structure of the heliosphere itself. We would like to thank the Alabama Supercomputer Authority for the use of their computational resources.
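
    To illustrate the locality-preserving mapping described above, the following minimal Python function computes the Hilbert-curve index of a cell on a 2D grid; the work itself uses the 3D curve inside CUDA/MPI code, so this is only a sketch of the idea of folding multidimensional cell coordinates into one contiguous index:

      def hilbert_index(order, x, y):
          # Map cell (x, y) on a 2^order x 2^order grid to its 1D position along
          # the Hilbert curve, so that nearby cells tend to get nearby indices.
          n = 1 << order
          d = 0
          s = n // 2
          while s > 0:
              rx = 1 if (x & s) > 0 else 0
              ry = 1 if (y & s) > 0 else 0
              d += s * s * ((3 * rx) ^ ry)
              if ry == 0:               # rotate/reflect the quadrant so the
                  if rx == 1:           # sub-square is in standard orientation
                      x = n - 1 - x
                      y = n - 1 - y
                  x, y = y, x
              s //= 2
          return d

    Sorting particles by such an index groups spatial neighbors contiguously in memory, which is what reduces cache misses on the GPU.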

  1. Massively parallel measurements of molecular interaction kinetics on a microfluidic platform

    PubMed Central

    Geertz, Marcel; Shore, David; Maerkl, Sebastian J.

    2012-01-01

    Quantitative biology requires quantitative data. No high-throughput technologies exist capable of obtaining several hundred independent kinetic binding measurements in a single experiment. We present an integrated microfluidic device (k-MITOMI) for the simultaneous kinetic characterization of 768 biomolecular interactions. We applied k-MITOMI to the kinetic analysis of transcription factor (TF)-DNA interactions, measuring the detailed kinetic landscapes of the mouse TF Zif268, and the yeast TFs Tye7p, Yox1p, and Tbf1p. We demonstrated the integrated nature of k-MITOMI by expressing, purifying, and characterizing 27 additional yeast transcription factors in parallel on a single device. Overall, we obtained 2,388 association and dissociation curves of 223 unique molecular interactions with equilibrium dissociation constants ranging from 2 × 10^-6 M to 2 × 10^-9 M, and dissociation rate constants of approximately 6 s^-1 to 8.5 × 10^-3 s^-1. Association rate constants were uniform across 3 TF families, ranging from 3.7 × 10^6 M^-1 s^-1 to 9.6 × 10^7 M^-1 s^-1, and are well below the diffusion limit. We expect that k-MITOMI will contribute to our quantitative understanding of biological systems and accelerate the development and characterization of engineered systems. PMID:23012409
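
    The rate and equilibrium constants reported above are related through the usual one-step bimolecular binding model that kinetic fits of association/dissociation curves typically assume; this is quoted for reference and is not a description of the k-MITOMI fitting procedure itself:

      \mathrm{TF} + \mathrm{DNA} \;\rightleftharpoons\; \mathrm{TF\!\cdot\!DNA}, \qquad K_d = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}}, \qquad k_{\mathrm{obs}} = k_{\mathrm{on}}[\mathrm{TF}] + k_{\mathrm{off}}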

  2. Parallel kinetic Monte Carlo simulation framework incorporating accurate models of adsorbate lateral interactions

    NASA Astrophysics Data System (ADS)

    Nielsen, Jens; d'Avezac, Mayeul; Hetherington, James; Stamatakis, Michail

    2013-12-01

    Ab initio kinetic Monte Carlo (KMC) simulations have been successfully applied for over two decades to elucidate the underlying physico-chemical phenomena on the surfaces of heterogeneous catalysts. These simulations necessitate detailed knowledge of the kinetics of elementary reactions constituting the reaction mechanism, and the energetics of the species participating in the chemistry. The information about the energetics is encoded in the formation energies of gas and surface-bound species, and the lateral interactions between adsorbates on the catalytic surface, which can be modeled at different levels of detail. The majority of previous works accounted for only pairwise-additive first nearest-neighbor interactions. More recently, cluster-expansion Hamiltonians incorporating long-range interactions and many-body terms have been used for detailed estimations of catalytic rate [C. Wu, D. J. Schmidt, C. Wolverton, and W. F. Schneider, J. Catal. 286, 88 (2012)]. In view of the increasing interest in accurate predictions of catalytic performance, there is a need for general-purpose KMC approaches incorporating detailed cluster expansion models for the adlayer energetics. We have addressed this need by building on the previously introduced graph-theoretical KMC framework, and we have developed Zacros, a FORTRAN2003 KMC package for simulating catalytic chemistries. To tackle the high computational cost in the presence of long-range interactions we introduce parallelization with OpenMP. We further benchmark our framework by simulating a KMC analogue of the NO oxidation system established by Schneider and co-workers [J. Catal. 286, 88 (2012)]. We show that taking into account only first nearest-neighbor interactions may lead to large errors in the prediction of the catalytic rate, whereas for accurate estimates thereof, one needs to include long-range terms in the cluster expansion.
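
    As a deliberately minimal illustration of the lowest level of adlayer-energetics modeling discussed above, the sketch below applies a pairwise-additive first-nearest-neighbor correction to an Arrhenius rate. All numerical values are placeholders, and Zacros itself is a FORTRAN2003 package, so this Python toy only conveys the idea:

      import math

      KB_EV = 8.617333262e-5  # Boltzmann constant in eV/K

      def desorption_rate(n_occupied_neighbors, nu=1.0e13, ea=1.2,
                          eps_nn=0.05, temperature=500.0):
          # Each occupied first nearest neighbor shifts the activation energy by
          # eps_nn (eV); nu is the attempt frequency (1/s). Illustrative numbers only.
          ea_eff = ea + n_occupied_neighbors * eps_nn
          return nu * math.exp(-ea_eff / (KB_EV * temperature))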

  3. Parallel kinetic Monte Carlo simulation framework incorporating accurate models of adsorbate lateral interactions

    SciTech Connect

    Nielsen, Jens; D’Avezac, Mayeul; Hetherington, James; Stamatakis, Michail

    2013-12-14

    Ab initio kinetic Monte Carlo (KMC) simulations have been successfully applied for over two decades to elucidate the underlying physico-chemical phenomena on the surfaces of heterogeneous catalysts. These simulations necessitate detailed knowledge of the kinetics of elementary reactions constituting the reaction mechanism, and the energetics of the species participating in the chemistry. The information about the energetics is encoded in the formation energies of gas and surface-bound species, and the lateral interactions between adsorbates on the catalytic surface, which can be modeled at different levels of detail. The majority of previous works accounted for only pairwise-additive first nearest-neighbor interactions. More recently, cluster-expansion Hamiltonians incorporating long-range interactions and many-body terms have been used for detailed estimations of catalytic rate [C. Wu, D. J. Schmidt, C. Wolverton, and W. F. Schneider, J. Catal. 286, 88 (2012)]. In view of the increasing interest in accurate predictions of catalytic performance, there is a need for general-purpose KMC approaches incorporating detailed cluster expansion models for the adlayer energetics. We have addressed this need by building on the previously introduced graph-theoretical KMC framework, and we have developed Zacros, a FORTRAN2003 KMC package for simulating catalytic chemistries. To tackle the high computational cost in the presence of long-range interactions we introduce parallelization with OpenMP. We further benchmark our framework by simulating a KMC analogue of the NO oxidation system established by Schneider and co-workers [J. Catal. 286, 88 (2012)]. We show that taking into account only first nearest-neighbor interactions may lead to large errors in the prediction of the catalytic rate, whereas for accurate estimates thereof, one needs to include long-range terms in the cluster expansion.

  4. Parallel application of the experiential analysis technique with subject and hypnotist: a new possibility for measuring interactional synchrony.

    PubMed

    Varga, K; Bányai, E I; Gösi-Greguss, A C

    1994-04-01

    The Parallel Experiential Analysis Technique (PEAT), a new method for gathering data on the subjective experiences of both the hypnotist and the subject, is described. The PEAT is an interactional modification of the Experiential Analysis Technique (EAT). Procedural details and methodological observations resulting from the modification of the EAT are discussed. Suggestions on how to characterize the phenomenology of the hypnotic interaction and to determine the degree of interactional synchrony on the subjective level between the hypnotist and subject are made. PMID:8200715

  5. 3D magnetospheric parallel hybrid multi-grid method applied to planet-plasma interactions

    NASA Astrophysics Data System (ADS)

    Leclercq, L.; Modolo, R.; Leblanc, F.; Hess, S.; Mancini, M.

    2016-03-01

    We present a new method to exploit multiple refinement levels within a 3D parallel hybrid model, developed to study planet-plasma interactions. This model is based on the hybrid formalism: ions are kinetically treated whereas electrons are considered as an inertia-less fluid. Generally, ions are represented by numerical particles whose size equals the volume of the cells. Particles that leave a coarse grid subsequently entering a refined region are split into particles whose volume corresponds to the volume of the refined cells. The number of refined particles created from a coarse particle depends on the grid refinement rate. In order to conserve velocity distribution functions and to avoid calculations of average velocities, particles are not coalesced. Moreover, to ensure the constancy of particles' shape function sizes, the hybrid method is adapted to allow refined particles to move within a coarse region. Another innovation of this approach is the method developed to compute grid moments at interfaces between two refinement levels. Indeed, the hybrid method is adapted to accurately account for the special grid structure at the interfaces, avoiding any overlapping grid considerations. Some fundamental test runs were performed to validate our approach (e.g. quiet plasma flow, Alfven wave propagation). Lastly, we also show a planetary application of the model, simulating the interaction between Jupiter's moon Ganymede and the Jovian plasma.

  6. Parallel computation of fluid-structural interactions using high resolution upwind schemes

    NASA Astrophysics Data System (ADS)

    Hu, Zongjun

    An efficient and accurate solver is developed to simulate the non-linear fluid-structural interactions in turbomachinery flutter flows. A new low-diffusion E-CUSP scheme, the Zha CUSP scheme, is developed to improve the efficiency and accuracy of the inviscid flux computation. The 3D unsteady Navier-Stokes equations with the Baldwin-Lomax turbulence model are solved using the finite volume method with the dual-time stepping scheme. The linearized equations are solved with Gauss-Seidel line iterations. The parallel computation is implemented using the MPI protocol. The solver is validated with 2D cases for its turbulence modeling, parallel computation and unsteady calculation. The Zha CUSP scheme is validated with 2D cases, including a supersonic flat plate boundary layer, a transonic converging-diverging nozzle and a transonic inlet diffuser. The Zha CUSP2 scheme is tested with 3D cases, including a circular-to-rectangular nozzle, a subsonic compressor cascade and a transonic channel. The Zha CUSP schemes are proved to be accurate, robust and efficient in these tests. The steady and unsteady separation flows in a 3D stationary cascade under high incidence and three inlet Mach numbers are calculated to study the steady state separation flow patterns and their unsteady oscillation characteristics. The leading edge vortex shedding is the mechanism behind the unsteady characteristics of the high incidence separated flows. The separation flow characteristics are affected by the inlet Mach number. The blade aeroelasticity of a linear cascade with forced oscillating blades is studied using parallel computation. A simplified two-passage cascade with periodic boundary condition is first calculated under a medium frequency and a low incidence. The full scale cascade with 9 blades and two end walls is then studied more extensively under three oscillation frequencies and two incidence angles. The end wall influence and the blade stability are studied and compared under different frequencies and incidence angles. This work represents the first application of the Zha CUSP schemes to moving grid systems and to 2D and 3D calculations, the first use of the implicit Gauss-Seidel iteration with dual time stepping for moving grid systems, and the first full-scale calculation of the NASA flutter cascade.

  7. Distinct cerebellar lobules process arousal, valence and their interaction in parallel following a temporal hierarchy.

    PubMed

    Styliadis, Charis; Ioannides, Andreas A; Bamidis, Panagiotis D; Papadelis, Christos

    2015-04-15

    The cerebellum participates in emotion-related neural circuits formed by different cortical and subcortical areas, which sub-serve arousal and valence. Recent neuroimaging studies have shown a functional specificity of cerebellar lobules in the processing of emotional stimuli. However, little is known about the temporal component of this process. The goal of the current study is to assess the spatiotemporal profile of neural responses within the cerebellum during the processing of arousal and valence. We hypothesized that the excitation and timing of distinct cerebellar lobules is influenced by the emotional content of the stimuli. By using magnetoencephalography, we recorded magnetic fields from twelve healthy human individuals while passively viewing affective pictures rated along arousal and valence. By using a beamformer, we localized gamma-band activity in the cerebellum across time and related the foci of activity to the anatomical organization of the cerebellum. Successive cerebellar activations were observed within distinct lobules starting ~160 ms after stimulus onset. Arousal was processed within both vermal (VI and VIIIa) and hemispheric (left Crus II) lobules. Valence (left VI) and its interaction (left V and left Crus I) with arousal were processed only within hemispheric lobules. Arousal processing was identified first, at early latencies (160 ms), and was long-lived (until 980 ms). In contrast, the processing of valence and its interaction with arousal was short-lived and occurred at later stages (420-530 ms and 570-640 ms, respectively). Our findings provide for the first time evidence that distinct cerebellar lobules process arousal, valence, and their interaction in a parallel yet temporally hierarchical manner determined by the emotional content of the stimuli. PMID:25665964

  8. Experimental Studies of the Interaction Between a Parallel Shear Flow and a Directionally-Solidifying Front

    NASA Technical Reports Server (NTRS)

    Zhang, Meng; Maxworthy, Tony

    1999-01-01

    It has long been recognized that flow in the melt can have a profound influence on the dynamics of a solidifying interface and hence the quality of the solid material. In particular, flow affects the heat and mass transfer, and causes spatial and temporal variations in the flow and melt composition. This results in a crystal with nonuniform physical properties. Flow can be generated by buoyancy, expansion or contraction upon phase change, and thermo-soluto capillary effects. In general, these flows can not be avoided and can have an adverse effect on the stability of the crystal structures. This motivates crystal growth experiments in a microgravity environment, where buoyancy-driven convection is significantly suppressed. However, transient accelerations (g-jitter) caused by the acceleration of the spacecraft can affect the melt, while convection generated from effects other than buoyancy remains important. Rather than bemoan the presence of convection as a source of interfacial instability, Hurle in the 1960s suggested that flow in the melt, either forced or natural convection, might be used to stabilize the interface. Delves considered the imposition of both a parabolic velocity profile and a Blasius boundary layer flow over the interface. He concluded that fast stirring could stabilize the interface to perturbations whose wave vector is in the direction of the fluid velocity. Forth and Wheeler considered the effect of the asymptotic suction boundary layer profile. They showed that the effect of the shear flow was to generate travelling waves parallel to the flow with a speed proportional to the Reynolds number. There have been few quantitative, experimental works reporting on the coupling effect of fluid flow and morphological instabilities. Huang studied plane Couette flow over cells and dendrites. It was found that this flow could greatly enhance the planar stability and even induce the cell-planar transition. A rotating impeller was buried inside the sample cell, driven by an outside rotating magnet, in order to generate the flow. However, it appears that this was not a well-controlled flow and may also have been unsteady. In the present experimental study, we examine how a forced parallel shear flow in a Hele-Shaw cell interacts with the directionally solidifying crystal interface. The comparison of experimental data shows that the parallel shear flow in a Hele-Shaw cell has a strong stabilizing effect on the planar interface by damping the existing initial perturbations. The flow also shows a stabilizing effect on the cellular interface by slightly reducing the exponential growth rate of cells. The left-right symmetry of cells is broken by the flow, with cells tilting toward the incoming flow direction. The tilting angle increases with the velocity ratio. The experimental results are explained through the effect of the parallel flow on lateral solute transport. The phenomenon of cells tilting against the flow is consistent with the numerical result of Dantzig and Chao.

  9. Interaction of a Rectangular Jet with a Flat-Plate Placed Parallel to the Flow

    NASA Technical Reports Server (NTRS)

    Zaman, K. B. M. Q.; Brown, C. A.; Bridges, J. A.

    2013-01-01

    An experimental study is carried out addressing the flowfield and radiated noise from the interaction of a large aspect ratio rectangular jet with a flat plate placed parallel to but away from the direct path of the jet. Sound pressure level spectra exhibit an increase in the noise levels for both the 'reflected' and 'shielded' sides of the plate relative to the free-jet case. Detailed cross-sectional distributions of flowfield properties obtained by hot-wire anemometry are documented for a low subsonic condition. Corresponding mean Mach number distributions obtained by Pitot-probe surveys are presented for high subsonic conditions. In the latter flow regime and for certain relative locations of the plate, a flow resonance accompanied by audible tones is encountered. Under the resonant condition the jet cross-section experiences an 'axis-switching' and flow visualization indicates the presence of an organized 'vortex street'. The trends of the resonant frequency variation with flow parameters exhibit some similarities to, but also marked differences with, corresponding trends of the well-known edgetone phenomenon.

  10. Large-scale massively parallel atomistic simulations of short pulse laser interaction with metals

    NASA Astrophysics Data System (ADS)

    Wu, Chengping; Zhigilei, Leonid; Computational Materials Group Team

    2014-03-01

    Taking advantage of petascale supercomputing architectures, large-scale massively parallel atomistic simulations (10^8-10^9 atoms) are performed to study the microscopic mechanisms of short pulse laser interaction with metals. The results of the simulations reveal a complex picture of highly non-equilibrium processes responsible for material modification and/or ejection. At low laser fluences below the ablation threshold, fast melting and resolidification occur under conditions of extreme heating and cooling rates resulting in surface microstructure modification. At higher laser fluences in the spallation regime, the material is ejected by the relaxation of laser-induced stresses and proceeds through the nucleation, growth and percolation of multiple voids in the sub-surface region of the irradiated target. At a fluence of ~ 2.5 times the spallation threshold, the top part of the target reaches the conditions for an explosive decomposition into vapor and small droplets, marking the transition to the phase explosion regime of laser ablation. The dynamics of plume formation and the characteristics of the ablation plume are obtained from the simulations and compared with the results of time-resolved plume imaging experiments. Financial support for this work was provided by NSF (DMR-0907247 and CMMI-1301298) and AFOSR (FA9550-10-1-0541). Computational support was provided by the OLCF (MAT048) and XSEDE (TG-DMR110090).

  11. Parallelization of the Flow Field Dependent Variation Scheme for Solving the Triple Shock/Boundary Layer Interaction Problem

    NASA Technical Reports Server (NTRS)

    Schunk, Richard Gregory; Chung, T. J.

    2001-01-01

    A parallelized version of the Flowfield Dependent Variation (FDV) Method is developed to analyze a problem of current research interest, the flowfield resulting from a triple shock/boundary layer interaction. Such flowfields are often encountered in the inlets of high speed air-breathing vehicles including the NASA Hyper-X research vehicle. In order to resolve the complex shock structure and to provide adequate resolution for boundary layer computations of the convective heat transfer from surfaces inside the inlet, models containing over 500,000 nodes are needed. Efficient parallelization of the computation is essential to achieving results in a timely manner. Results from a parallelization scheme, based upon multi-threading, as implemented on multiple processor supercomputers and workstations are presented.

  12. Parallelization of the Flow Field Dependent Variation Scheme for Solving the Triple Shock/Boundary Layer Interaction Problem

    NASA Technical Reports Server (NTRS)

    Schunk, Greg; Chung, T. J.

    1999-01-01

    A parallelized version of the Flowfield Dependent Variation (FDV) Method is developed to analyze a problem of current research interest, the flowfield resulting from a triple shock/boundary layer interaction. Such flowfields are often encountered in the inlets of high speed air-breathing vehicles including NASA's Hyper-X. In order to resolve the complex shock structure and to provide adequate resolution for boundary layer computations of the convective heat transfer from surfaces inside the inlet, models containing over 500,000 nodes are needed. Efficient parallelization of the computation is essential to obtaining the results in a timely manner. Results from different parallelization schemes, based upon multi-threading and message passing, as implemented on multiple processor supercomputers and on distributed workstations are compared.

  13. Parallel diffusion of energetic particles interacting with noisy reduced MHD turbulence

    NASA Astrophysics Data System (ADS)

    Reimer, A.; Shalchi, A.

    2016-03-01

    We analytically investigate parallel diffusion in noisy reduced magnetohydrodynamic (NRMHD) turbulence. We employ different theories such as quasi-linear theory, second-order quasi-linear theory, and the weakly non-linear theory to compute the parallel diffusion coefficient. Our analytical findings are compared with test-particle simulations performed previously. We demonstrate systematically that quasi-linear theory does not work for the turbulence model considered here because it provides an infinite parallel diffusion coefficient. The second-order theory, on the other hand, provides a finite parallel mean free path which is, however, too large. Only by using the weakly non-linear theory can we reproduce the simulations; thus, we conclude that resonance broadening due to perpendicular diffusion is an important effect when it comes to particle transport along the mean field in NRMHD turbulence.
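
    For reference, the quasi-linear-type theories compared above all feed a pitch-angle scattering coefficient D_{μμ} into the standard expressions for the parallel spatial diffusion coefficient and mean free path (v is the particle speed and μ the pitch-angle cosine); an infinite κ_∥ then corresponds to the quasi-linear breakdown described in the abstract:

      \kappa_\parallel = \frac{v^2}{8} \int_{-1}^{+1} \frac{(1-\mu^2)^2}{D_{\mu\mu}(\mu)}\, \mathrm{d}\mu, \qquad \lambda_\parallel = \frac{3 \kappa_\parallel}{v}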

  14. Request queues for interactive clients in a shared file system of a parallel computing system

    DOEpatents

    Bent, John M.; Faibish, Sorin

    2015-08-18

    Interactive requests are processed from users of log-in nodes. A metadata server node is provided for use in a file system shared by one or more interactive nodes and one or more batch nodes. The interactive nodes comprise interactive clients to execute interactive tasks and the batch nodes execute batch jobs for one or more batch clients. The metadata server node comprises a virtual machine monitor; an interactive client proxy to store metadata requests from the interactive clients in an interactive client queue; a batch client proxy to store metadata requests from the batch clients in a batch client queue; and a metadata server to store the metadata requests from the interactive client queue and the batch client queue in a metadata queue based on an allocation of resources by the virtual machine monitor. The metadata requests can be prioritized, for example, based on one or more of a predefined policy and predefined rules.
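
    The sketch below is a toy version of the queueing structure described above: two per-class proxy queues merged into a single metadata queue under a priority policy. The interactive-before-batch rule and FIFO tie-breaking are illustrative assumptions, not the patented allocation scheme:

      import heapq
      import itertools

      class MetadataQueue:
          PRIORITY = {"interactive": 0, "batch": 1}  # assumed policy: interactive first

          def __init__(self):
              self._heap = []
              self._order = itertools.count()  # FIFO tie-breaker within a class

          def push(self, client_class, request):
              heapq.heappush(self._heap,
                             (self.PRIORITY[client_class], next(self._order), request))

          def pop(self):
              # Return the highest-priority pending metadata request.
              return heapq.heappop(self._heap)[2]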

  15. Structural Studies on Porphyrin-PNA Conjugates in Parallel PNA:PNA Duplexes: Effect of Stacking Interactions on Helicity.

    PubMed

    Accetta, Alessandro; Petrovic, Ana G; Marchelli, Rosangela; Berova, Nina; Corradini, Roberto

    2015-12-01

    Parallel PNA:PNA duplexes were synthesized and conjugated with meso-tris(pyridyl)phenylporphyrin carboxylic acid at the N-terminus. The introduction of one porphyrin unit was shown to slightly affect the stability of the PNA:PNA parallel duplex, whereas the presence of two porphyrin units at the same end resulted in a dramatic increase of the melting temperature, accompanied by hysteresis between melting and cooling curves. The circular dichroism (CD) profile of the Soret band and fluorescence quenching strongly support the occurrence of a face-to-face interaction between the two porphyrin units. Introduction of an L-lysine residue at the C-terminal of one strand of the parallel duplex induced a left-handed helical structure in the PNA:PNA duplex if the latter contains only one or no porphyrin moiety. The left-handed helicity was revealed by the nucleobase CD profile at 240-280 nm and by the induced CD observed in the presence of the DiSC2(5) cyanine dye at ~500-550 nm. Surprisingly, the presence of two porphyrin units led to the disappearance of the nucleobase CD signal and the absence of CD exciton coupling within the Soret band region. In addition, a dramatic decrease of the induced CD of DiSC2(5) was observed. These results are in agreement with a model where the porphyrin-porphyrin interactions cause partial loss of chirality of the PNA:PNA parallel duplex, forcing it to adopt a ladder-like conformation. PMID:26412743

  16. Parallel PIC Simulations of Ultra-High Intensity Laser Plasma Interactions.

    NASA Astrophysics Data System (ADS)

    Lasinski, B. F.; Still, C. H.; Langdon, A. B.; Wilks, S. C.; Hatchett, S. P.; Hinkel, D. E.

    1999-11-01

    We extend our previous simulations of high intensity short pulse laser plasma interactions [B. F. Lasinski, A. B. Langdon, S. P. Hatchett, M. H. Key, and M. Tabak, Phys. Plasmas 6, 2041 (1999); S. C. Wilks and W. L. Kruer, IEEE Journal of Quantum Electronics 11, 1954 (1997)] to 3D and to much larger systems in 2D using our new, modern, 3D, electromagnetic, fully relativistic, massively parallel PIC code. Our simulation parameters are guided by the recent Petawatt experiments at Livermore. We study the generation of hot electrons and energetic ions and the associated complex phenomena. Laser light filamentation and the formation of high static magnetic fields are described.

  17. MPI parallelization of Vlasov codes for the simulation of nonlinear laser-plasma interactions

    NASA Astrophysics Data System (ADS)

    Savchenko, V.; Won, K.; Afeyan, B.; Decyk, V.; Albrecht-Marc, M.; Ghizzo, A.; Bertrand, P.

    2003-10-01

    The simulation of optical mixing driven KEEN waves [1] and electron plasma waves [1] in laser-produced plasmas requires nonlinear kinetic models and massive parallelization. We use Message Passing Interface (MPI) libraries and Appleseed [2] to solve the Vlasov Poisson system of equations on an 8 node dual processor MAC G4 cluster. We use the semi-Lagrangian time splitting method [3]. It requires only row-column exchanges in the global data redistribution, minimizing the total number of communications between processors. Recurrent communication patterns for 2D FFTs involve global transposition. In the Vlasov-Maxwell case, we use splitting into two 1D spatial advections and a 2D momentum advection [4]. Discretized momentum advection equations have a double loop structure with the outer index being assigned to different processors. We adhere to a code structure with separate routines for calculations and data management for parallel computations. [1] B. Afeyan et al., IFSA 2003 Conference Proceedings, Monterey, CA [2] V. K. Decyk, Computers in Physics, 7, 418 (1993) [3] Sonnendrucker et al., JCP 149, 201 (1998) [4] Begue et al., JCP 151, 458 (1999)
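
    The semi-Lagrangian time-splitting step referred to above amounts to shifting each 1D slice of the distribution function along its characteristic and re-interpolating. The serial Python/NumPy sketch below shows the x-advection substep on a periodic grid; the production codes are MPI-parallel and use higher-order interpolation, so this is only the basic idea:

      import numpy as np

      def advect_x(f, x, v, dt, L):
          # One substep of df/dt + v df/dx = 0 on the periodic domain [0, L):
          # evaluate f at the departure points x - v*dt by linear interpolation.
          xd = (x - v * dt) % L
          xp = np.concatenate([x, [L]])       # pad one point for the periodic wrap
          fp = np.concatenate([f, [f[0]]])
          return np.interp(xd, xp, fp)

      def advect_vlasov_x(f, x, v, dt, L):
          # x-advection of a phase-space array f[ix, iv]: each velocity column
          # is shifted by its own speed v[iv].
          out = np.empty_like(f)
          for iv, vi in enumerate(v):
              out[:, iv] = advect_x(f[:, iv], x, vi, dt, L)
          return out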

  18. Parallel adaptive fluid-structure interaction simulation of explosions impacting on building structures

    SciTech Connect

    Deiterding, Ralf; Wood, Stephen L

    2013-01-01

    We pursue a level set approach to couple an Eulerian shock-capturing fluid solver with space-time refinement to an explicit solid dynamics solver for large deformations and fracture. The coupling algorithms considering recursively finer fluid time steps as well as overlapping solver updates are discussed in detail. Our ideas are implemented in the AMROC adaptive fluid solver framework and are used for effective fluid-structure coupling to the general purpose solid dynamics code DYNA3D. Besides simulations verifying the coupled fluid-structure solver and assessing its parallel scalability, the detailed structural analysis of a reinforced concrete column under blast loading and the simulation of a prototypical blast explosion in a realistic multistory building are presented.

  19. Electrophysiological interaction through the interstitial space between adjacent unmyelinated parallel fibers.

    PubMed Central

    Barr, R C; Plonsey, R

    1992-01-01

    The influence of interstitial or extracellular potentials on propagation usually has been ignored, often through assuming these potentials to be insignificantly different from zero, presumably because both measurements and calculations become much more complex when interstitial interactions are included. This study arose primarily from an interest in cardiac muscle, where it has been well established that substantial interstitial potentials occur in tightly packed structures, e.g., tens of millivolts within the ventricular wall. We analyzed the electrophysiological interaction between two adjacent unmyelinated fibers within a restricted extracellular space. Numerical evaluations made use of two linked core-conductor models and Hodgkin-Huxley membrane properties. Changes in transmembrane potentials induced in the second fiber ranged from nonexistent with large intervening volumes to large enough to initiate excitation when fibers were coupled by interstitial currents through a small interstitial space. With equal interstitial and intracellular longitudinal conductivities and close coupling, the interaction was large enough (induced Vm approximately 20 mV peak-to-peak) that action potentials from one fiber initiated excitation in the other, for the 40-microns radius evaluated. With close coupling but no change in structure, propagation velocity in the first fiber varied from 1.66 mm/ms (when both fibers were simultaneously stimulated) to 2.84 mm/ms (when the second fiber remained passive). Although normal propagation through interstitial interaction is unlikely, the magnitudes of the electrotonic interactions were large and may have a substantial modulating effect on function. PMID:1600078

  20. Tn-seq; high-throughput parallel sequencing for fitness and genetic interaction studies in microorganisms

    PubMed Central

    van Opijnen, Tim; Bodi, Kip L.; Camilli, Andrew

    2009-01-01

    Biological pathways are structured in complex networks of interacting genes. Solving the architecture of such networks may provide valuable information, such as how microorganisms cause disease. Here we present a method (Tn-seq) for accurately determining quantitative genetic interactions on a genome-wide scale in microorganisms. Tn-seq is based on the assembly of a saturated Mariner transposon insertion library. After library selection, changes in frequency of each insertion mutant are determined by sequencing of the flanking regions en masse. These changes are used to calculate each mutant’s fitness. Fitness was determined for each gene of the gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis. A genome-wide screen for genetic interactions identified both alleviating and aggravating interactions that could be further divided into seven distinct categories. Due to the wide activity of the Mariner transposon, Tn-seq has the potential to contribute to the exploration of complex pathways across many different species. PMID:19767758
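
    The fitness calculation sketched above compares each insertion mutant's frequency before and after selection while correcting for the overall expansion of the population. A small Python version of the commonly used estimator of this kind is shown below; the exact normalization and filtering steps of the actual Tn-seq pipeline may differ:

      import math

      def insertion_fitness(f_start, f_end, expansion):
          # f_start, f_end: mutant frequency before/after selection (0 < f < 1);
          # expansion: fold-expansion of the whole population during selection.
          # Fitness is the mutant's log growth relative to that of everyone else.
          mutant_growth = math.log(f_end * expansion / f_start)
          others_growth = math.log((1.0 - f_end) * expansion / (1.0 - f_start))
          return mutant_growth / others_growth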

  1. Gamma ray bursts from comet neutron star magnetosphere interaction, field twisting and E sub parallel formation

    SciTech Connect

    Colgate, S.A.

    1990-01-01

    Consider the problem of a comet in a collision trajectory with a magnetized neutron star. The question addressed in this paper is whether the comet interacts strongly enough with a magnetic field such as to capture at a large radius or whether in general the comet will escape a magnetized neutron star. 6 refs., 4 figs.

  2. The grid-based fast multipole method--a massively parallel numerical scheme for calculating two-electron interaction energies.

    PubMed

    Toivanen, Elias A; Losilla, Sergio A; Sundholm, Dage

    2015-12-21

    Algorithms and working expressions for a grid-based fast multipole method (GB-FMM) have been developed and implemented. The computational domain is divided into cubic subdomains, organized in a hierarchical tree. The contribution to the electrostatic interaction energies from pairs of neighboring subdomains is computed using numerical integration, whereas the contributions from further apart subdomains are obtained using multipole expansions. The multipole moments of the subdomains are obtained by numerical integration. Linear scaling is achieved by translating and summing the multipoles according to the tree structure, such that each subdomain interacts with a number of subdomains that are almost independent of the size of the system. To compute electrostatic interaction energies of neighboring subdomains, we employ an algorithm which performs efficiently on general purpose graphics processing units (GPGPU). Calculations using one CPU for the FMM part and 20 GPGPUs consisting of tens of thousands of execution threads for the numerical integration algorithm show the scalability and parallel performance of the scheme. For calculations on systems consisting of Gaussian functions (α = 1) distributed as fullerenes from C20 to C720, the total computation time and relative accuracy (ppb) are independent of the system size. PMID:26006111

  3. Parallel Three-Dimensional Computation of Fluid Dynamics and Fluid-Structure Interactions of Ram-Air Parachutes

    NASA Technical Reports Server (NTRS)

    Tezduyar, Tayfun E.

    1998-01-01

    This is a final report as far as our work at University of Minnesota is concerned. The report describes our research progress and accomplishments in development of high performance computing methods and tools for 3D finite element computation of aerodynamic characteristics and fluid-structure interactions (FSI) arising in airdrop systems, namely ram-air parachutes and round parachutes. This class of simulations involves complex geometries, flexible structural components, deforming fluid domains, and unsteady flow patterns. The key components of our simulation toolkit are a stabilized finite element flow solver, a nonlinear structural dynamics solver, an automatic mesh moving scheme, and an interface between the fluid and structural solvers; all of these have been developed within a parallel message-passing paradigm.

  4. OSIRIS - an object-oriented parallel 3D PIC code for modeling laser and particle beam-plasma interaction

    NASA Astrophysics Data System (ADS)

    Hemker, Roy

    1999-11-01

    The advances in computational speed make it now possible to do full 3D PIC simulations of laser plasma and beam plasma interactions, but at the same time the increased complexity of these problems makes it necessary to apply modern approaches like object oriented programming to the development of simulation codes. We report here on our progress in developing an object oriented parallel 3D PIC code using Fortran 90. In its current state the code contains algorithms for 1D, 2D, and 3D simulations in cartesian coordinates and for 2D cylindrically-symmetric geometry. For all of these algorithms the code allows for a moving simulation window and arbitrary domain decomposition for any number of dimensions. Recent 3D simulation results on the propagation of intense laser and electron beams through plasmas will be presented.

  5. DNS of hydrodynamically interacting droplets in turbulent clouds: Parallel implementation and scalability analysis using 2D domain decomposition

    NASA Astrophysics Data System (ADS)

    Ayala, Orlando; Parishani, Hossein; Chen, Liu; Rosa, Bogdan; Wang, Lian-Ping

    2014-12-01

    The study of turbulent collision of cloud droplets requires simultaneous consideration of the transport by background air turbulence (i.e., geometric collision rate) and the influence of droplet disturbance flows (i.e., collision efficiency). In recent years, this multiscale problem has been addressed through a hybrid direct numerical simulation (HDNS) approach (Ayala et al., 2007). This approach, while currently the only viable tool to quantify the effects of air turbulence on collision statistics, is computationally expensive. In order to extend the HDNS approach to higher flow Reynolds numbers, here we developed a highly scalable implementation of the approach using 2D domain decomposition. The scalability of the parallel implementation was studied using several parallel computers, at 512^3 and 1024^3 grid resolutions with O(10^6)-O(10^7) droplets. It was found that the execution time scaled almost linearly with the number of processors until it saturated and then deteriorated due to communication latency issues. To better understand the scalability, we developed a complexity analysis by partitioning the execution tasks into computation, communication, and data copy. Using this complexity analysis, we were able to predict the scalability performance of our parallel code. Furthermore, the theory was used to estimate the maximum number of processors below which the approximately linear scalability is sustained. We theoretically showed that we could efficiently solve problems of up to 8192^3 with O(100,000) processors. The complexity analysis revealed that the pseudo-spectral simulation of background turbulent flow for a dilute droplet suspension typical of cloud conditions typically takes about 80% of the total execution time, except when the droplets are small (less than 5 μm in a flow with energy dissipation rate of 400 cm^2/s^3 and liquid water content of 1 g/m^3), in which case the particle-particle hydrodynamic interactions become the bottleneck. The complexity analysis was also used to explore some alternative methods to handle FFT calculations within the flow simulation and to advance droplets less than 5 μm in radius, for better computational efficiency. Finally, preliminary results are reported to shed light on the Reynolds-number dependence of the collision kernel of non-interacting droplets.
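
    The kind of complexity analysis described can be sketched as a timing model partitioned into computation and communication terms; the coefficients below are invented for illustration and are not the values fitted in the study, but they reproduce the qualitative behaviour of near-linear scaling followed by latency-dominated saturation.

```python
import numpy as np

# Toy timing model: computation scales as work/P, communication adds a latency
# term and a bandwidth term; all coefficients are placeholders.
def predicted_time(P, N=1024, n_drop=1e7,
                   t_flop=1e-9, t_lat=2e-5, t_word=5e-9):
    fft_work = 5.0 * N**3 * np.log2(N) * t_flop          # pseudo-spectral flow solve
    drop_work = 200.0 * n_drop * t_flop                  # droplet advance + interactions
    comm = 100 * t_lat * np.log2(P) + t_word * N**3 / P  # transposes / halo exchange
    return (fft_work + drop_work) / P + comm

t_ref = predicted_time(256)
for P in (256, 1024, 4096, 16384, 65536):
    t = predicted_time(P)
    eff = t_ref * 256 / (t * P)                          # efficiency relative to P = 256
    print(f"P={P:6d}  T={t:8.4f} s  parallel efficiency ~ {eff:.2f}")
```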

  6. Effects of sex steroids on bones and muscles: Similarities, parallels, and putative interactions in health and disease.

    PubMed

    Carson, James A; Manolagas, Stavros C

    2015-11-01

    Estrogens and androgens influence the growth and maintenance of bones and muscles and are responsible for their sexual dimorphism. A decline in their circulating levels leads to loss of mass and functional integrity in both tissues. In the article, we highlight the similarities of the molecular and cellular mechanisms of action of sex steroids in the two tissues; the commonality of a critical role of mechanical forces on tissue mass and function; emerging evidence for an interplay between mechanical forces and hormonal and growth factor signals in both bones and muscles; as well as the current state of evidence for or against a cross-talk between muscles and bone. In addition, we review evidence for the parallels in the development of osteoporosis and sarcopenia with advancing age and the potential common mechanisms responsible for the age-dependent involution of these two tissues. Lastly, we discuss the striking difference in the availability of several drug therapies for the prevention and treatment of osteoporosis, as compared to none for sarcopenia. This article is part of a Special Issue entitled "Muscle Bone Interactions". PMID:26453497

  7. Interaction of parallel strike-slip faults and a characteristic distance in the spatial distribution of active faults

    NASA Astrophysics Data System (ADS)

    Kato, Naoyuki; Lei, Xinglin

    2001-01-01

    A numerical simulation of the activities of many parallel strike-slip faults is performed to explore the effect of the interaction of fault slip on the spatial distribution of active faults. In the model, a large number of faults with random strengths are embedded in an elastic layer (lithosphere) over a Maxwell-type viscoelastic half-space (asthenosphere), and shear loading at a constant strain rate is applied. When slip takes place on a model fault, shear stress is decreased around the fault and then recovered with time due to the viscoelastic response of the asthenosphere. The decrease in shear stress prohibits the occurrence of another earthquake around the slipped fault, resulting in the existence of a characteristic distance between active faults. This characteristic distance is found to be controlled by the thickness of the elastic layer, the strain rate and the viscoelastic relaxation time. The density of simulated active faults increases with the strain rate, consistent with observations of active faults in Japan. Furthermore, the present simulation result may explain the characteristic distance which breaks the fractal structure of the spatial distribution of active faults in Japan, which was discovered by Lei & Kusunose (1999).
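
    A toy version of such a model conveys the mechanism: faults with random strengths are loaded at a constant rate, each slip event imposes a stress decrease on its neighbours, and that decrease relaxes on a viscoelastic time scale. The functional forms and parameter values below are illustrative assumptions, not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
nf = 200                                   # parallel faults along a transect
xf = np.sort(rng.random(nf)) * 100.0       # fault positions (km)
strength = rng.normal(3.0, 0.3, nf)        # random failure strengths (MPa)
load_rate, drop, dt = 1e-3, 2.0, 1.0       # MPa/yr, MPa, yr (placeholders)
tau, d0 = 200.0, 15.0                      # relaxation time (yr), interaction length (km)

stress = rng.random(nf) * strength         # initial stresses below failure
shadow = np.zeros(nf)                      # transient stress decrease from nearby slip
n_events = np.zeros(nf, int)

for t in range(50000):
    stress += load_rate * dt               # constant-strain-rate tectonic loading
    shadow *= np.exp(-dt / tau)            # viscoelastic recovery of the stress shadow
    failed = np.where(stress - shadow > strength)[0]
    for i in failed:
        stress[i] -= drop                                   # coseismic stress drop
        shadow += drop * np.exp(-np.abs(xf - xf[i]) / d0)   # shadow on neighbours
        n_events[i] += 1

print("fraction of faults that slipped at least once:", (n_events > 0).mean())
```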

  8. Development of a Multi-Grids Approach into a Parallelized Hybrid Model to Describe Ganymede's Interaction with the Jovian Plasma

    NASA Astrophysics Data System (ADS)

    Leclercq, L.; Modolo, R.; Leblanc, F.; Hess, S. L.; Andre, N.

    2014-12-01

    Ganymede is the only satellite which has its own magnetosphere, which is embedded in the Jovian magnetosphere (Kivelson et al. 1996). This peculiar interaction has been investigated by means of a 3D parallel multi-species hybrid model based on a CAM-CL algorithm (Mathews et al. 1994). In this formalism, ions have a kinetic description whereas electrons are treated as an inertialess fluid which ensures the neutrality of the plasma and contributes to the total current and electron pressure. Maxwell's equations are solved to compute the temporal evolution of the electromagnetic field. Hybrid simulations are performed on a uniform cartesian grid with a spatial resolution of about 240 km. Our results are globally consistent with other models and with Galileo measurements. Nevertheless, our description of the magnetopause and the ionosphere is not satisfactory because of the low spatial resolution: we want to describe scale heights of 125 km in the ionosphere, whereas the best spatial resolution we can afford is about 240 km. Therefore, in order to obtain more efficient and relevant results, it is necessary to refine the grid. To this end, we are introducing a multi-grid approach to refine the spatial resolution by a factor of 2 (~120 km) near Ganymede. The presence of a finer mesh in the simulation grid requires special computations at the interfaces between the two grids, both for the calculation of moments, such as charge density or current, and for the computation of the electromagnetic fields. Moreover, the parallelization of the code, based on domain decomposition methods, requires careful treatment of boundary conditions. In the hybrid model, macroparticles, which represent clouds of physical particles, have a volume equal to that of a grid cell. Macroparticles entering the higher-resolution region are therefore split into smaller macroparticles whose volume corresponds to that of a cell of the finer mesh. The improved spatial resolution of the hybrid model will also allow us to couple its results with those of our 3D multi-species exospheric model (Turc et al. 2014) in a test-particle model that describes the ionosphere of Ganymede. Basic tests and validation results of the multi-grid approach are presented.
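
    The macroparticle-splitting step at the coarse-to-fine interface might look like the sketch below, in which one parent of cell-sized volume is replaced by eight children of equal weight placed at the sub-cell centres. The placement rule and numbers are assumptions for illustration, not the authors' scheme.

```python
import numpy as np

def split_macroparticle(x, v, weight, coarse_dx, refine=2):
    """Split one macroparticle entering a refined region into refine**3 children.

    The children tile the coarse-cell-sized cloud of the parent, each carrying an
    equal share of the physical particles, so charge and momentum are conserved.
    Toy illustration only."""
    offsets = (np.arange(refine) + 0.5) / refine - 0.5     # e.g. [-0.25, +0.25]
    children = []
    for ox in offsets:
        for oy in offsets:
            for oz in offsets:
                xc = x + coarse_dx * np.array([ox, oy, oz])
                children.append((xc, v.copy(), weight / refine**3))
    return children

kids = split_macroparticle(np.zeros(3), np.array([400.0, 0.0, 0.0]),
                           weight=1e20, coarse_dx=240.0)
print(len(kids), "children, total weight:", sum(w for _, _, w in kids))
```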

  9. Parallel Atomistic Simulations

    SciTech Connect

    HEFFELFINGER,GRANT S.

    2000-01-18

    Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed: replicated data decomposition, spatial decomposition, and force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories: those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods, such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination, are also reviewed, and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains, are discussed.
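
    As a minimal illustration of the spatial-decomposition idea mentioned above, the snippet below bins atoms into cubic cells so that each cell (plus its neighbour shell) can be assigned to a process; it is a generic sketch, not taken from any of the reviewed codes.

```python
import numpy as np

def spatial_decomposition(pos, box, ncell):
    """Assign atoms to cubic cells; each cell (plus its 26 neighbours) is the
    natural unit of work and communication in a spatially decomposed MD code."""
    cell = np.floor(pos / box * ncell).astype(int) % ncell
    owners = {}
    for i, c in enumerate(map(tuple, cell)):
        owners.setdefault(c, []).append(i)
    return owners

rng = np.random.default_rng(3)
pos = rng.random((5000, 3)) * 20.0        # atoms in a 20 x 20 x 20 box
owners = spatial_decomposition(pos, 20.0, 4)
print("atoms in cell (0,0,0):", len(owners.get((0, 0, 0), [])))
```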

  10. Parallel rendering

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1995-01-01

    This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.

  11. Scalable parallel methods for monolithic coupling in fluid-structure interaction with application to blood flow modeling

    SciTech Connect

    Barker, Andrew T.; Cai, Xiao-Chuan

    2010-02-01

    We introduce and study numerically a scalable parallel finite element solver for the simulation of blood flow in compliant arteries. The incompressible Navier-Stokes equations are used to model the fluid and coupled to an incompressible linear elastic model for the blood vessel walls. Our method features an unstructured dynamic mesh capable of modeling complicated geometries, an arbitrary Lagrangian-Eulerian framework that allows for large displacements of the moving fluid domain, monolithic coupling between the fluid and structure equations, and fully implicit time discretization. Simulations based on blood vessel geometries derived from patient-specific clinical data are performed on large supercomputers using scalable Newton-Krylov algorithms preconditioned with an overlapping restricted additive Schwarz method that preconditions the entire fluid-structure system together. The algorithm is shown to be robust and scalable for a variety of physical parameters, scaling to hundreds of processors and millions of unknowns.
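
    The inner Krylov-Schwarz building block can be sketched with SciPy on a toy 1D Poisson system: GMRES preconditioned by a two-subdomain overlapping additive Schwarz operator. The paper uses a restricted additive Schwarz variant inside a Newton-Krylov loop on the full fluid-structure system, so this is only a schematic analogue with made-up sizes.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy 1D Poisson matrix standing in for the monolithic fluid-structure system.
n = 200
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# Two overlapping subdomains; each contributes an exact local subdomain solve.
half, ov = n // 2, 10
domains = [np.arange(0, half + ov), np.arange(half - ov, n)]
local_lu = [spla.splu(A[d, :][:, d].tocsc()) for d in domains]

def additive_schwarz(r):
    z = np.zeros_like(r)
    for d, lu in zip(domains, local_lu):
        z[d] += lu.solve(r[d])        # sum of local solves on the overlapping pieces
    return z

M = spla.LinearOperator((n, n), matvec=additive_schwarz, dtype=float)
x, info = spla.gmres(A, b, M=M)
print("GMRES converged:", info == 0, " residual:", np.linalg.norm(b - A @ x))
```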

  12. Stochastic gyroresonant electron acceleration in a low-beta plasma. I - Interaction with parallel transverse cold plasma waves

    NASA Technical Reports Server (NTRS)

    Steinacker, Juergen; Miller, James A.

    1992-01-01

    The gyroresonance of electrons with parallel transverse cold plasma waves is considered, and the Fokker-Planck equation describing the evolution of the electron distribution function in the presence of a spectrum of turbulence is derived. A new resonance which produces a divergence in the Fokker-Planck coefficients is identified; it results when the electron is in gyroresonance with a wave that has a group velocity equal to the velocity of the electron along the magnetic field. Under the assumption of a power-law spectral density, the Fokker-Planck coefficients are calculated numerically, and their complicated momentum and pitch-angle dependence, as well as the influence of various approximations to the dispersion relation, gyroresonance condition, and spectral density are discussed. It is found that there is no resonance gap at any pitch angle as long as the full gyroresonance condition is used and waves propagating in both directions are present.

  13. Three-dimensional reconnection in the outer heliosphere: interactions between parallel current sheets, and the effects of interstellar pick-up ions

    NASA Astrophysics Data System (ADS)

    Gingell, Peter; Burgess, David; Matteini, Lorenzo

    2015-04-01

    We examine the evolution of a three-dimensional system comprising a series of closely packed, parallel current sheets. Each individual current sheet may be subject to a tearing instability, and hence generate magnetic islands and hot populations of ions associated with magnetic reconnection. However, previous studies have shown that a drift-kink instability can significantly affect the three-dimensional evolution of each current sheet, leading to an effective widening that can reduce reconnection rates and limit magnetic island formation compared to the two-dimensional case. This system also introduces the possibility of interaction between adjacent current sheets, leading to a complex magnetic topology, perpendicular particle transport, and a turbulent end-state. The evolution of this system has important consequences for the structure of the outer heliosphere, where pile-up of parallel current sheets is expected to produce a sectored heliosheath. In order to better model this region, we also introduce a population of interstellar H+ pick-up ions, which may dominate the pressure in the region and significantly alter the spectra of the otherwise largely monochromatic drift-kink instability. We will discuss the evolution of this system with particular focus on particle heating and transport, and the turbulent spectrum of the fluctuations generated by current sheet interactions.

  14. 3-D Hybrid Kinetic Modeling of the Interaction Between the Solar Wind and Lunar-like Exospheric Pickup Ions in Case of Oblique/ Quasi-Parallel/Parallel Upstream Magnetic Field

    NASA Technical Reports Server (NTRS)

    Lipatov, A. S.; Farrell, W. M.; Cooper, J. F.; Sittler, E. C., Jr.; Hartle, R. E.

    2015-01-01

    The interactions between the solar wind and Moon-sized objects are determined by a set of solar wind parameters and the plasma environment of the space objects. The orientation of the upstream magnetic field is one of the key factors which determines the formation and structure of the bow shock wave/Mach cone or Alfven wing near the obstacle. The study of the effects of the direction of the upstream magnetic field on the lunar-like plasma environment is the main subject of our investigation in this paper. Photoionization, electron-impact ionization and charge exchange are included in our hybrid model. The computational model includes the self-consistent dynamics of the light (H+, He+) and heavy (Na+) pickup ions. The lunar interior is considered as a weakly conducting body. Our previous 2013 lunar work, as reported in this journal, found formation of a triple structure of the Mach cone near the Moon in the case of a perpendicular upstream magnetic field. Further advances in modeling now reveal the presence of strong wave activity in the upstream solar wind and plasma wake in the cases of quasi-parallel and parallel upstream magnetic fields. However, little wave activity is found for the opposite case with a perpendicular upstream magnetic field. The modeling does not show a formation of the Mach cone in the case of θ_B,U ≈ 0°.

  15. Interaction of a finite-length ion beam with a background plasma - Reflected ions at the quasi-parallel bow shock

    NASA Technical Reports Server (NTRS)

    Onsager, T. G.; Winske, D.; Thomsen, M. F.

    1991-01-01

    The coupling of a finite-length, field-aligned ion beam with a uniform background plasma is investigated using one-dimensional hybrid computer simulations. The finite-length beam is used to study the interaction between the incident solar wind and ions reflected from the earth's quasi-parallel bow shock, where the reflection process may vary with time. The coupling between the reflected ions and the solar wind is relevant to ion heating at the bow shock and possibly to the formation of hot flow anomalies and the re-formation of the shock itself. Consistent with linear theory, the waves which dominate the interaction are the electromagnetic right-hand polarized resonant and nonresonant modes. However, in addition to the instability growth rates, the length of time that the waves are in contact with the beam is also an important factor in determining which wave mode will dominate the interaction. It is found that the interaction will result in strong coupling, where a significant fraction of the available free energy is converted into thermal energy in a short time, provided the beam is sufficiently dense or sufficiently long.

  16. Wind tunnel tests of a two bladed model rotor to evaluate the TAMI system in descending forward flight

    NASA Technical Reports Server (NTRS)

    White, R. P., Jr.

    1977-01-01

    A research investigation was conducted to assess the potential of the Tip Air Mass Injection system in reducing the noise output during blade vortex interaction in descending low speed flight. In general, it was concluded that the noise output due to blade vortex interaction can be reduced by 4 to 6 dB with an equivalent power expenditure of approximately 14 percent of installed power.

  17. Parallel quicksort

    SciTech Connect

    Vrto, I.; Chelbus, B.S.

    1991-04-01

    This paper reports on the development of a parallel version of quicksort on a CRCW PRAM. The algorithm uses n processors and linear space to sort n keys in expected time O(log n) with high probability.
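
    The CRCW PRAM algorithm assumes shared-memory processors with constant-time concurrent access; as a practical stand-in, the sketch below samples pivots, partitions the keys into buckets (the quicksort step), and sorts the buckets concurrently with Python processes. It illustrates the flavour of parallel quicksort, not the paper's algorithm or its O(log n) bound.

```python
from concurrent.futures import ProcessPoolExecutor
import random

def parallel_quicksort(keys, nworkers=4):
    """Sample pivots, partition into nworkers buckets, sort buckets concurrently."""
    pivots = sorted(random.sample(keys, nworkers - 1)) if nworkers > 1 else []
    buckets = [[] for _ in range(nworkers)]
    for k in keys:
        lo, hi = 0, len(pivots)
        while lo < hi:                      # binary search for the target bucket
            mid = (lo + hi) // 2
            if k <= pivots[mid]:
                hi = mid
            else:
                lo = mid + 1
        buckets[lo].append(k)
    with ProcessPoolExecutor(max_workers=nworkers) as pool:
        return [x for chunk in pool.map(sorted, buckets) for x in chunk]

if __name__ == "__main__":
    data = [random.randint(0, 10**6) for _ in range(100000)]
    assert parallel_quicksort(data) == sorted(data)
    print("sorted", len(data), "keys")
```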

  18. Parallel machines: Parallel machine languages

    SciTech Connect

    Iannucci, R.A.

    1990-01-01

    This book presents a framework for understanding the tradeoffs between the conventional view and the dataflow view, with the objective of discovering the critical hardware structures which must be present in any scalable, general-purpose parallel computer to effectively tolerate latency and synchronization costs. The author presents an approach to scalable general-purpose parallel computation. Linguistic concerns, compiling issues, intermediate language issues, and hardware/technological constraints are presented as a combined approach to architectural development. The notion of a parallel machine language is also presented.

  19. Massively parallel multiple interacting continua formulation for modeling flow in fractured porous media using the subsurface reactive flow and transport code PFLOTRAN

    NASA Astrophysics Data System (ADS)

    Kumar, J.; Mills, R. T.; Lichtner, P. C.; Hammond, G. E.

    2010-12-01

    Fracture-dominated flows occur in numerous subsurface geochemical processes and at many different scales in rock pore structures, micro-fractures, fracture networks and faults. Fractured porous media can be modeled as multiple interacting continua which are connected to each other through transfer terms that capture the flow of mass and energy in response to pressure, temperature and concentration gradients. However, the analysis of large-scale transient problems using the multiple interacting continuum approach presents an algorithmic and computational challenge for problems with very large numbers of degrees of freedom. A generalized dual porosity model based on the Dual Continuum Disconnected Matrix approach has been implemented within PFLOTRAN, a massively parallel multiphysics-multicomponent-multiphase subsurface reactive flow and transport code. Developed as part of the Department of Energy's SciDAC-2 program, PFLOTRAN provides subsurface simulation capabilities that can scale from laptops to ultrascale supercomputers, and utilizes the PETSc framework to solve the large, sparse algebraic systems that arise in complex subsurface reactive flow and transport problems. It has been successfully applied to the solution of problems composed of more than two billion degrees of freedom, utilizing up to 131,072 processor cores on Jaguar, the Cray XT5 system at Oak Ridge National Laboratory that is the world's fastest supercomputer. Building upon the capabilities and computational efficiency of PFLOTRAN, we will present an implementation of the multiple interacting continua formulation for fractured porous media along with an application case study.
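
    The essence of a dual-continuum (fracture/matrix) formulation is a transfer term driven by the difference in state between the continua. The toy explicit integration below illustrates that coupling for a single cell; the storativities and transfer coefficient are invented, and PFLOTRAN itself solves the full implicit, multiphase, multicomponent problem.

```python
# Toy dual-continuum pressure evolution with an inter-continuum transfer term:
#   S_f * dp_f/dt = +q,   S_m * dp_m/dt = -q,   q = T * (p_m - p_f)
S_f, S_m = 1e-6, 1e-4        # storativities: fracture small, matrix large (placeholders)
T = 1e-8                     # transfer coefficient (placeholder)
p_f, p_m = 1.0e6, 2.0e6      # initial pressures (Pa)
dt = 10.0                    # s

for step in range(10000):
    q = T * (p_m - p_f)      # mass transfer driven by the pressure difference
    p_f += dt * q / S_f
    p_m -= dt * q / S_m

print("equilibrated pressures:", p_f, p_m)
```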

  20. A 3D GPU-accelerated MPI-parallel computational tool for simulating interaction of moving rigid bodies with two-fluid flows

    NASA Astrophysics Data System (ADS)

    Pathak, Ashish; Raessi, Mehdi

    2014-11-01

    We present a 3D MPI-parallel, GPU-accelerated computational tool that captures the interaction between a moving rigid body and two-fluid flows. Although the immediate application is the study of ocean wave energy converters (WECs), the model was developed at a general level and can be used in other applications. Solving the full Navier-Stokes equations, the model is able to capture non-linear effects, including wave-breaking and fluid-structure interaction, that have significant impact on WEC performance. To transport mass and momentum, we use a consistent scheme that can handle large density ratios (e.g. air/water). We present a novel reconstruction scheme for resolving three-phase (solid-liquid-gas) cells in the volume-of-fluid context, where the fluid interface orientation is estimated via a minimization procedure, while imposing a contact angle. The reconstruction allows for accurate mass and momentum transport in the vicinity of three-phase cells. The fast-fictitious-domain method is used for capturing the interaction between a moving rigid body and two-fluid flow. The pressure Poisson solver is accelerated using GPUs in the MPI framework. We present results of an array of test cases devised to assess the performance and accuracy of the computational tool.
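
    The pressure Poisson solve mentioned above is a natural target for GPU acceleration because its update is a local stencil applied independently at every cell. The NumPy sketch below shows plain Jacobi sweeps on a small 3D grid purely to illustrate that stencil; the actual tool presumably uses a far more efficient solver and CUDA kernels, so treat this only as a schematic.

```python
import numpy as np

# Jacobi sweeps for  lap(p) = rhs  on a small 3-D grid, homogeneous Dirichlet walls.
n, dx = 32, 1.0 / 32
rhs = np.zeros((n, n, n))
rhs[n // 2, n // 2, n // 2] = 1.0 / dx**3     # point source for illustration
p = np.zeros((n, n, n))

for sweep in range(500):
    p_new = p.copy()
    p_new[1:-1, 1:-1, 1:-1] = (
        p[2:, 1:-1, 1:-1] + p[:-2, 1:-1, 1:-1] +
        p[1:-1, 2:, 1:-1] + p[1:-1, :-2, 1:-1] +
        p[1:-1, 1:-1, 2:] + p[1:-1, 1:-1, :-2] -
        dx**2 * rhs[1:-1, 1:-1, 1:-1]) / 6.0   # per-cell update: one GPU thread each
    p = p_new

print("pressure at the source cell:", p[n // 2, n // 2, n // 2])
```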

  1. Interactive Visualization of Large-Scale Hydrological Data using Emerging Technologies in Web Systems and Parallel Programming

    NASA Astrophysics Data System (ADS)

    Demir, I.; Krajewski, W. F.

    2013-12-01

    As geoscientists are confronted with increasingly massive datasets from environmental observations to simulations, one of the biggest challenges is having the right tools to gain scientific insight from the data and communicate the understanding to stakeholders. Recent developments in web technologies make it easy to manage, visualize and share large data sets with the general public. Novel visualization techniques and dynamic user interfaces allow users to interact with data, and modify the parameters to create custom views of the data to gain insight from simulations and environmental observations. This requires developing new data models and intelligent knowledge discovery techniques to explore and extract information from complex computational simulations or large data repositories. Scientific visualization will be an increasingly important component in building comprehensive environmental information platforms. This presentation provides an overview of the trends and challenges in the field of scientific visualization, and demonstrates information visualization and communication tools developed in the light of these challenges.

  2. Parallel pipelining

    SciTech Connect

    Joseph, D.D.; Bai, R.; Liao, T.Y.; Huang, A.; Hu, H.H.

    1995-09-01

    In this paper the authors introduce the idea of parallel pipelining for water lubricated transportation of oil (or other viscous material). A parallel system can have major advantages over a single pipe with respect to the cost of maintenance and continuous operation of the system, to the pressure gradients required to restart a stopped system and to the reduction and even elimination of the fouling of pipe walls in continuous operation. The authors show that the action of capillarity in small pipes is more favorable for restart than in large pipes. In a parallel pipeline system, they estimate the number of small pipes needed to deliver the same oil flux as in one larger pipe as N = (R/r)^α, where r and R are the radii of the small and large pipes, respectively, and α = 4 or 19/7 when the lubricating water flow is laminar or turbulent.
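
    A quick worked example of the pipe-count estimate N = (R/r)^α, for a few illustrative radius ratios (the ratios themselves are arbitrary):

```python
# number of small pipes needed to match the oil flux of one large pipe
for R_over_r in (2, 5, 10):
    n_laminar = R_over_r ** 4            # alpha = 4 (laminar lubricating water flow)
    n_turbulent = R_over_r ** (19 / 7)   # alpha = 19/7 (turbulent lubricating water flow)
    print(f"R/r = {R_over_r:2d}: laminar N = {n_laminar:6d}, turbulent N = {n_turbulent:7.1f}")
```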

  3. Activity and interactions of methane seep microorganisms assessed by parallel transcription and FISH-NanoSIMS analyses

    PubMed Central

    Dekas, Anne E; Connon, Stephanie A; Chadwick, Grayson L; Trembath-Reichert, Elizabeth; Orphan, Victoria J

    2016-01-01

    To characterize the activity and interactions of methanotrophic archaea (ANME) and Deltaproteobacteria at a methane-seeping mud volcano, we used two complementary measures of microbial activity: a community-level analysis of the transcription of four genes (16S rRNA, methyl coenzyme M reductase A (mcrA), adenosine-5′-phosphosulfate reductase α-subunit (aprA), dinitrogenase reductase (nifH)), and a single-cell-level analysis of anabolic activity using fluorescence in situ hybridization coupled to nanoscale secondary ion mass spectrometry (FISH-NanoSIMS). Transcript analysis revealed that members of the deltaproteobacterial groups Desulfosarcina/Desulfococcus (DSS) and Desulfobulbaceae (DSB) exhibit increased rRNA expression in incubations with methane, suggestive of ANME-coupled activity. Direct analysis of anabolic activity in DSS cells in consortia with ANME by FISH-NanoSIMS confirmed their dependence on methanotrophy, with no 15NH4+ assimilation detected without methane. In contrast, DSS and DSB cells found physically independent of ANME (i.e., single cells) were anabolically active in incubations both with and without methane. These single cells therefore comprise an active 'free-living' population, and are not dependent on methane or ANME activity. We investigated the possibility of N2 fixation by seep Deltaproteobacteria and detected nifH transcripts closely related to those of cultured diazotrophic Deltaproteobacteria. However, nifH expression was methane-dependent. 15N2 incorporation was not observed in single DSS cells, but was detected in single DSB cells. Interestingly, 15N2 incorporation in single DSB cells was methane-dependent, raising the possibility that DSB cells acquired reduced 15N products from diazotrophic ANME while spatially coupled, and then subsequently dissociated. With this combined data set we address several outstanding questions in methane seep microbial ecosystems and highlight the benefit of measuring microbial activity in the context of spatial associations. PMID:26394007

  4. Activity and interactions of methane seep microorganisms assessed by parallel transcription and FISH-NanoSIMS analyses.

    PubMed

    Dekas, Anne E; Connon, Stephanie A; Chadwick, Grayson L; Trembath-Reichert, Elizabeth; Orphan, Victoria J

    2016-03-01

    To characterize the activity and interactions of methanotrophic archaea (ANME) and Deltaproteobacteria at a methane-seeping mud volcano, we used two complementary measures of microbial activity: a community-level analysis of the transcription of four genes (16S rRNA, methyl coenzyme M reductase A (mcrA), adenosine-5'-phosphosulfate reductase α-subunit (aprA), dinitrogenase reductase (nifH)), and a single-cell-level analysis of anabolic activity using fluorescence in situ hybridization coupled to nanoscale secondary ion mass spectrometry (FISH-NanoSIMS). Transcript analysis revealed that members of the deltaproteobacterial groups Desulfosarcina/Desulfococcus (DSS) and Desulfobulbaceae (DSB) exhibit increased rRNA expression in incubations with methane, suggestive of ANME-coupled activity. Direct analysis of anabolic activity in DSS cells in consortia with ANME by FISH-NanoSIMS confirmed their dependence on methanotrophy, with no (15)NH4(+) assimilation detected without methane. In contrast, DSS and DSB cells found physically independent of ANME (i.e., single cells) were anabolically active in incubations both with and without methane. These single cells therefore comprise an active 'free-living' population, and are not dependent on methane or ANME activity. We investigated the possibility of N2 fixation by seep Deltaproteobacteria and detected nifH transcripts closely related to those of cultured diazotrophic Deltaproteobacteria. However, nifH expression was methane-dependent. (15)N2 incorporation was not observed in single DSS cells, but was detected in single DSB cells. Interestingly, (15)N2 incorporation in single DSB cells was methane-dependent, raising the possibility that DSB cells acquired reduced (15)N products from diazotrophic ANME while spatially coupled, and then subsequently dissociated. With this combined data set we address several outstanding questions in methane seep microbial ecosystems and highlight the benefit of measuring microbial activity in the context of spatial associations. PMID:26394007

  5. Parallel Dislocation Simulator

    Energy Science and Technology Software Center (ESTSC)

    2006-10-30

    ParaDiS is software capable of simulating the motion, evolution, and interaction of dislocation networks in single crystals using massively parallel computer architectures. The software is capable of outputting the stress-strain response of a single crystal whose plastic deformation is controlled by the dislocation processes.

  6. Aeroacoustic theory for noncompact wing-gust interaction

    NASA Technical Reports Server (NTRS)

    Martinez, R.; Widnall, S. E.

    1981-01-01

    Three aeroacoustic models for noncompact wing-gust interaction were developed for subsonic flow. The first is that for a two dimensional (infinite span) wing passing through an oblique gust. The unsteady pressure field was obtained by the Wiener-Hopf technique; the airfoil loading and the associated acoustic field were calculated, respectively, by bringing the field point down onto the airfoil surface, or by letting it go to infinity. The second model is a simple spanwise superposition of two dimensional solutions to account for three dimensional acoustic effects of wing rotation (for a helicopter blade, or some other rotating planform) and of finiteness of wing span. A three dimensional theory for a single gust was applied to calculate the acoustic signature in closed form due to blade vortex interaction in helicopters. The third model is that of a quarter infinite plate with side edge through a gust at high subsonic speed. An approximate solution for the three dimensional loading and the associated three dimensional acoustic field in closed form was obtained. The results reflected the acoustic effect of satisfying the correct loading condition at the side edge.

  7. Non-equilibrium reaction and relaxation dynamics in a strongly interacting explicit solvent: F + CD3CN treated with a parallel multi-state EVB model.

    PubMed

    Glowacki, David R; Orr-Ewing, Andrew J; Harvey, Jeremy N

    2015-07-28

    We describe a parallelized linear-scaling computational framework developed to implement arbitrarily large multi-state empirical valence bond (MS-EVB) calculations within CHARMM and TINKER. Forces are obtained using the Hellmann-Feynman relationship, giving continuous gradients, and good energy conservation. Utilizing multi-dimensional Gaussian coupling elements fit to explicitly correlated coupled cluster theory, we built a 64-state MS-EVB model designed to study the F + CD3CN → DF + CD2CN reaction in CD3CN solvent (recently reported in Dunning et al. [Science 347(6221), 530 (2015)]). This approach allows us to build a reactive potential energy surface whose balanced accuracy and efficiency considerably surpass what we could achieve otherwise. We ran molecular dynamics simulations to examine a range of observables which follow in the wake of the reactive event: energy deposition in the nascent reaction products, vibrational relaxation rates of excited DF in CD3CN solvent, equilibrium power spectra of DF in CD3CN, and time dependent spectral shifts associated with relaxation of the nascent DF. Many of our results are in good agreement with time-resolved experimental observations, providing evidence for the accuracy of our MS-EVB framework in treating both the solute and solute/solvent interactions. The simulations provide additional insight into the dynamics at sub-picosecond time scales that are difficult to resolve experimentally. In particular, the simulations show that (immediately following deuterium abstraction) the nascent DF finds itself in a non-equilibrium regime in two different respects: (1) it is highly vibrationally excited, with ∼23 kcal mol(-1) localized in the stretch and (2) its post-reaction solvation environment, in which it is not yet hydrogen-bonded to CD3CN solvent molecules, is intermediate between the non-interacting gas-phase limit and the solution-phase equilibrium limit. Vibrational relaxation of the nascent DF results in a spectral blue shift, while relaxation of the post-reaction solvation environment results in a red shift. These two competing effects mean that the post-reaction relaxation profile is distinct from what is observed when Franck-Condon vibrational excitation of DF occurs within a microsolvation environment initially at equilibrium. Our conclusions, along with the theoretical and parallel software framework presented in this paper, should be more broadly applicable to a range of complex reactive systems. PMID:26233120
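
    The core EVB machinery, diagonalising a small diabatic Hamiltonian and taking Hellmann-Feynman forces so the gradient stays continuous, can be sketched with a toy two-state, one-dimensional model. The diabatic potentials and Gaussian coupling below are invented placeholders, not the 64-state surfaces fit in the paper.

```python
import numpy as np

def diabats(x):
    """Two toy diabatic states and a Gaussian coupling, with analytic x-derivatives."""
    V1, dV1 = 0.5 * (x + 1.0) ** 2, (x + 1.0)              # reactant-like state
    V2, dV2 = 0.5 * (x - 1.0) ** 2 + 0.2, (x - 1.0)        # product-like state
    V12 = 0.3 * np.exp(-x ** 2)                            # Gaussian coupling element
    dV12 = -2.0 * x * 0.3 * np.exp(-x ** 2)
    return V1, V2, V12, dV1, dV2, dV12

def evb_energy_force(x):
    V1, V2, V12, dV1, dV2, dV12 = diabats(x)
    H = np.array([[V1, V12], [V12, V2]])
    dH = np.array([[dV1, dV12], [dV12, dV2]])
    w, v = np.linalg.eigh(H)
    psi = v[:, 0]                        # ground adiabatic state
    force = -psi @ dH @ psi              # Hellmann-Feynman force: -<psi|dH/dx|psi>
    return w[0], force

print(evb_energy_force(0.0))
```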

  8. Non-equilibrium reaction and relaxation dynamics in a strongly interacting explicit solvent: F + CD3CN treated with a parallel multi-state EVB model

    NASA Astrophysics Data System (ADS)

    Glowacki, David R.; Orr-Ewing, Andrew J.; Harvey, Jeremy N.

    2015-07-01

    We describe a parallelized linear-scaling computational framework developed to implement arbitrarily large multi-state empirical valence bond (MS-EVB) calculations within CHARMM and TINKER. Forces are obtained using the Hellmann-Feynman relationship, giving continuous gradients, and good energy conservation. Utilizing multi-dimensional Gaussian coupling elements fit to explicitly correlated coupled cluster theory, we built a 64-state MS-EVB model designed to study the F + CD3CN → DF + CD2CN reaction in CD3CN solvent (recently reported in Dunning et al. [Science 347(6221), 530 (2015)]). This approach allows us to build a reactive potential energy surface whose balanced accuracy and efficiency considerably surpass what we could achieve otherwise. We ran molecular dynamics simulations to examine a range of observables which follow in the wake of the reactive event: energy deposition in the nascent reaction products, vibrational relaxation rates of excited DF in CD3CN solvent, equilibrium power spectra of DF in CD3CN, and time dependent spectral shifts associated with relaxation of the nascent DF. Many of our results are in good agreement with time-resolved experimental observations, providing evidence for the accuracy of our MS-EVB framework in treating both the solute and solute/solvent interactions. The simulations provide additional insight into the dynamics at sub-picosecond time scales that are difficult to resolve experimentally. In particular, the simulations show that (immediately following deuterium abstraction) the nascent DF finds itself in a non-equilibrium regime in two different respects: (1) it is highly vibrationally excited, with ˜23 kcal mol-1 localized in the stretch and (2) its post-reaction solvation environment, in which it is not yet hydrogen-bonded to CD3CN solvent molecules, is intermediate between the non-interacting gas-phase limit and the solution-phase equilibrium limit. Vibrational relaxation of the nascent DF results in a spectral blue shift, while relaxation of the post-reaction solvation environment results in a red shift. These two competing effects mean that the post-reaction relaxation profile is distinct from what is observed when Franck-Condon vibrational excitation of DF occurs within a microsolvation environment initially at equilibrium. Our conclusions, along with the theoretical and parallel software framework presented in this paper, should be more broadly applicable to a range of complex reactive systems.

  9. The Double Hierarchy Method. A parallel 3D contact method for the interaction of spherical particles with rigid FE boundaries using the DEM

    NASA Astrophysics Data System (ADS)

    Santasusana, Miquel; Irazábal, Joaquín; Oñate, Eugenio; Carbonell, Josep Maria

    2016-04-01

    In this work, we present a new methodology for the treatment of the contact interaction between rigid boundaries and spherical discrete elements (DE). Rigid body parts are present in most large-scale simulations. The surfaces of the rigid parts are commonly meshed with a finite element-like (FE) discretization. The contact detection and calculation between those DEs and the discretized boundaries are not straightforward and have been addressed by different approaches. The algorithm presented in this paper considers the contact of the DEs with the geometric primitives of a FE mesh, i.e. facet, edge or vertex. To do so, the original hierarchical method presented by Horner et al. (J Eng Mech 127(10):1027-1032, 2001) is extended with a new insight leading to a robust, fast and accurate 3D contact algorithm which is fully parallelizable. The implementation of the method is designed to deal primarily with triangles and quadrilaterals; if the boundaries are discretized with other types of geometry, the method can easily be extended to higher-order planar convex polyhedra. A detailed description of the procedure followed to treat a wide range of cases is presented. The developed algorithm is validated with several practical examples. The parallelization capabilities and the obtained performance are presented with the study of an industrial application example.
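
    The facet/edge/vertex test at the heart of such algorithms reduces, for a sphere, to finding the closest point on a triangle. The sketch below uses the standard region-based closest-point test (a generic construction, not the authors' Double Hierarchy implementation) and reports whether a sphere touches the facet.

```python
import numpy as np

def closest_point_on_triangle(p, a, b, c):
    """Closest point on triangle abc to point p via region tests; the vertex,
    edge and face regions are handled in turn, mirroring a facet/edge/vertex
    contact hierarchy."""
    ab, ac, ap = b - a, c - a, p - a
    d1, d2 = ab @ ap, ac @ ap
    if d1 <= 0 and d2 <= 0:
        return a                                            # vertex a
    bp = p - b
    d3, d4 = ab @ bp, ac @ bp
    if d3 >= 0 and d4 <= d3:
        return b                                            # vertex b
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return a + ab * (d1 / (d1 - d3))                    # edge ab
    cp = p - c
    d5, d6 = ab @ cp, ac @ cp
    if d6 >= 0 and d5 <= d6:
        return c                                            # vertex c
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return a + ac * (d2 / (d2 - d6))                    # edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        return b + (c - b) * ((d4 - d3) / ((d4 - d3) + (d5 - d6)))  # edge bc
    denom = va + vb + vc
    return a + ab * (vb / denom) + ac * (vc / denom)        # face interior

def sphere_triangle_contact(centre, radius, a, b, c):
    q = closest_point_on_triangle(centre, a, b, c)
    d = np.linalg.norm(centre - q)
    return d <= radius, q, radius - d                       # flag, contact point, indentation

tri = [np.array(v, float) for v in ([0, 0, 0], [1, 0, 0], [0, 1, 0])]
print(sphere_triangle_contact(np.array([0.2, 0.2, 0.3]), 0.35, *tri))
```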

  10. Getting a feel for parameters: using interactive parallel plots as a tool for parameter identification in the new rainfall-runoff model WALRUS

    NASA Astrophysics Data System (ADS)

    Brauer, Claudia; Torfs, Paul; Teuling, Ryan; Uijlenhoet, Remko

    2015-04-01

    Recently, we developed the Wageningen Lowland Runoff Simulator (WALRUS) to fill the gap between complex, spatially distributed models often used in lowland catchments and simple, parametric models which have mostly been developed for mountainous catchments (Brauer et al., 2014ab). This parametric rainfall-runoff model can be used all over the world in both freely draining lowland catchments and polders with controlled water levels. The open source model code is implemented in R and can be downloaded from www.github.com/ClaudiaBrauer/WALRUS. The structure and code of WALRUS are simple, which facilitates detailed investigation of the effect of parameters on all model variables. WALRUS contains only four parameters requiring calibration; they are intended to have a strong, qualitative relation with catchment characteristics. Parameter estimation remains a challenge, however. The model structure contains three main feedbacks: (1) between groundwater and surface water; (2) between saturated and unsaturated zone; (3) between catchment wetness and (quick/slow) flowroute division. These feedbacks represent essential rainfall-runoff processes in lowland catchments, but increase the risk of parameter dependence and equifinality. Therefore, model performance should not only be judged based on a comparison between modelled and observed discharges, but also based on the plausibility of the internal modelled variables. Here, we present a method to analyse the effect of parameter values on internal model states and fluxes in a qualitative and intuitive way using interactive parallel plotting. We applied WALRUS to ten Dutch catchments with different sizes, slopes and soil types and both freely draining and polder areas. The model was run with a large number of parameter sets, which were created using Latin Hypercube Sampling. The model output was characterised in terms of several signatures, both measures of goodness of fit and statistics of internal model variables (such as the percentage of rain water travelling through the quickflow reservoir). End users can then eliminate parameter combinations with unrealistic outcomes based on expert knowledge using interactive parallel plots. In these plots, for instance, ranges can be selected for each signature and only model runs which yield signature values in these ranges are highlighted. The resulting selection of realistic parameter sets can be used for ensemble simulations. C.C. Brauer, A.J. Teuling, P.J.J.F. Torfs, R. Uijlenhoet (2014a): The Wageningen Lowland Runoff Simulator (WALRUS): a lumped rainfall-runoff model for catchments with shallow groundwater, Geoscientific Model Development, 7, 2313-2332, www.geosci-model-dev.net/7/2313/2014/gmd-7-2313-2014.pdf C.C. Brauer, P.J.J.F. Torfs, A.J. Teuling, R. Uijlenhoet (2014b): The Wageningen Lowland Runoff Simulator (WALRUS): application to the Hupsel Brook catchment and Cabauw polder, Hydrology and Earth System Sciences, 18, 4007-4028, www.hydrol-earth-syst-sci.net/18/4007/2014/hess-18-4007-2014.pdf
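
    A static approximation of such an interactive parallel plot can be produced with pandas, colouring parameter sets by whether their signatures fall inside user-chosen ranges. The parameter names, ranges and signatures below are synthetic placeholders rather than actual WALRUS output, and the interactive brushing is mimicked by a hard filter.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

rng = np.random.default_rng(0)
n = 200
# hypothetical parameter ranges and synthetic signatures (placeholders only)
df = pd.DataFrame({"cW": rng.uniform(100, 400, n),
                   "cV": rng.uniform(0.1, 4.0, n),
                   "cG": rng.uniform(1e6, 5e7, n),
                   "cQ": rng.uniform(1, 150, n),
                   "NSE": 1 - rng.random(n),             # goodness-of-fit signature
                   "quickflow_frac": rng.random(n)})     # internal-state signature

# interactive range selection mimicked by a hard filter on the signatures
keep = (df["NSE"] > 0.6) & (df["quickflow_frac"] < 0.5)

norm = (df - df.min()) / (df.max() - df.min())            # common 0-1 axis scaling
norm["class"] = np.where(keep, "behavioural", "rejected")
parallel_coordinates(norm, "class", color=["#2166ac", "#cccccc"], alpha=0.5)
plt.ylabel("normalised value")
plt.show()
```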

  11. Development of a prototype PET scanner with depth-of-interaction measurement using solid-state photomultiplier arrays and parallel readout electronics

    PubMed Central

    Shao, Yiping; Sun, Xishan; Lan, Kejian A.; Bircher, Chad; Lou, Kai; Deng, Zhi

    2014-01-01

    In this study, we developed a prototype animal PET by applying several novel technologies to use the solid-state photomultiplier (SSPM) arrays for measuring the depth-of-interaction (DOI) and improving imaging performance. Each PET detector has an 8×8 array of about 1.9×1.9×30.0 mm3 lutetium-yttrium-oxyorthosilicate (LYSO) scintillators, with each end optically connected to a SSPM array (16-channel in a 4×4 matrix) through a light guide to enable continuous DOI measurement. Each SSPM has an active area of about 3×3 mm2, and its output is read by a custom-developed application-specific-integrated-circuit (ASIC) to directly convert analog signals to digital timing pulses that encode the interaction information. These pulses are transferred to and be decoded by a field-programmable-gate-array (FPGA) based time-to-digital convertor for coincident event selection and data acquisition. The independent readout of each SSPM and the parallel signal process can significantly improve the signal-to-noise ratio and enable using flexible algorithms for different data processes. The prototype PET consists of two rotating detector panels on a portable gantry with four detectors in each panel to provide 16 mm axial and variable transaxial field-of-view (FOV) sizes. List-mode ordered-subset-expectation-maximization image reconstruction was implemented. The measured mean energy, coincidence timing, and DOI resolution for a crystal were about 17.6%, 2.8 ns, and 5.6 mm, respectively. The measured transaxial resolutions at the center of the FOV were 2.0 mm and 2.3 mm for images reconstructed with and without DOI, respectively. In addition, the resolutions across the FOV with DOI were substantially better than those without DOI. The quality of PET images of both a hot-rod phantom and mouse acquired with DOI was much higher than that of images obtained without DOI. This study demonstrates that SSPM arrays and advanced readout/processing electronics can be used to develop a practical DOI-measureable PET scanner. PMID:24556629

  12. Development of a prototype PET scanner with depth-of-interaction measurement using solid-state photomultiplier arrays and parallel readout electronics

    NASA Astrophysics Data System (ADS)

    Shao, Yiping; Sun, Xishan; Lan, Kejian A.; Bircher, Chad; Lou, Kai; Deng, Zhi

    2014-03-01

    In this study, we developed a prototype animal PET by applying several novel technologies to use solid-state photomultiplier (SSPM) arrays to measure the depth of interaction (DOI) and improve imaging performance. Each PET detector has an 8 × 8 array of about 1.9 × 1.9 × 30.0 mm3 lutetium-yttrium-oxyorthosilicate scintillators, with each end optically connected to an SSPM array (16 channels in a 4 × 4 matrix) through a light guide to enable continuous DOI measurement. Each SSPM has an active area of about 3 × 3 mm2, and its output is read by a custom-developed application-specific integrated circuit to directly convert analogue signals to digital timing pulses that encode the interaction information. These pulses are transferred to and are decoded by a field-programmable gate array-based time-to-digital convertor for coincident event selection and data acquisition. The independent readout of each SSPM and the parallel signal process can significantly improve the signal-to-noise ratio and enable the use of flexible algorithms for different data processes. The prototype PET consists of two rotating detector panels on a portable gantry with four detectors in each panel to provide 16 mm axial and variable transaxial field-of-view (FOV) sizes. List-mode ordered subset expectation maximization image reconstruction was implemented. The measured mean energy, coincidence timing and DOI resolution for a crystal were about 17.6%, 2.8 ns and 5.6 mm, respectively. The measured transaxial resolutions at the center of the FOV were 2.0 mm and 2.3 mm for images reconstructed with and without DOI, respectively. In addition, the resolutions across the FOV with DOI were substantially better than those without DOI. The quality of PET images of both a hot-rod phantom and mouse acquired with DOI was much higher than that of images obtained without DOI. This study demonstrates that SSPM arrays and advanced readout/processing electronics can be used to develop a practical DOI-measureable PET scanner.
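
    With dual-ended readout, a common way to obtain a continuous DOI estimate is from the ratio of the light collected at the two crystal ends. The sketch below inverts a simple exponential light-sharing model; the attenuation constant and calibration are hypothetical and are not taken from the paper.

```python
import numpy as np

L, mu = 30.0, 0.04                       # crystal length (mm), effective attenuation (1/mm)

def doi_from_ratio(s_top, s_bottom):
    """Invert r(z) = exp(-mu z) / (exp(-mu z) + exp(-mu (L - z))) for the depth z."""
    r = s_top / (s_top + s_bottom)       # normalised end-to-end light ratio in (0, 1)
    return (L - np.log(r / (1 - r)) / mu) / 2.0

# simulate an event at depth z = 10 mm and recover it
z_true, n_photons = 10.0, 4000
s_top = n_photons * np.exp(-mu * z_true)
s_bottom = n_photons * np.exp(-mu * (L - z_true))
print("estimated DOI (mm):", doi_from_ratio(s_top, s_bottom))
```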

  13. Research investigation of helicopter main rotor/tail rotor interaction noise

    NASA Technical Reports Server (NTRS)

    Fitzgerald, J.; Kohlhepp, F.

    1988-01-01

    Acoustic measurements were obtained in a Langley 14 x 22 foot Subsonic Wind Tunnel to study the aeroacoustic interaction of 1/5th scale main rotor, tail rotor, and fuselage models. An extensive aeroacoustic data base was acquired for main rotor, tail rotor, fuselage aerodynamic interaction for moderate forward speed flight conditions. The details of the rotor models, experimental design and procedure, aerodynamic and acoustic data acquisition and reduction are presented. The model was initially operated in trim for selected fuselage angle of attack, main rotor tip-path-plane angle, and main rotor thrust combinations. The effects of repositioning the tail rotor in the main rotor wake and the corresponding tail rotor countertorque requirements were determined. Each rotor was subsequently tested in isolation at the thrust and angle of attack combinations for trim. The acoustic data indicated that the noise was primarily dominated by the main rotor, especially for moderate speed main rotor blade-vortex interaction conditions. The tail rotor noise increased when the main rotor was removed indicating that tail rotor inflow was improved with the main rotor present.

  14. A Fast Parallel Simulation Code for Interaction between Proto-Planetary Disk and Embedded Proto-Planets: Implementation for 3D Code

    SciTech Connect

    Li, Shengtai; Li, Hui

    2012-06-14

    We develop a 3D simulation code for interaction between the proto-planetary disk and embedded proto-planets. The protoplanetary disk is treated as a three-dimensional (3D), self-gravitating gas whose motion is described by the locally isothermal Navier-Stokes equations in a spherical coordinate system centered on the star. The differential equations for the disk are similar to those given in Kley et al. (2009) with a different gravitational potential that is defined in Nelson et al. (2000). The equations are solved by a directionally split Godunov method for the inviscid Euler equations plus an operator-split method for the viscous source terms. We use a sub-cycling technique for the azimuthal sweep to alleviate the time step restriction. We also extend the FARGO scheme of Masset (2000), as modified in Li et al. (2001), to our 3D code to accelerate the transport in the azimuthal direction. Furthermore, we have implemented a reduced 2D (r, θ) and a fully 3D self-gravity solver on our uniform disk grid, which extends our 2D method (Li, Buoni, & Li 2008) to 3D. This solver uses a mode cut-off strategy and combines FFT in the azimuthal direction and direct summation in the radial and meridional direction. An initial axisymmetric equilibrium disk is generated via iteration between the disk density profile and the 2D disk self-gravity. We do not need any softening in the disk self-gravity calculation as we have used a shifted grid method (Li et al. 2008) to calculate the potential. The motion of the planet is restricted to the mid-plane and the equations are the same as given in D'Angelo et al. (2005), which we adapted to polar coordinates with a fourth-order Runge-Kutta solver. The disk gravitational force on the planet is assumed to evolve linearly with time between two hydrodynamics time steps. The planetary potential acting on the disk is calculated accurately with a small softening given by a cubic-spline form (Kley et al. 2009). Since the torque is extremely sensitive to the position of the planet, we adopt the corotating frame, which allows the planet to move only in the radial direction if only one planet is present. This code has been extensively tested on a number of problems. For an Earth-mass planet with constant aspect ratio h = 0.05, the torque calculated using our code matches quite well with the 3D linear theory results by Tanaka et al. (2002). The code is fully parallelized via message-passing interface (MPI) and has very high parallel efficiency. Several numerical examples for both fixed planet and moving planet are provided to demonstrate the efficacy of the numerical method and code.
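
    As a minimal illustration of the fourth-order Runge-Kutta update used for the planet's mid-plane motion, the sketch below integrates one circular orbit around the star only (the disk force, which the actual code evaluates between hydro steps, is omitted); units and step sizes are arbitrary code units.

```python
import numpy as np

GM = 1.0                                             # G * M_star in code units

def accel(r):
    """Gravitational acceleration of the central star only (disk force omitted)."""
    return -GM * r / np.linalg.norm(r) ** 3

def rk4_step(r, v, dt):
    k1r, k1v = v, accel(r)
    k2r, k2v = v + 0.5 * dt * k1v, accel(r + 0.5 * dt * k1r)
    k3r, k3v = v + 0.5 * dt * k2v, accel(r + 0.5 * dt * k2r)
    k4r, k4v = v + dt * k3v, accel(r + dt * k3r)
    r_new = r + dt / 6.0 * (k1r + 2 * k2r + 2 * k3r + k4r)
    v_new = v + dt / 6.0 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return r_new, v_new

r, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])    # circular orbit at a = 1
dt, nsteps = 2 * np.pi / 2000, 2000                  # one orbital period
for _ in range(nsteps):
    r, v = rk4_step(r, v, dt)
print("radius after one orbit:", np.linalg.norm(r))
```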

  15. Detached-eddy simulation of flow non-linearity of fluid-structural interactions using high order schemes and parallel computation

    NASA Astrophysics Data System (ADS)

    Wang, Baoyuan

    The objective of this research is to develop an efficient and accurate methodology to resolve flow non-linearity of fluid-structural interaction. To achieve this purpose, a numerical strategy to apply the detached-eddy simulation (DES) with a fully coupled fluid-structural interaction model is established for the first time. The following novel numerical algorithms are also created: a general sub-domain boundary mapping procedure for parallel computation to reduce wall clock simulation time, an efficient and low diffusion E-CUSP (LDE) scheme used as a Riemann solver to resolve discontinuities with minimal numerical dissipation, and an implicit high order accuracy weighted essentially non-oscillatory (WENO) scheme to capture shock waves. The Detached-Eddy Simulation is based on the model proposed by Spalart in 1997. Near solid walls within wall boundary layers, the Reynolds averaged Navier-Stokes (RANS) equations are solved. Outside of the wall boundary layers, the 3D filtered compressible Navier-Stokes equations are solved based on large eddy simulation (LES). The Spalart-Allmaras one equation turbulence model is solved to provide the Reynolds stresses in the RANS region and the subgrid scale stresses in the LES region. An improved 5th order finite differencing weighted essentially non-oscillatory (WENO) scheme with an optimized epsilon value is employed for the inviscid fluxes. The new LDE scheme used with the WENO scheme is able to capture crisp shock profiles and exact contact surfaces. A set of fully conservative 4th order finite central differencing schemes are used for the viscous terms. The 3D Navier-Stokes equations are discretized based on a conservative finite differencing scheme. The unfactored line Gauss-Seidel relaxation iteration is employed for time marching. A general sub-domain boundary mapping procedure is developed for arbitrary topology multi-block structured grids with grid points matched on sub-domain boundaries. Extensive numerical experiments are conducted to test the performance of the numerical algorithms. The RANS simulation with the Spalart-Allmaras one equation turbulence model is the foundation for DES and is hence validated with other transonic flows. The predicted results agree very well with the experiments. The RANS code is then further used to study the slot size effect of a co-flow jet (CFJ) airfoil. The DES solver with fully coupled fluid-structural interaction methodology is validated with vortex induced vibration of a cylinder and a transonic forced pitching airfoil. For the cylinder, the laminar Navier-Stokes equations are solved due to the low Reynolds number. The 3D effects are observed in both stationary and oscillating cylinder simulations because of the flow separations behind the cylinder. For the transonic forced pitching airfoil DES computation, there is no flow separation in the flow field. The DES results agree well with the RANS results. These two cases indicate that the DES is more effective at predicting flow separation. The DES code is used to simulate the limit cycle oscillation (LCO) of the NLR7301 airfoil. For the cases computed in this research, the predicted LCO frequency, amplitudes, averaged lift and moment all agree very well with the experiment. The solutions appear to have bifurcation and are dependent on the initial perturbation. The developed methodology is able to capture the LCO with very small amplitudes measured in the experiment.
This is attributed to the high order low diffusion schemes, fully coupled FSI model, and the turbulence model used. This research appears to be the first time that a numerical simulation of LCO matches the experiment. The DES code is also used to simulate the CFJ airfoil jet mixing at high angle of attack. In conclusion, the numerical strategy of the high order DES with fully coupled FSI model and parallel computing developed in this research is demonstrated to have high accuracy, robustness, and efficiency. Future work to further maturate the methodology is suggested. (Abstract shortened by UMI.)
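
    For reference, the explicit 5th-order WENO reconstruction that such solvers build on can be written compactly as below, using the textbook Jiang-Shu smoothness indicators and eps = 1e-6; the thesis employs an implicit, optimised-epsilon, characteristic-wise variant, so this is only a baseline sketch.

```python
import numpy as np

def weno5_left(f):
    """Left-biased 5th-order WENO value at the i+1/2 interface from the five
    cell averages f = [f_{i-2}, f_{i-1}, f_i, f_{i+1}, f_{i+2}]."""
    eps = 1e-6
    fm2, fm1, f0, fp1, fp2 = f
    b0 = 13/12 * (fm2 - 2*fm1 + f0)**2 + 0.25 * (fm2 - 4*fm1 + 3*f0)**2
    b1 = 13/12 * (fm1 - 2*f0 + fp1)**2 + 0.25 * (fm1 - fp1)**2
    b2 = 13/12 * (f0 - 2*fp1 + fp2)**2 + 0.25 * (3*f0 - 4*fp1 + fp2)**2
    g = np.array([0.1, 0.6, 0.3])                     # ideal (linear) weights
    a = g / (eps + np.array([b0, b1, b2]))**2
    w = a / a.sum()                                   # nonlinear WENO weights
    p0 = ( 2*fm2 - 7*fm1 + 11*f0) / 6                 # candidate stencil reconstructions
    p1 = (  -fm1 + 5*f0  +  2*fp1) / 6
    p2 = ( 2*f0  + 5*fp1 -   fp2) / 6
    return w @ np.array([p0, p1, p2])

print(weno5_left(np.array([1.0, 1.0, 1.0, 1.0, 1.0])))  # smooth data: exact value
print(weno5_left(np.array([1.0, 1.0, 1.0, 0.0, 0.0])))  # jump: stencil across it is suppressed
```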

  16. Introduction to the POKER parallel programming environment

    SciTech Connect

    Snyder, L.

    1983-01-01

    The POKER parallel programming environment is a graphics-based, interactive system for programming the configurable, highly parallel (CHIP) computer. Designed to support nearly all aspects of parallel programming in one integrated system, POKER has been implemented as a (≈35,000-line) C program on the VAX 11/780 under UNIX. It provides a number of novel features including graphics programming of parallel processor communication. 4 references.

  17. Parallel pivoting combined with parallel reduction

    NASA Technical Reports Server (NTRS)

    Alaghband, Gita

    1987-01-01

    Parallel algorithms for the triangularization of large, sparse, and unsymmetric matrices are presented. The method combines parallel reduction with a new parallel pivoting technique, control over the generation of fill-in, and a check for numerical stability, all done in parallel with the work distributed over the active processes. The parallel technique uses the compatibility relation between pivots to identify parallel pivot candidates and uses the Markowitz number of pivots to minimize fill-in. This technique is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds.
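
    A sketch of the Markowitz criterion referred to above: for each admissible nonzero a_ij, the count (r_i - 1)(c_j - 1) estimates the fill-in the pivot would create, and candidates with small counts and acceptable magnitude are preferred. The stability threshold below is an arbitrary placeholder, and the compatibility analysis used to pick several pivots at once is not shown.

```python
import numpy as np

def markowitz_pivot(A, tol=1e-10, rel_threshold=0.1):
    """Return the (row, col) of the nonzero with the smallest Markowitz count
    (r_i - 1)*(c_j - 1) among entries passing a crude magnitude test."""
    mask = np.abs(A) > tol
    r = mask.sum(axis=1)                  # nonzeros per row
    c = mask.sum(axis=0)                  # nonzeros per column
    best, best_count = None, None
    for i, j in zip(*np.nonzero(mask)):
        if np.abs(A[i, j]) < rel_threshold * np.abs(A[i]).max():
            continue                      # reject numerically weak pivots
        count = (r[i] - 1) * (c[j] - 1)
        if best_count is None or count < best_count:
            best, best_count = (i, j), count
    return best, best_count

A = np.array([[4.0, 0.0, 0.0, 1.0],
              [0.0, 3.0, 2.0, 0.0],
              [1.0, 0.0, 5.0, 0.0],
              [0.0, 2.0, 0.0, 6.0]])
print(markowitz_pivot(A))
```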

  18. Special parallel processing workshop

    SciTech Connect

    1994-12-01

    This report contains viewgraphs from the Special Parallel Processing Workshop. These viewgraphs deal with topics such as parallel processing performance, message passing, queue structure, and other basic concepts dealing with parallel processing.

  19. The Vortex Lattice Method for the Rotor-Vortex Interaction Problem

    NASA Technical Reports Server (NTRS)

    Padakannaya, R.

    1974-01-01

    The rotor blade-vortex interaction problem and the resulting impulsive airloads which generate undesirable noise levels are discussed. A numerical lifting surface method to predict unsteady aerodynamic forces induced on a finite aspect ratio rectangular wing by a straight, free vortex placed at an arbitrary angle in a subsonic incompressible free stream is developed first. Using a rigid wake assumption, the wake vortices are assumed to move downstream with the free stream velocity. Unsteady load distributions are obtained which compare favorably with the results of planar lifting surface theory. The vortex lattice method has been extended to a single bladed rotor operating at high advance ratios and encountering a free vortex from a fixed wing upstream of the rotor. The predicted unsteady load distributions on the model rotor blade are generally in agreement with the experimental results. This method has also been extended to full scale rotor flight cases in which vortex induced loads near the tip of a rotor blade were indicated. In both the model and the full scale rotor blade airload calculations a flat planar wake was assumed, which is a good approximation at large advance ratios because the downwash is small in comparison to the free stream. The large fluctuations in the measured airloads near the tip of the rotor blade on the advancing side are predicted closely by the vortex lattice method.
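
    The core kernel of a vortex lattice method of this kind is the induced velocity of a straight vortex segment. The sketch below implements the standard Biot-Savart formula for a finite filament of circulation gamma running from point a to point b, evaluated at point p, with a simple cutoff to avoid the singularity on the filament axis; it is a generic textbook building block, not the code used in the cited study.

```python
import numpy as np

def segment_induced_velocity(p, a, b, gamma, cutoff=1e-10):
    """Velocity induced at point p by a straight vortex segment from a to b
    carrying circulation gamma (Biot-Savart law for a finite filament)."""
    r1 = p - a
    r2 = p - b
    r0 = b - a
    cross = np.cross(r1, r2)
    cross_sq = np.dot(cross, cross)
    if cross_sq < cutoff:          # evaluation point lies (nearly) on the filament axis
        return np.zeros(3)
    k = gamma / (4.0 * np.pi * cross_sq) * (
        np.dot(r0, r1) / np.linalg.norm(r1) - np.dot(r0, r2) / np.linalg.norm(r2))
    return k * cross

# Example: velocity at (0, 1, 0) induced by a unit-strength segment along the x axis
v = segment_induced_velocity(np.array([0.0, 1.0, 0.0]),
                             np.array([-1.0, 0.0, 0.0]),
                             np.array([ 1.0, 0.0, 0.0]), gamma=1.0)
```

    Summing this kernel over all bound and wake segments, and enforcing flow tangency at the collocation points, yields the influence-coefficient system that a vortex lattice method solves.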

  20. Parallel rendering techniques for massively parallel visualization

    SciTech Connect

    Hansen, C.; Krogh, M.; Painter, J.

    1995-07-01

    As the resolution of simulation models increases, scientific visualization algorithms which take advantage of the large memory and parallelism of Massively Parallel Processors (MPPs) are becoming increasingly important. For large applications rendering on the MPP tends to be preferable to rendering on a graphics workstation due to the MPP's abundant resources: memory, disk, and numerous processors. The challenge becomes developing algorithms that can exploit these resources while minimizing overhead, typically communication costs. This paper will describe recent efforts in parallel rendering for polygonal primitives as well as parallel volumetric techniques. This paper presents rendering algorithms, developed for massively parallel processors (MPPs), for polygonal, sphere, and volumetric data. The polygon algorithm uses a data parallel approach whereas the sphere and volume renderers use a MIMD approach. Implementations for these algorithms are presented for the Thinking Machines Corporation CM-5 MPP.

  1. Parallel algorithms and architectures

    SciTech Connect

    Albrecht, A.; Jung, H.; Mehlhorn, K.

    1987-01-01

    Contents of this book are the following: Preparata: Deterministic simulation of idealized parallel computers on more realistic ones; Convex hull of randomly chosen points from a polytope; Dataflow computing; Parallel in sequence; Towards the architecture of an elementary cortical processor; Parallel algorithms and static analysis of parallel programs; Parallel processing of combinatorial search; Communications; An O(nlogn) cost parallel algorithms for the single function coarsest partition problem; Systolic algorithms for computing the visibility polygon and triangulation of a polygonal region; and RELACS - A recursive layout computing system. Parallel linear conflict-free subtree access.

  2. Runtime volume visualization for parallel CFD

    NASA Technical Reports Server (NTRS)

    Ma, Kwan-Liu

    1995-01-01

    This paper discusses some aspects of design of a data distributed, massively parallel volume rendering library for runtime visualization of parallel computational fluid dynamics simulations in a message-passing environment. Unlike the traditional scheme in which visualization is a postprocessing step, the rendering is done in place on each node processor. Computational scientists who run large-scale simulations on a massively parallel computer can thus perform interactive monitoring of their simulations. The current library provides an interface to handle volume data on rectilinear grids. The same design principles can be generalized to handle other types of grids. For demonstration, we run a parallel Navier-Stokes solver making use of this rendering library on the Intel Paragon XP/S. The interactive visual response achieved is found to be very useful. Performance studies show that the parallel rendering process is scalable with the size of the simulation as well as with the parallel computer.

  3. Exploiting Symmetry on Parallel Architectures.

    NASA Astrophysics Data System (ADS)

    Stiller, Lewis Benjamin

    1995-01-01

    This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs and discovered a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.

  4. A comparison with theory of peak to peak sound level for a model helicopter rotor generating blade slap at low tip speeds

    NASA Technical Reports Server (NTRS)

    Fontana, R. R.; Hubbard, J. E., Jr.

    1983-01-01

    Mini-tuft and smoke flow visualization techniques have been developed for the investigation of model helicopter rotor blade vortex interaction noise at low tip speeds. These techniques allow the parameters required for calculation of the blade vortex interaction noise using the Widnall/Wolf model to be determined. The measured acoustics are compared with the predicted acoustics for each test condition. Under the conditions tested it is determined that the dominating acoustic pulse results from the interaction of the blade with a vortex 1-1/4 revolutions old at an interaction angle of less than 8 deg. The Widnall/Wolf model predicts the peak sound pressure level within 3 dB for blade vortex separation distances greater than 1 semichord, but it generally over predicts the peak S.P.L. by over 10 dB for blade vortex separation distances of less than 1/4 semichord.

  5. Applied Parallel Metadata Indexing

    SciTech Connect

    Jacobi, Michael R

    2012-08-01

    The GPFS Archive is a parallel archive used by hundreds of users in the Turquoise collaboration network. It houses 4+ petabytes of data in more than 170 million files. Currently, users must navigate the file system to retrieve their data, requiring them to remember file paths and names. A better solution might allow users to tag data with meaningful labels and search the archive using standard and user-defined metadata, while maintaining security. Last summer, the author developed the backend to a tool that adheres to these design goals. The backend works by importing GPFS metadata into a MongoDB cluster, which is then indexed on each attribute. This summer, the author implemented security and developed the user interface for the search tool. To meet security requirements, each database table is associated with a single user, stores only records that the user may read, and requires a set of credentials to access. The interface to the search tool is implemented using FUSE (Filesystem in USErspace). FUSE is an intermediate layer that intercepts file system calls and allows the developer to redefine how those calls behave. In the case of this tool, FUSE interfaces with MongoDB to issue queries and populate output. A FUSE implementation is desirable because it allows users to interact with the search tool using commands they are already familiar with. These security and interface additions are essential for a usable product.
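
    A minimal sketch of the metadata-import-and-index step described above, using pymongo. The field names (path, owner, size, mtime, tags), the database and collection names, and the one-collection-per-user layout are illustrative assumptions; the actual tool's schema, credentials, and FUSE front end are not reproduced here.

```python
from pymongo import MongoClient, ASCENDING

# Connect to an (assumed) MongoDB instance and pick a per-user collection.
client = MongoClient("mongodb://localhost:27017")
coll = client["archive_metadata"]["user_alice"]   # hypothetical names

# Import a batch of file-system metadata records (fields are illustrative).
records = [
    {"path": "/archive/alice/run42/output.h5", "owner": "alice",
     "size": 7_340_032, "mtime": 1659312000, "tags": ["run42", "output"]},
]
coll.insert_many(records)

# Index each attribute users are expected to search on.
for field in ("path", "owner", "size", "mtime", "tags"):
    coll.create_index([(field, ASCENDING)])

# A query the FUSE front end might issue on the user's behalf.
for doc in coll.find({"owner": "alice", "tags": "run42"}):
    print(doc["path"])
```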

  6. Massively Parallel QCD

    SciTech Connect

    Soltz, R; Vranas, P; Blumrich, M; Chen, D; Gara, A; Giampap, M; Heidelberger, P; Salapura, V; Sexton, J; Bhanot, G

    2007-04-11

    The theory of the strong nuclear force, Quantum Chromodynamics (QCD), can be numerically simulated from first principles on massively-parallel supercomputers using the method of Lattice Gauge Theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures that it suggests. We demonstrate these methods on the BlueGene massively-parallel supercomputer and argue that LQCD and the BlueGene architecture are a natural match. This can be traced to the simple fact that LQCD is a regular lattice discretization of space into lattice sites while the BlueGene supercomputer is a discretization of space into compute nodes, and that both are constrained by requirements of locality. This simple relation is both technologically important and theoretically intriguing. The main result of this paper is the speedup of LQCD using up to 131,072 CPUs on the largest BlueGene/L supercomputer. The speedup is perfect with sustained performance of about 20% of peak. This corresponds to a maximum of 70.5 sustained TFlop/s. At these speeds LQCD and BlueGene are poised to produce the next generation of strong interaction physics theoretical results.

  7. Computer-Aided Parallelizer and Optimizer

    NASA Technical Reports Server (NTRS)

    Jin, Haoqiang

    2011-01-01

    The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.

  8. Modeling the Backscatter and Transmitted Light of High Power Smoothed Beams with pF3D, a Massively Parallel Laser Plasma Interaction Code

    SciTech Connect

    Berger, R.L.; Divol, L.; Glenzer, S.; Hinkel, D.E.; Kirkwood, R.K.; Langdon, A.B.; Moody, J.D.; Still, C.H.; Suter, L.; Williams, E.A.; Young, P.E.

    2000-06-01

    Using the three-dimensional wave propagation code F3D [Berger et al., Phys. Fluids B 5, 2243 (1993); Berger et al., Phys. Plasmas 5, 4337 (1998)] and the massively parallel version pF3D [Still et al., Phys. Plasmas 7 (2000)], we have computed the transmitted and reflected light for laser and plasma conditions in experiments that simulated ignition hohlraum conditions. The frequency spectrum and the wavenumber spectrum of the transmitted light are calculated and used to identify the relative contributions of stimulated forward Brillouin scattering and self-focusing in hydrocarbon-filled balloons, commonly called gasbags. The effect of beam smoothing, smoothing by spectral dispersion (SSD) and polarization smoothing (PS), on the stimulated Brillouin backscatter (SBS) from Scale-1 NOVA hohlraums was simulated with the use of nonlinear saturation models that limit the amplitude of the driven acoustic waves. Other experiments on CO₂ gasbags simultaneously measure, at a range of intensities, the SBS reflectivity and the Thomson scatter from the SBS-driven acoustic waves, providing a more detailed test of the modeling. These calculations also predict that the backscattered light will be very nonuniform in the near field (the focusing system optics), which is important for specifying the backscatter intensities to be tolerated by the National Ignition Facility laser system.

  9. Parallel computations and control of adaptive structures

    NASA Technical Reports Server (NTRS)

    Park, K. C.; Alvin, Kenneth F.; Belvin, W. Keith; Chong, K. P. (Editor); Liu, S. C. (Editor); Li, J. C. (Editor)

    1991-01-01

    The equations of motion for structures with adaptive elements for vibration control are presented for parallel computations to be used as a software package for real-time control of flexible space structures. A brief introduction of the state-of-the-art parallel computational capability is also presented. Time marching strategies are developed for an effective use of massive parallel mapping, partitioning, and the necessary arithmetic operations. An example is offered for the simulation of control-structure interaction on a parallel computer and the impact of the approach presented for applications in other disciplines than aerospace industry is assessed.

  10. Parallel flow diffusion battery

    DOEpatents

    Yeh, Hsu-Chi; Cheng, Yung-Sung

    1984-08-07

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  11. Parallel flow diffusion battery

    DOEpatents

    Yeh, H.C.; Cheng, Y.S.

    1984-01-01

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  12. Wave-particle interactions with parallel whistler waves: Nonlinear and time-dependent effects revealed by particle-in-cell simulations

    SciTech Connect

    Camporeale, Enrico; Zimbardo, Gaetano

    2015-09-15

    We present a self-consistent Particle-in-Cell simulation of the resonant interactions between anisotropic energetic electrons and a population of whistler waves, with parameters relevant to the Earth's radiation belt. By tracking PIC particles and comparing with test-particle simulations, we emphasize the importance of including nonlinear effects and time evolution in the modeling of wave-particle interactions, which are excluded in the resonant limit of quasi-linear theory routinely used in radiation belt studies. In particular, we show that pitch angle diffusion is enhanced during the linear growth phase, and it rapidly saturates well before a single bounce period. This calls into question the widely used bounce average performed in most radiation belt diffusion calculations. Furthermore, we discuss how the saturation is related to the fact that the domain in which the particles pitch angle diffuses is bounded, and to the well-known problem of 90° diffusion barrier.

  13. Wave-particle interactions with parallel whistler waves: Nonlinear and time-dependent effects revealed by particle-in-cell simulations

    NASA Astrophysics Data System (ADS)

    Camporeale, Enrico; Zimbardo, Gaetano

    2015-09-01

    We present a self-consistent Particle-in-Cell simulation of the resonant interactions between anisotropic energetic electrons and a population of whistler waves, with parameters relevant to the Earth's radiation belt. By tracking PIC particles and comparing with test-particle simulations, we emphasize the importance of including nonlinear effects and time evolution in the modeling of wave-particle interactions, which are excluded in the resonant limit of quasi-linear theory routinely used in radiation belt studies. In particular, we show that pitch angle diffusion is enhanced during the linear growth phase, and it rapidly saturates well before a single bounce period. This calls into question the widely used bounce average performed in most radiation belt diffusion calculations. Furthermore, we discuss how the saturation is related to the fact that the domain in which the particles pitch angle diffuses is bounded, and to the well-known problem of 90° diffusion barrier.

  14. Parallel processing ITS

    SciTech Connect

    Fan, W.C.; Halbleib, J.A. Sr.

    1996-09-01

    This report provides a users' guide for parallel processing ITS on a UNIX workstation network, a shared-memory multiprocessor, or a massively-parallel processor. The parallelized version of ITS is based on a master/slave model with message passing. Parallel issues such as random number generation, load balancing, and communication software are briefly discussed. Timing results for example problems are presented for demonstration purposes.
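
    The master/slave message-passing structure described above can be prototyped in a few lines with mpi4py. This is a generic sketch of the pattern (the master hands out work units and collects tallies), not the ITS implementation; the work function, chunk sizes, and tally are placeholders.

```python
from mpi4py import MPI

def do_work(chunk):
    # placeholder for a batch of Monte Carlo histories
    return sum(x * x for x in range(chunk))

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
chunks = [10_000] * 20                       # hypothetical work units

if rank == 0:                                # master: distribute work, gather tallies
    results, next_chunk = [], 0
    status = MPI.Status()
    active = size - 1
    while active > 0:
        tally = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
        if tally is not None:
            results.append(tally)
        if next_chunk < len(chunks):         # hand the idle slave another chunk
            comm.send(chunks[next_chunk], dest=status.Get_source())
            next_chunk += 1
        else:                                # no work left: tell the slave to stop
            comm.send(None, dest=status.Get_source())
            active -= 1
    print("total:", sum(results))
else:                                        # slave: request work until told to stop
    comm.send(None, dest=0)                  # initial "ready" message
    while True:
        chunk = comm.recv(source=0)
        if chunk is None:
            break
        comm.send(do_work(chunk), dest=0)
```

    Run with, for example, `mpiexec -n 4 python master_slave.py`; handing out chunks on demand gives the simple dynamic load balancing the report alludes to.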

  15. Parallel simulation today

    NASA Technical Reports Server (NTRS)

    Nicol, David; Fujimoto, Richard

    1992-01-01

    This paper surveys topics that presently define the state of the art in parallel simulation. Included in the tutorial are discussions on new protocols, mathematical performance analysis, time parallelism, hardware support for parallel simulation, load balancing algorithms, and dynamic memory management for optimistic synchronization.

  16. The ParaScope parallel programming environment

    NASA Technical Reports Server (NTRS)

    Cooper, Keith D.; Hall, Mary W.; Hood, Robert T.; Kennedy, Ken; Mckinley, Kathryn S.; Mellor-Crummey, John M.; Torczon, Linda; Warren, Scott K.

    1993-01-01

    The ParaScope parallel programming environment, developed to support scientific programming of shared-memory multiprocessors, includes a collection of tools that use global program analysis to help users develop and debug parallel programs. This paper focuses on ParaScope's compilation system, its parallel program editor, and its parallel debugging system. The compilation system extends the traditional single-procedure compiler by providing a mechanism for managing the compilation of complete programs. Thus, ParaScope can support both traditional single-procedure optimization and optimization across procedure boundaries. The ParaScope editor brings both compiler analysis and user expertise to bear on program parallelization. It assists the knowledgeable user by displaying and managing analysis and by providing a variety of interactive program transformations that are effective in exposing parallelism. The debugging system detects and reports timing-dependent errors, called data races, in execution of parallel programs. The system combines static analysis, program instrumentation, and run-time reporting to provide a mechanical system for isolating errors in parallel program executions. Finally, we describe a new project to extend ParaScope to support programming in FORTRAN D, a machine-independent parallel programming language intended for use with both distributed-memory and shared-memory parallel computers.

  17. Introduction to parallel computing

    SciTech Connect

    Lafferty, E.L.; Michaud, M.C.; Prelle, M.J.; Goethert, J.B.

    1992-05-01

    Today's supercomputers and parallel computers provide an unprecedented amount of computational power in one machine. A basic understanding of the parallel computing techniques that assist in the capture and utilization of that computational power is essential to appreciate the capabilities and the limitations of parallel supercomputers. In addition, an understanding of technical vocabulary is critical in order to converse about parallel computers. The relevant techniques, vocabulary, currently available hardware architectures, and programming languages which provide the basic concepts of parallel computing are introduced in this document. This document updates the document entitled Introduction to Parallel Supercomputing, M88-42, October 1988. It includes a new section on languages for parallel computers, updates the hardware related sections, and includes current references.

  18. Visualization and Tracking of Parallel CFD Simulations

    NASA Technical Reports Server (NTRS)

    Vaziri, Arsi; Kremenetsky, Mark

    1995-01-01

    We describe a system for interactive visualization and tracking of a 3-D unsteady computational fluid dynamics (CFD) simulation on a parallel computer. CM/AVS, a distributed, parallel implementation of a visualization environment (AVS), runs on the CM-5 parallel supercomputer. A CFD solver is run as a CM/AVS module on the CM-5. Data communication between the solver, other parallel visualization modules, and a graphics workstation, which is running AVS, is handled by CM/AVS. Partitioning of the visualization task, between the CM-5 and the workstation, can be done interactively in the visual programming environment provided by AVS. Flow solver parameters can also be altered by programmable interactive widgets. This system partially removes the requirement of storing large solution files at frequent time steps, a characteristic of the traditional 'simulate → store → visualize' post-processing approach.

  19. Genome-Wide Fitness and Genetic Interactions Determined by Tn-seq, a High-Throughput Massively Parallel Sequencing Method for Microorganisms

    PubMed Central

    van Opijnen, Tim; Lazinski, David W.; Camilli, Andrew

    2015-01-01

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobials and vaccines. Here we present the method Tn-seq, with which it has become possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies in the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated transposon insertion library. After library selection, changes in frequency of each insertion mutant are determined by sequencing of the flanking regions en masse. These changes are used to calculate each mutant's fitness. The method was originally developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis, but has now been applied to several different microbial species. PMID:24733243

  20. Genome-Wide Fitness and Genetic Interactions Determined by Tn-seq, a High-Throughput Massively Parallel Sequencing Method for Microorganisms

    PubMed Central

    van Opijnen, Tim; Camilli, Andrew

    2013-01-01

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobials and vaccines. Here we present the method Tn-seq, with which it has become possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies in the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated Mariner transposon insertion library. After library selection, changes in frequency of each insertion mutant are determined by sequencing of the flanking regions en masse. These changes are used to calculate each mutant’s fitness. The method has been developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis; however, due to the wide activity of the Mariner transposon, Tn-seq can be applied to many different microbial species. PMID:21053251
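
    As a rough illustration of the "changes in frequency are used to calculate fitness" step, the sketch below converts per-mutant read counts before and after selection into frequencies and applies an expansion-corrected log-ratio. The formula is a commonly used form of this calculation stated here as an assumption, and the counts are invented; consult the protocol itself for the exact definition and normalizations.

```python
import math

def tnseq_fitness(reads_t1, reads_t2, expansion):
    """Per-mutant fitness from read counts before (t1) and after (t2) selection,
    given the fold-expansion of the whole population during selection.
    Uses an expansion-corrected log-ratio; treat as illustrative only."""
    total1 = sum(reads_t1.values())
    total2 = sum(reads_t2.values())
    fitness = {}
    for mutant, n1 in reads_t1.items():
        n2 = reads_t2.get(mutant, 0)
        if n1 == 0 or n2 == 0:
            fitness[mutant] = None          # not enough data for a log-ratio
            continue
        f1, f2 = n1 / total1, n2 / total2   # mutant frequencies
        fitness[mutant] = (math.log(f2 * expansion / f1)
                           / math.log((1 - f2) * expansion / (1 - f1)))
    return fitness

# Toy counts: mutA behaves neutrally (~1.0), mutB is attenuated (<1.0)
w = tnseq_fitness({"mutA": 100, "mutB": 100, "mutC": 9800},
                  {"mutA": 105, "mutB": 12, "mutC": 9883}, expansion=100)
```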

  1. Stacked and H-Bonded Cytosine Dimers. Analysis of the Intermolecular Interaction Energies by Parallel Quantum Chemistry and Polarizable Molecular Mechanics.

    PubMed

    Gresh, Nohad; Sponer, Judit E; Devereux, Mike; Gkionis, Konstantinos; de Courcy, Benoit; Piquemal, Jean-Philip; Sponer, Jiri

    2015-07-30

    Until now, atomistic simulations of DNA and RNA and their complexes have been executed using well calibrated but conceptually simple pair-additive empirical potentials (force fields). Although such simulations provided many valuable results, it is well established that simple force fields also introduce errors into the description, underlying the need for development of alternative anisotropic, polarizable molecular mechanics (APMM) potentials. One of the most abundant forces in all kinds of nucleic acids topologies is base stacking. Intra- and interstrand stacking is assumed to be the most essential factor affecting local conformational variations of B-DNA. However, stacking also contributes to formation of all kinds of noncanonical nucleic acids structures, such as quadruplexes or folded RNAs. The present study focuses on 14 stacked cytosine (Cyt) dimers and the doubly H-bonded dimer. We evaluate the extent to which an APMM procedure, SIBFA, could account quantitatively for the results of high-level quantum chemistry (QC) on the total interaction energies, and the individual energy contributions and their nonisotropic behaviors. Good agreements are found at both uncorrelated HF and correlated DFT and CCSD(T) levels. Resorting in SIBFA to distributed QC multipoles and to an explicit representation of the lone pairs is essential to respectively account for the anisotropies of the Coulomb and of the exchange-repulsion QC contributions. PMID:26119247

  2. Genome-Wide Fitness and Genetic Interactions Determined by Tn-seq, a High-Throughput Massively Parallel Sequencing Method for Microorganisms

    PubMed Central

    van Opijnen, Tim; Lazinski, David W.; Camilli, Andrew

    2015-01-01

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobial agents and vaccines. This unit presents Tn-seq, a method that has made it possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies on the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated transposon insertion library. After library selection, changes in the frequency of each insertion mutant are determined by sequencing flanking regions en masse. These changes are used to calculate each mutant’s fitness. The method was originally developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis, but has now been applied to several different microbial species. PMID:25641100

  3. Parallel digital forensics infrastructure.

    SciTech Connect

    Liebrock, Lorie M.; Duggan, David Patrick

    2009-10-01

    This report documents the architecture and implementation of a Parallel Digital Forensics infrastructure. This infrastructure is necessary for supporting the design, implementation, and testing of new classes of parallel digital forensics tools. Digital Forensics has become extremely difficult with data sets of one terabyte and larger. The only way to overcome the processing time of these large sets is to identify and develop new parallel algorithms for performing the analysis. To support algorithm research, a flexible base infrastructure is required. A candidate architecture for this base infrastructure was designed, instantiated, and tested by this project, in collaboration with New Mexico Tech. Previous infrastructures were not designed and built specifically for the development and testing of parallel algorithms. With the size of forensics data sets only expected to increase significantly, this type of infrastructure support is necessary for continued research in parallel digital forensics. This report documents the architecture and implementation of the parallel digital forensics (PDF) infrastructure.

  4. PCLIPS: Parallel CLIPS

    NASA Technical Reports Server (NTRS)

    Hall, Lawrence O.; Bennett, Bonnie H.; Tello, Ivan

    1994-01-01

    A parallel version of CLIPS 5.1 has been developed to run on Intel Hypercubes. The user interface is the same as that for CLIPS with some added commands to allow for parallel calls. A complete version of CLIPS runs on each node of the hypercube. The system has been instrumented to display the time spent in the match, recognize, and act cycles on each node. Only rule-level parallelism is supported. Parallel commands enable the assertion and retraction of facts to/from remote nodes' working memory. Parallel CLIPS was used to implement a knowledge-based command, control, communications, and intelligence (C³I) system to demonstrate the fusion of high-level, disparate sources. We discuss the nature of the information fusion problem, our approach, and implementation. Parallel CLIPS has also been used to run several benchmark parallel knowledge bases such as one to set up a cafeteria. Results from running Parallel CLIPS with parallel knowledge base partitions indicate that significant speed increases, including superlinear speedups in some cases, are possible.

  5. Linked-View Parallel Coordinate Plot Renderer

    Energy Science and Technology Software Center (ESTSC)

    2011-06-28

    This software allows multiple linked views for interactive querying via map-based data selection, bar chart analytic overlays, and high dynamic range (HDR) line renderings. The major component of the visualization package is a parallel coordinate renderer with binning, curved layouts, shader-based rendering, and other techniques to allow interactive visualization of multidimensional data.

  6. Eclipse Parallel Tools Platform

    Energy Science and Technology Software Center (ESTSC)

    2005-02-18

    Designing and developing parallel programs is an inherently complex task. Developers must choose from the many parallel architectures and programming paradigms that are available, and face a plethora of tools that are required to execute, debug, and analyze parallel programs in these environments. Few, if any, of these tools provide any degree of integration, or indeed any commonality in their user interfaces at all. This further complicates the parallel developer's task, hampering software engineering practices, and ultimately reducing productivity. One consequence of this complexity is that best practice in parallel application development has not advanced to the same degree as more traditional programming methodologies. The result is that there is currently no open-source, industry-strength platform that provides a highly integrated environment specifically designed for parallel application development. Eclipse is a universal tool-hosting platform that is designed to provide a robust, full-featured, commercial-quality, industry platform for the development of highly integrated tools. It provides a wide range of core services for tool integration that allow tool producers to concentrate on their tool technology rather than on platform-specific issues. The Eclipse Integrated Development Environment is an open-source project that is supported by over 70 organizations, including IBM, Intel and HP. The Eclipse Parallel Tools Platform (PTP) plug-in extends the Eclipse framework by providing support for a rich set of parallel programming languages and paradigms, and a core infrastructure for the integration of a wide variety of parallel tools. The first version of the PTP is a prototype that provides only minimal functionality for parallel tool integration, support for a small number of parallel architectures, and basic Fortran integration. Future versions will extend the functionality substantially, provide a number of core parallel tools, and provide support across a wide range of parallel architectures and languages.

  7. Advanced parallel processing with supercomputer architectures

    SciTech Connect

    Hwang, K.

    1987-10-01

    This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping parallel algorithms, operating system functions, application library, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potentials of optical and neural technologies for developing future supercomputers.

  8. Mirror versus parallel bimanual reaching

    PubMed Central

    2013-01-01

    Background In spite of their importance to everyday function, tasks that require both hands to work together such as lifting and carrying large objects have not been well studied and the full potential of how new technology might facilitate recovery remains unknown. Methods To help identify the best modes for self-teleoperated bimanual training, we used an advanced haptic/graphic environment to compare several modes of practice. In a 2-by-2 study, we compared mirror vs. parallel reaching movements, and also compared veridical display to one that transforms the right hand’s cursor to the opposite side, reducing the area that the visual system has to monitor. Twenty healthy, right-handed subjects (5 in each group) practiced 200 movements. We hypothesized that parallel reaching movements would be the best performing, and attending to one visual area would reduce the task difficulty. Results The two-way comparison revealed that mirror movement times took an average 1.24 s longer to complete than parallel. Surprisingly, subjects’ movement times moving to one target (attending to one visual area) also took an average of 1.66 s longer than subjects moving to two targets. For both hands, there was also a significant interaction effect, revealing the lowest errors for parallel movements moving to two targets (p < 0.001). This was the only group that began and maintained low errors throughout training. Conclusion Combined with other evidence, these results suggest that the most intuitive reaching performance can be observed with parallel movements with a veridical display (moving to two separate targets). These results point to the expected levels of challenge for these bimanual training modes, which could be used to advise therapy choices in self-neurorehabilitation. PMID:23837908

  9. Introduction to the Poker Parallel Programming Environment. Interim technical report

    SciTech Connect

    Snyder, L.

    1983-08-01

    The Poker Parallel Programming Environment is a graphics-based, interactive system for programming the Configurable, Highly Parallel (CHiP) Computer. Designed to support nearly all aspects of parallel programming in one integrated system, Poker has been implemented as a 35,000-line C program on the VAX 11/780 under UNIX. It provides a number of novel features including graphics programming of parallel processor communication.

  10. Parallel computing works

    SciTech Connect

    Not Available

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C³P), a five-year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations? As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  11. Totally parallel multilevel algorithms

    NASA Technical Reports Server (NTRS)

    Frederickson, Paul O.

    1988-01-01

    Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which is referred to here as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.

  12. Massively parallel mathematical sieves

    SciTech Connect

    Montry, G.R.

    1989-01-01

    The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
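
    To make the decomposition concrete, here is a minimal segmented Sieve of Eratosthenes in which each block of the integer range can be sieved independently once the primes up to sqrt(N) are known; this block structure is what a scattered or blocked distribution across processors would exploit. It is a generic sketch, not the hypercube implementation of the report.

```python
import math

def base_primes(limit):
    """Simple sieve for the primes up to limit (needed by every block)."""
    mark = bytearray([1]) * (limit + 1)
    mark[0:2] = b"\x00\x00"
    for p in range(2, int(math.isqrt(limit)) + 1):
        if mark[p]:
            mark[p*p::p] = bytearray(len(mark[p*p::p]))
    return [i for i, m in enumerate(mark) if m]

def sieve_block(lo, hi, primes):
    """Return the primes in [lo, hi); blocks are independent, so different
    processors could each take their own (lo, hi) range."""
    mark = bytearray([1]) * (hi - lo)
    for p in primes:
        start = max(p * p, ((lo + p - 1) // p) * p)
        mark[start - lo::p] = bytearray(len(mark[start - lo::p]))
    return [lo + i for i, m in enumerate(mark) if m and lo + i > 1]

N = 1000
primes = base_primes(int(math.isqrt(N)))
blocks = [(lo, min(lo + 250, N + 1)) for lo in range(2, N + 1, 250)]
result = sorted(p for lo, hi in blocks for p in sieve_block(lo, hi, primes))
```

    Because only the short list of base primes is shared, each block needs no communication while it sieves, which is why the technique scales well on large ensembles.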

  13. Synchronization Of Parallel Discrete Event Simulations

    NASA Technical Reports Server (NTRS)

    Steinman, Jeffrey S.

    1992-01-01

    Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.

  14. Parallel Adaptive Mesh Refinement

    SciTech Connect

    Diachin, L; Hornung, R; Plassmann, P; WIssink, A

    2005-03-04

    As large-scale, parallel computers have become more widely available and numerical models and algorithms have advanced, the range of physical phenomena that can be simulated has expanded dramatically. Many important science and engineering problems exhibit solutions with localized behavior where highly-detailed salient features or large gradients appear in certain regions which are separated by much larger regions where the solution is smooth. Examples include chemically-reacting flows with radiative heat transfer, high Reynolds number flows interacting with solid objects, and combustion problems where the flame front is essentially a two-dimensional sheet occupying a small part of a three-dimensional domain. Modeling such problems numerically requires approximating the governing partial differential equations on a discrete domain, or grid. Grid spacing is an important factor in determining the accuracy and cost of a computation. A fine grid may be needed to resolve key local features while a much coarser grid may suffice elsewhere. Employing a fine grid everywhere may be inefficient at best and, at worst, may make an adequately resolved simulation impractical. Moreover, the location and resolution of the fine grid required for an accurate solution is a dynamic property of a problem's transient features and may not be known a priori. Adaptive mesh refinement (AMR) is a technique that can be used with both structured and unstructured meshes to adjust local grid spacing dynamically to capture solution features with an appropriate degree of resolution. Thus, computational resources can be focused where and when they are needed most to efficiently achieve an accurate solution without incurring the cost of a globally-fine grid. Figure 1.1 shows two example computations using AMR; on the left is a structured mesh calculation of an impulsively-sheared contact surface and on the right is the fuselage and volume discretization of an RAH-66 Comanche helicopter [35]. Note the ability of both meshing methods to resolve simulation details by varying the local grid spacing.
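
    The refinement decision itself is often as simple as flagging cells where a local error indicator exceeds a threshold. The sketch below flags cells of a 1-D structured grid whose undivided gradient is large and then buffers the flags so the refined patch covers a neighborhood of the feature; the indicator and buffer width are illustrative choices, not those of any particular AMR package.

```python
import numpy as np

def flag_cells_for_refinement(u, threshold, buffer=2):
    """Flag cells of a 1-D field u whose undivided gradient |u[i+1] - u[i-1]|
    exceeds threshold, then grow each flag by `buffer` cells on either side so
    the refined patch safely contains the feature."""
    indicator = np.zeros_like(u)
    indicator[1:-1] = np.abs(u[2:] - u[:-2])
    flags = indicator > threshold
    grown = flags.copy()
    for shift in range(1, buffer + 1):            # simple flag buffering
        grown[:-shift] |= flags[shift:]
        grown[shift:] |= flags[:-shift]
    return grown

# Example: a smoothed step; only cells near the front get flagged
x = np.linspace(0.0, 1.0, 200)
u = np.tanh((x - 0.5) / 0.02)
refine = flag_cells_for_refinement(u, threshold=0.2)
```

    In a full AMR cycle the flagged cells are clustered into rectangular patches, the patches are refined, and the flag-and-refine step is repeated as the transient feature moves.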

  15. Parallel nearest neighbor calculations

    NASA Astrophysics Data System (ADS)

    Trease, Harold

    We are just starting to parallelize the nearest neighbor portion of our free-Lagrange code. Our implementation of the nearest neighbor reconnection algorithm has not been parallelizable (i.e., we just flip one connection at a time). In this paper we consider what sort of nearest neighbor algorithms lend themselves to being parallelized. For example, the construction of the Voronoi mesh can be parallelized, but the construction of the Delaunay mesh (dual to the Voronoi mesh) cannot because of degenerate connections. We will show our most recent attempt to tessellate space with triangles or tetrahedrons with a new nearest neighbor construction algorithm called DAM (Dial-A-Mesh). This method has the characteristics of a parallel algorithm and produces a better tessellation of space than the Delaunay mesh. Parallel processing is becoming an everyday reality for us at Los Alamos. Our current production machines are Cray YMPs with 8 processors that can run independently or combined to work on one job. We are also exploring massive parallelism through the use of two 64K processor Connection Machines (CM2), where all the processors run in lock step mode. The effective application of 3-D computer models requires the use of parallel processing to achieve reasonable "turn around" times for our calculations.

  16. Bilingual parallel programming

    SciTech Connect

    Foster, I.; Overbeek, R.

    1990-01-01

    Numerous experiments have demonstrated that computationally intensive algorithms support adequate parallelism to exploit the potential of large parallel machines. Yet successful parallel implementations of serious applications are rare. The limiting factor is clearly programming technology. None of the approaches to parallel programming that have been proposed to date -- whether parallelizing compilers, language extensions, or new concurrent languages -- seem to adequately address the central problems of portability, expressiveness, efficiency, and compatibility with existing software. In this paper, we advocate an alternative approach to parallel programming based on what we call bilingual programming. We present evidence that this approach provides an effective solution to parallel programming problems. The key idea in bilingual programming is to construct the upper levels of applications in a high-level language while coding selected low-level components in low-level languages. This approach permits the advantages of a high-level notation (expressiveness, elegance, conciseness) to be obtained without the cost in performance normally associated with high-level approaches. In addition, it provides a natural framework for reusing existing code.

  17. The NAS parallel benchmarks

    NASA Technical Reports Server (NTRS)

    Bailey, David (Editor); Barton, John (Editor); Lasinski, Thomas (Editor); Simon, Horst (Editor)

    1993-01-01

    A new set of benchmarks was developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of a set of kernels, the 'Parallel Kernels,' and a simulated application benchmark. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification - all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.

  18. Radiative Heat Transfer in Combustion Applications: Parallel Efficiencies of Two Gas Models, Turbulent Radiation Interactions in Particulate Laden Flows, and Coarse Mesh Finite Difference Acceleration for Improved Temporal Accuracy

    NASA Astrophysics Data System (ADS)

    Cleveland, Mathew A.

    We investigate several aspects of the numerical solution of the radiative transfer equation in the context of coal combustion: the parallel efficiency of two commonly-used opacity models, the sensitivity of turbulent radiation interaction (TRI) effects to the presence of coal particulate, and an improvement of the order of temporal convergence using the coarse mesh finite difference (CMFD) method. There are four opacity models commonly employed to evaluate the radiative transfer equation in combustion applications: line-by-line (LBL), multigroup, band, and global. Most of these models have been rigorously evaluated for serial computations of a spectrum of problem types [1]. Studies of these models for parallel computations [2] are limited. We assessed the performance of the Spectral-Line-Based weighted sum of gray gases (SLW) model, a global method related to K-distribution methods [1], and the LBL model. The LBL model directly interpolates opacity information from large data tables. The LBL model outperforms the SLW model in almost all cases, as suggested by Wang et al. [3]. The SLW model, however, shows superior parallel scaling performance and a decreased sensitivity to load imbalancing, suggesting that for some problems, global methods such as the SLW model could outperform the LBL model. Turbulent radiation interaction (TRI) effects are associated with the differences in the time scales of the fluid dynamic equations and the radiative transfer equations. Solving on the fluid dynamic time step size produces large changes in the radiation field over the time step. We have modified the statistically homogeneous, non-premixed flame problem of Deshmukh et al. [4] to include coal-type particulate. The addition of low mass loadings of particulate minimally impacts the TRI effects. Observed differences in the TRI effects from variations in the packing fractions and Stokes numbers are difficult to analyze because of the significant effect of variations in problem initialization. The TRI effects are very sensitive to the initialization of the turbulence in the system. The TRI parameters are somewhat sensitive to the treatment of particulate temperature and the particulate optical thickness, and these effects are amplified by increased particulate loading. Monte Carlo radiative heat transfer simulations of time-dependent combustion processes generally involve an explicit evaluation of emission source because of the expense of the transport solver. Recently, Park et al. [5] have applied quasi-diffusion with Monte Carlo in high energy density radiative transfer applications. We employ a Crank-Nicolson temporal integration scheme in conjunction with the coarse mesh finite difference (CMFD) method, in an effort to improve the temporal accuracy of the Monte Carlo solver. Our results show that this CMFD-CN method is an improvement over Monte Carlo with CMFD time-differenced via Backward Euler and over Implicit Monte Carlo (IMC) [6]. The increase in accuracy involves very little increase in computational cost, and the figure of merit for the CMFD-CN scheme is greater than that of IMC.
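
    The temporal-accuracy claim is easy to reproduce on a model problem. The sketch below integrates dy/dt = -λy with Backward Euler and with Crank-Nicolson and prints the error at t = 1 for a sequence of time steps; halving Δt cuts the Backward Euler error roughly in half (first order) and the Crank-Nicolson error roughly by a factor of four (second order). This is a generic demonstration, not the CMFD-CN radiative transfer scheme itself.

```python
import math

def integrate(dt, lam=2.0, t_end=1.0, scheme="cn"):
    """Integrate dy/dt = -lam * y, y(0) = 1, to t_end with time step dt."""
    y, t = 1.0, 0.0
    while t < t_end - 1e-12:
        if scheme == "be":            # Backward Euler: (y_new - y) / dt = -lam * y_new
            y = y / (1.0 + lam * dt)
        else:                         # Crank-Nicolson: trapezoidal average of the RHS
            y = y * (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)
        t += dt
    return y

exact = math.exp(-2.0)
for dt in (0.1, 0.05, 0.025):
    err_be = abs(integrate(dt, scheme="be") - exact)
    err_cn = abs(integrate(dt, scheme="cn") - exact)
    print(f"dt={dt:6.3f}  BE error={err_be:.2e}  CN error={err_cn:.2e}")
```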

  19. Simplified Parallel Domain Traversal

    SciTech Connect

    Erickson III, David J

    2011-01-01

    Many data-intensive scientific analysis techniques require global domain traversal, which over the years has been a bottleneck for efficient parallelization across distributed-memory architectures. Inspired by MapReduce and other simplified parallel programming approaches, we have designed DStep, a flexible system that greatly simplifies efficient parallelization of domain traversal techniques at scale. In order to deliver both simplicity to users as well as scalability on HPC platforms, we introduce a novel two-tiered communication architecture for managing and exploiting asynchronous communication loads. We also integrate our design with advanced parallel I/O techniques that operate directly on native simulation output. We demonstrate DStep by performing teleconnection analysis across ensemble runs of terascale atmospheric CO₂ and climate data, and we show scalability results on up to 65,536 IBM BlueGene/P cores.
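
    In the MapReduce-like spirit described above, a domain-traversal analysis can be expressed as a per-block map followed by a global reduce. The toy sketch below uses Python's multiprocessing to accumulate a global statistic over blocks of a decomposed domain; it only illustrates the programming pattern, not DStep's two-tiered asynchronous communication or its parallel I/O.

```python
from multiprocessing import Pool
from functools import reduce
import numpy as np

def map_block(block):
    """Per-block partial sums needed for a global mean/variance (the 'map')."""
    return block.size, float(block.sum()), float((block ** 2).sum())

def combine(a, b):
    """Merge two partial results (the 'reduce')."""
    return a[0] + b[0], a[1] + b[1], a[2] + b[2]

if __name__ == "__main__":
    domain = np.random.default_rng(0).normal(size=(512, 512))
    blocks = [domain[i:i + 64] for i in range(0, 512, 64)]   # row-block decomposition

    with Pool(processes=4) as pool:
        partials = pool.map(map_block, blocks)

    n, s, s2 = reduce(combine, partials)
    mean = s / n
    var = s2 / n - mean ** 2
    print(f"global mean = {mean:.4f}, variance = {var:.4f}")
```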

  20. The Parallel Axiom

    ERIC Educational Resources Information Center

    Rogers, Pat

    1972-01-01

    Criteria for a reasonable axiomatic system are discussed. A discussion of the historical attempts to prove the independence of Euclid's parallel postulate introduces non-Euclidean geometries. Poincare's model for a non-Euclidean geometry is defined and analyzed. (LS)

  1. Parallel programming with PCN

    SciTech Connect

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A).

  2. Scalable parallel communications

    NASA Technical Reports Server (NTRS)

    Maly, K.; Khanna, S.; Overstreet, C. M.; Mukkamala, R.; Zubair, M.; Sekhar, Y. S.; Foudriat, E. C.

    1992-01-01

    Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., effect of timeouts), we have performed detailed simulations studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near linear in n, the number of protocol processors, and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low cost solution to providing multiple 100 Mbps on current machines. In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services and the use of space division multiplexing for the physical channels can provide better reliability than monolithic approaches (it also provides graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users, other TCP's running in parallel provide high bandwidth service to a single application); and (3) coarse grain parallelism will be able to incorporate many future improvements from related work (e.g., reduced data movement, fast TCP, fine-grain parallelism) also with near linear speed-ups.

  3. Parallel image compression

    NASA Technical Reports Server (NTRS)

    Reif, John H.

    1987-01-01

    A parallel compression algorithm for the 16,384 processor MPP machine was developed. The serial version of the algorithm can be viewed as a combination of on-line dynamic lossless text compression techniques (which employ simple learning strategies) and vector quantization. These concepts are described. How these concepts are combined to form a new strategy for performing dynamic on-line lossy compression is discussed. Finally, the implementation of this algorithm in a massively parallel fashion on the MPP is discussed.
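
    The vector-quantization half of such a strategy amounts to replacing each image block by the index of its nearest codeword. The sketch below encodes 4x4 blocks of a grayscale image against a given codebook; the codebook is taken as given here (in the cited work it is adapted on line), so this shows only the static encode step and is not the MPP implementation.

```python
import numpy as np

def vq_encode(image, codebook, block=4):
    """Encode a grayscale image by mapping each block x block tile to the index
    of the nearest codeword (rows of `codebook`, each of length block*block)."""
    h, w = image.shape
    h -= h % block
    w -= w % block
    tiles = (image[:h, :w]
             .reshape(h // block, block, w // block, block)
             .swapaxes(1, 2)
             .reshape(-1, block * block))
    # squared Euclidean distance from every tile to every codeword
    d2 = ((tiles[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1).reshape(h // block, w // block)

# Toy usage: a random image and a random 32-entry codebook
rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(64, 64)).astype(float)
codebook = rng.integers(0, 256, size=(32, 16)).astype(float)
indices = vq_encode(img, codebook)          # 16x16 array of codeword indices
```

    The per-tile distance computations are independent, which is what makes this step easy to distribute across many simple processors.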

  4. Code Parallelization with CAPO: A User Manual

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry; Biegel, Bryan (Technical Monitor)

    2001-01-01

    A software tool has been developed to assist the parallelization of scientific codes. This tool, CAPO, extends an existing parallelization toolkit, CAPTools developed at the University of Greenwich, to generate OpenMP parallel codes for shared memory architectures. This is an interactive toolkit to transform a serial Fortran application code to an equivalent parallel version of the software - in a small fraction of the time normally required for a manual parallelization. We first discuss the way in which loop types are categorized and how efficient OpenMP directives can be defined and inserted into the existing code using the in-depth interprocedural analysis. The use of the toolkit on a number of application codes ranging from benchmark to real-world application codes is presented. This demonstrates the great potential of using the toolkit to quickly parallelize serial programs as well as the good performance achievable on a large number of processors. The second part of the document gives references to the parameters and the graphic user interface implemented in the toolkit. Finally a set of tutorials is included for hands-on experiences with this toolkit.

  5. Finite parallel wavelengths and ionospheric structuring

    NASA Astrophysics Data System (ADS)

    Sperling, J. L.

    1983-05-01

    In the disturbed ionosphere, gradient drift instabilities are regarded as the primary sources for fluid structuring. In the present investigation, finite wave number components parallel to the ambient magnetic field are included in an analysis of Rayleigh-Taylor and E X B gradient drift instabilities. It is found that the accompanying small but finite parallel electric fields are not totally electrostatic, but rather can have a significant inductive nature. The dispersion relation including parallel wave number and parallel electric fields is derived. Attention is also given to quasi-linear interactions, and two examples which illustrate situations in which the additional diffusion associated with magnetic field fluctuations is likely to play a role.

  6. Parallel time integration software

    SciTech Connect

    2014-07-01

    This package implements an optimal-scaling multigrid solver for the (non)linear systems that arise from the discretization of problems with evolutionary behavior. Typically, solution algorithms for evolution equations are based on a time-marching approach, solving sequentially for one time step after the other. Parallelism in these traditional time-integration techniques is limited to spatial parallelism. However, current trends in computer architectures are leading towards systems with more, but not faster, processors. Therefore, faster compute speeds must come from greater parallelism. One approach to achieve parallelism in time is with multigrid, but extending classical multigrid methods for elliptic operators to this setting is a significant achievement. In this software, we implement a non-intrusive, optimal-scaling time-parallel method based on multigrid reduction techniques. The examples in the package demonstrate optimality of our multigrid-reduction-in-time algorithm (MGRIT) for solving a variety of parabolic equations in two and three spatial dimensions. These examples can also be used to show that MGRIT can achieve significant speedup in comparison to sequential time marching on modern architectures.
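
    A toy two-level time-parallel iteration in the spirit of the method described here (this is the closely related parareal scheme written serially, not the MGRIT package itself). The fine propagations inside each iteration are independent across time slices, which is where the parallelism in time comes from; the test equation, propagators, and step counts are arbitrary assumptions.

        import numpy as np

        lam, T, N = -1.0, 2.0, 16          # dy/dt = lam*y on [0, T], N coarse time slices
        dT = T / N
        y0 = 1.0

        def coarse(y, dt=dT):              # one backward-Euler step (cheap propagator G)
            return y / (1.0 - lam * dt)

        def fine(y, dt=dT, sub=20):        # many small backward-Euler steps (accurate propagator F)
            for _ in range(sub):
                y = y / (1.0 - lam * dt / sub)
            return y

        # initial serial coarse sweep
        U = np.empty(N + 1); U[0] = y0
        for n in range(N):
            U[n + 1] = coarse(U[n])

        for k in range(5):                 # two-level correction iterations
            F = np.array([fine(U[n]) for n in range(N)])      # independent -> parallel across slices
            G_old = np.array([coarse(U[n]) for n in range(N)])
            Unew = U.copy()
            for n in range(N):             # sequential coarse correction sweep
                Unew[n + 1] = coarse(Unew[n]) + F[n] - G_old[n]
            U = Unew

        print(U[-1], np.exp(lam * T))      # converged iterate matches the fine solution (close to exp(lam*T))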

  7. Parallel time integration software

    Energy Science and Technology Software Center (ESTSC)

    2014-07-01

    This package implements an optimal-scaling multigrid solver for the (non)linear systems that arise from the discretization of problems with evolutionary behavior. Typically, solution algorithms for evolution equations are based on a time-marching approach, solving sequentially for one time step after the other. Parallelism in these traditional time-integration techniques is limited to spatial parallelism. However, current trends in computer architectures are leading towards systems with more, but not faster, processors. Therefore, faster compute speeds must come from greater parallelism. One approach to achieve parallelism in time is with multigrid, but extending classical multigrid methods for elliptic operators to this setting is a significant achievement. In this software, we implement a non-intrusive, optimal-scaling time-parallel method based on multigrid reduction techniques. The examples in the package demonstrate optimality of our multigrid-reduction-in-time algorithm (MGRIT) for solving a variety of parabolic equations in two and three spatial dimensions. These examples can also be used to show that MGRIT can achieve significant speedup in comparison to sequential time marching on modern architectures.

  8. Sublattice parallel replica dynamics

    NASA Astrophysics Data System (ADS)

    Martínez, Enrique; Uberuaga, Blas P.; Voter, Arthur F.

    2014-06-01

    Exascale computing presents a challenge for the scientific community as new algorithms must be developed to take full advantage of the new computing paradigm. Atomistic simulation methods that offer full fidelity to the underlying potential, i.e., molecular dynamics (MD) and parallel replica dynamics, fail to use the whole machine speedup, leaving a region in time and sample size space that is unattainable with current algorithms. In this paper, we present an extension of the parallel replica dynamics algorithm [A. F. Voter, Phys. Rev. B 57, R13985 (1998), 10.1103/PhysRevB.57.R13985] by combining it with the synchronous sublattice approach of Shim and Amar [Y. Shim and J. G. Amar, Phys. Rev. B 71, 125432 (2005), 10.1103/PhysRevB.71.125432], thereby exploiting event locality to improve the algorithm scalability. This algorithm is based on a domain decomposition in which events happen independently in different regions in the sample. We develop an analytical expression for the speedup given by this sublattice parallel replica dynamics algorithm and compare it with parallel MD and traditional parallel replica dynamics. We demonstrate how this algorithm, which introduces a slight additional approximation of event locality, enables the study of physical systems unreachable with traditional methodologies and promises to better utilize the resources of current high performance and future exascale computers.
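
    A small numerical sketch of the boost underlying standard parallel replica dynamics, the base method being extended in this record (the sublattice decomposition itself is not reproduced). For a rare escape that is Poisson with rate k, running M independent replicas shortens the wall-clock wait by roughly a factor of M while the accumulated simulation time keeps the correct exponential statistics. The rate and replica count are arbitrary assumptions.

        import numpy as np

        rng = np.random.default_rng(1)
        k, M, trials = 0.02, 64, 20000          # escape rate, number of replicas, Monte Carlo samples

        # wall-clock time until the *first* replica escapes, and the physical time
        # credited to the trajectory (roughly M times the wall-clock wait)
        t_first = rng.exponential(1.0 / k, size=(trials, M)).min(axis=1)
        t_physical = M * t_first

        print("mean wall-clock wait :", t_first.mean(), "(expected ~", 1.0 / (k * M), ")")
        print("mean physical time   :", t_physical.mean(), "(expected ~", 1.0 / k, ")")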

  9. Sublattice parallel replica dynamics.

    PubMed

    Martínez, Enrique; Uberuaga, Blas P; Voter, Arthur F

    2014-06-01

    Exascale computing presents a challenge for the scientific community as new algorithms must be developed to take full advantage of the new computing paradigm. Atomistic simulation methods that offer full fidelity to the underlying potential, i.e., molecular dynamics (MD) and parallel replica dynamics, fail to use the whole machine speedup, leaving a region in time and sample size space that is unattainable with current algorithms. In this paper, we present an extension of the parallel replica dynamics algorithm [A. F. Voter, Phys. Rev. B 57, R13985 (1998)] by combining it with the synchronous sublattice approach of Shim and Amar [Y. Shim and J. G. Amar, Phys. Rev. B 71, 125432 (2005)], thereby exploiting event locality to improve the algorithm scalability. This algorithm is based on a domain decomposition in which events happen independently in different regions in the sample. We develop an analytical expression for the speedup given by this sublattice parallel replica dynamics algorithm and compare it with parallel MD and traditional parallel replica dynamics. We demonstrate how this algorithm, which introduces a slight additional approximation of event locality, enables the study of physical systems unreachable with traditional methodologies and promises to better utilize the resources of current high performance and future exascale computers. PMID:25019913

  10. Parallel architectures for vision

    SciTech Connect

    Maresca, M.; Lavin, M.A.; Li, H.

    1988-08-01

    Vision computing involves the execution of a large number of operations on large sets of structured data. Sequential computers cannot achieve the speed required by most of the current applications and therefore parallel architectural solutions have to be explored. In this paper the authors examine the options that drive the design of a vision oriented computer, starting with the analysis of the basic vision computation and communication requirements. They briefly review the classical taxonomy for parallel computers, based on the multiplicity of the instruction and data stream, and apply a recently proposed criterion, the degree of autonomy of each processor, to further classify fine-grain SIMD massively parallel computers. They identify three types of processor autonomy, namely operation autonomy, addressing autonomy, and connection autonomy. For each type they give the basic definitions and show some examples. They focus on the concept of connection autonomy, which they believe is a key point in the development of massively parallel architectures for vision. They show two examples of parallel computers featuring different types of connection autonomy - the Connection Machine and the Polymorphic-Torus - and compare their cost and benefit.

  11. Parallel shear and turbulence

    NASA Astrophysics Data System (ADS)

    Hayes, Tiffany; Gilmore, Mark; Watts, Christopher; Xie, Shuangwei; Yan, Lincan

    2009-11-01

    Instabilities may be caused in a plasma by (shear) flow. These flows can be transverse or parallel to the magnetic field. Past work has generally focused on controlling and understanding the processes that arise from (shear) flow transverse to the magnetic field. At UNM, experimental work is being performed in the HelCat device (Helicon Cathode) to control the parallel flow in order to study and understand the processes that arise from this situation. It is also our aim to be able to control the transverse flow simultaneously, but independently of the parallel flow. By inserting a system of biased rings and grids into the plasma we are able to modify the flows, and hence the turbulence. Flows are measured using a seven-tip Mach probe. Results of our ability to control the flows independently are presented.

  12. Multicanonical parallel tempering

    NASA Astrophysics Data System (ADS)

    Faller, Roland; Yan, Qiliang; de Pablo, Juan J.

    2002-04-01

    We present a novel implementation of the parallel tempering Monte Carlo method in a multicanonical ensemble. Multicanonical weights are derived by a self-consistent iterative process using a Boltzmann inversion of global energy histograms. This procedure gives rise to a much broader overlap of thermodynamic-property histograms; fewer replicas are necessary in parallel tempering simulations, and the acceptance of trial swap moves can be made arbitrarily high. We demonstrate the usefulness of the method in the context of a grand-multicanonical ensemble, where we use multicanonical simulations in energy space with the addition of an unmodified chemical potential term in particle-number space. Several possible implementations are discussed, and the best choice is presented in the context of the liquid-gas phase transition of the Lennard-Jones fluid. A substantial decrease in the necessary number of replicas can be achieved through the proposed method, thereby providing a higher efficiency and the possibility of parallelization.
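
    A bare-bones sketch of the replica-swap step in ordinary parallel tempering, on which the method above builds (the multicanonical reweighting described in the abstract is not included). A swap between neighbouring inverse temperatures is accepted with probability min(1, exp(Δβ ΔE)); the double-well potential and temperature ladder are arbitrary assumptions.

        import numpy as np

        rng = np.random.default_rng(2)

        def energy(x):                       # simple double-well potential
            return (x * x - 1.0) ** 2

        betas = np.array([0.5, 1.0, 2.0, 4.0, 8.0])      # inverse-temperature ladder
        x = rng.normal(size=betas.size)                  # one walker per replica

        for sweep in range(5000):
            # ordinary Metropolis move within each replica
            prop = x + rng.normal(scale=0.5, size=x.size)
            dE = energy(prop) - energy(x)
            accept = rng.random(x.size) < np.exp(-betas * np.clip(dE, 0, None))
            x = np.where(accept, prop, x)

            # attempt a swap between a random pair of neighbouring temperatures
            i = rng.integers(betas.size - 1)
            delta = (betas[i + 1] - betas[i]) * (energy(x[i + 1]) - energy(x[i]))
            if rng.random() < np.exp(min(0.0, delta)):   # accept with prob min(1, exp(dbeta * dE))
                x[i], x[i + 1] = x[i + 1], x[i]

        print("coldest replica sits near one of the wells at x = +/-1:", x[-1])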

  13. Parallel channel flow excursions

    SciTech Connect

    Johnston, B.S.

    1990-01-01

    Among the many known types of vapor-liquid flow instability is the excursion which may occur in heated parallel channels. Under certain conditions, the pressure drop requirement in a heated channel may increase with decreases in flow rate. This leads to an excursive reduction in flow. For channels heated by electricity or nuclear fission, this can result in overheating and damage to the channel. In the design of any parallel channel device, flow excursion limits should be established. After a review of parallel channel behavior and analysis, a conservative criterion will be proposed for avoiding excursions. In support of this criterion, recent experimental work on boiling in downward flow will be described. 5 figs.

  14. Parallel optical sampler

    DOEpatents

    Tauke-Pedretti, Anna; Skogen, Erik J; Vawter, Gregory A

    2014-05-20

    An optical sampler includes first and second 1×n optical beam splitters splitting an input optical sampling signal and an optical analog input signal into n parallel channels, respectively, a plurality of optical delay elements providing n parallel delayed input optical sampling signals, n photodiodes converting the n parallel optical analog input signals into n respective electrical output signals, and n optical modulators modulating the input optical sampling signal or the optical analog input signal by the respective electrical output signals, and providing n successive optical samples of the optical analog input signal. A plurality of output photodiodes and eADCs convert the n successive optical samples to n successive digital samples. The optical modulator may be a photodiode-interconnected Mach-Zehnder Modulator. A method of sampling the optical analog input signal is disclosed.

  15. Parallel programming with Ada

    SciTech Connect

    Kok, J.

    1988-01-01

    To the human programmer, the ease of coding distributed computing is highly dependent on the suitability of the employed programming language. With a particular language, it is also important whether the capabilities of one or more parallel architectures can be addressed efficiently by the available language constructs. This paper discusses the possibilities of the high-level language Ada, and in particular of its tasking concept, as a descriptional tool for the design and implementation of numerical and other algorithms that allow execution of parts in parallel. Language tools are explained and their use for common applications is shown. Conclusions are drawn about the usefulness of several Ada concepts.

  16. The NAS Parallel Benchmarks

    SciTech Connect

    Bailey, David H.

    2009-11-15

    The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, although the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, Leo Dagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was on computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems. As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage over vector supercomputers, and, if so, which of the parallel offerings would be most useful in real-world scientific computation. In part to draw attention to some of the performance reporting abuses prevalent at the time, the present author wrote a humorous essay 'Twelve Ways to Fool the Masses,' which described in a light-hearted way a number of the questionable ways in which both vendor marketing people and scientists were inflating and distorting their performance results. All of this underscored the need for an objective and scientifically defensible measure to compare performance on these systems.

  17. Coarrays for Parallel Processing

    NASA Technical Reports Server (NTRS)

    Snyder, W. Van

    2011-01-01

    The design of the Coarray feature of Fortran 2008 was guided by answering the question "What is the smallest change required to convert Fortran to a robust and efficient parallel language?" Two fundamental issues that any parallel programming model must address are work distribution and data distribution. In order to coordinate work distribution and data distribution, methods for communication and synchronization must be provided. Although originally designed for Fortran, the Coarray paradigm has stimulated development in other languages. X10, Chapel, UPC, Titanium, and class libraries being developed for C++ have the same conceptual framework.

  18. Speeding up parallel processing

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.

    1988-01-01

    In 1967 Amdahl expressed doubts about the ultimate utility of multiprocessors. The formulation, now called Amdahl's law, became part of the computing folklore and has inspired much skepticism about the ability of the current generation of massively parallel processors to efficiently deliver all their computing power to programs. The widely publicized recent results of a group at Sandia National Laboratory, which showed speedup on a 1024 node hypercube of over 500 for three fixed size problems and over 1000 for three scalable problems, have convincingly challenged this bit of folklore and have given new impetus to parallel scientific computing.
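
    For reference, the two speedup models at issue can be compared directly in a short sketch; the 99% parallel fraction and 1024 processors are illustrative assumptions chosen to mirror the hypercube results mentioned above.

        def amdahl_speedup(p, n):
            """Fixed-size speedup: the serial fraction (1 - p) limits the gain."""
            return 1.0 / ((1.0 - p) + p / n)

        def gustafson_speedup(p, n):
            """Scaled speedup: the problem grows with n, so the serial part shrinks in relative terms."""
            return n - (n - 1) * (1.0 - p)

        p, n = 0.99, 1024
        print(amdahl_speedup(p, n))     # ~91   -> the pessimistic fixed-size view
        print(gustafson_speedup(p, n))  # ~1014 -> the scaled view consistent with the >1000 speedups cited above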

  19. Adaptive parallel logic networks

    NASA Technical Reports Server (NTRS)

    Martinez, Tony R.; Vidal, Jacques J.

    1988-01-01

    Adaptive, self-organizing concurrent systems (ASOCS) that combine self-organization with massive parallelism for such applications as adaptive logic devices, robotics, process control, and system malfunction management, are presently discussed. In ASOCS, an adaptive network composed of many simple computing elements operating in combinational and asynchronous fashion is used and problems are specified by presenting if-then rules to the system in the form of Boolean conjunctions. During data processing, which is a different operational phase from adaptation, the network acts as a parallel hardware circuit.

  20. Parallelization of Thermochemical Nanolithography

    NASA Astrophysics Data System (ADS)

    Curtis, Jennifer E.; Carroll, Keith; Lu, Xi; Kim, Suenne; Gao, Yang; Kim, Hoe-Joon; Somnath, Suhas; Polloni, Laura; Sordan, Roman; King, William; Riedo, Elisa

    2014-03-01

    One of the most pressing technological challenges in the development of next generation nanoscale devices is the rapid, parallel, precise and robust fabrication of nanostructures. We demonstrate the possibility to parallelize thermochemical nanolithography (TCNL) by employing five nano-tips for the fabrication of luminescent polymer nanostructures and graphene-based nanoribbons. This work has been supported by the National Science Foundation PHYS 0848797 (J.E.C.), CMMI 1100290 (E.R., W.P.K), MRSEC program DMR 0820382 (E.R., J.E.C.), and the Office of Basic Energy Sciences DOE DE-FG02-06ER46293 (E.R.).

  1. Highly parallel computation

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.; Tichy, Walter F.

    1990-01-01

    Among the highly parallel computing architectures required for advanced scientific computation, those designated 'MIMD' and 'SIMD' have yielded the best results to date. The present evaluation of the development status of such architectures shows that neither has attained a decisive advantage in the treatment of most near-homogeneous problems; for problems involving numerous dissimilar parts, however, currently speculative architectures such as 'neural networks' or 'data flow' machines may be required. Data flow computers are the most practical form of MIMD fine-grained parallel computers yet conceived; they automatically solve the problem of assigning virtual processors to the real processors in the machine.

  2. VLSI and parallel computation

    SciTech Connect

    Suaya, R.; Birtwistle, G.

    1988-01-01

    This volume presents a cross-section of the most current research in parallel computation encompassing theoretical models, VLSI design, routing, and machine implementations. The book comprises a series of invited tutorial chapters on advanced topics in VLSI and concurrency. The chapters have been revised and updated to form a coherent volume exploring issues of fundamental importance in parallel computation, as well as significant research results in the contributor's specialties. Topics include load sharing models, PRAM models of computation, neural networks, Cochlea models, the design of algorithms for explicit concurrency, and VLSI CAD.

  3. Deoxyribo Nanonucleic Acid: Antiparallel, Parallel and Unparalleled

    SciTech Connect

    Egli, M.

    2010-03-05

    The crystal structure of a single-stranded DNA oligonucleotide has revealed formation of a unique three-dimensional array by continuous antiparallel and parallel pairing between monomers. The array is based on tertiary interactions and represents a second-generation nanotechnological system.

  4. Parallel Molecular Dynamics Program for Molecules

    Energy Science and Technology Software Center (ESTSC)

    1995-03-07

    ParBond is a parallel classical molecular dynamics code that models bonded molecular systems, typically of an organic nature. It uses classical force fields for both non-bonded Coulombic and Van der Waals interactions and for 2-, 3-, and 4-body bonded (bond, angle, dihedral, and improper) interactions. It integrates Newton's equation of motion for the molecular system and evaluates various thermodynamical properties of the system as it progresses.

  5. Massively parallel processor computer

    NASA Technical Reports Server (NTRS)

    Fung, L. W. (Inventor)

    1983-01-01

    An apparatus for processing multidimensional data with strong spatial characteristics, such as raw image data, characterized by a large number of parallel data streams in an ordered array is described. It comprises a large number (e.g., 16,384 in a 128 x 128 array) of parallel processing elements operating simultaneously and independently on single bit slices of a corresponding array of incoming data streams under control of a single set of instructions. Each of the processing elements comprises a bidirectional data bus in communication with a register for storing single bit slices together with a random access memory unit and associated circuitry, including a binary counter/shift register device, for performing logical and arithmetical computations on the bit slices, and an I/O unit for interfacing the bidirectional data bus with the data stream source. The massively parallel processor architecture enables very high speed processing of large amounts of ordered parallel data, including spatial translation by shifting or sliding of bits vertically or horizontally to neighboring processing elements.

  6. Parallel fast gauss transform

    SciTech Connect

    Sampath, Rahul S; Sundar, Hari; Veerapaneni, Shravan

    2010-01-01

    We present fast adaptive parallel algorithms to compute the sum of N Gaussians at N points. Direct sequential computation of this sum would take O(N^2) time. The parallel time complexity estimates for our algorithms are O(N/n_p) for uniform point distributions and O((N/n_p) log(N/n_p) + n_p log n_p) for non-uniform distributions using n_p CPUs. We incorporate a plane-wave representation of the Gaussian kernel which permits 'diagonal translation'. We use parallel octrees and a new scheme for translating the plane-waves to efficiently handle non-uniform distributions. Computing the transform to six-digit accuracy at 120 billion points took approximately 140 seconds using 4096 cores on the Jaguar supercomputer. Our implementation is 'kernel-independent' and can handle other 'Gaussian-type' kernels even when an explicit analytic expression for the kernel is not known. These algorithms form a new class of core computational machinery for solving parabolic PDEs on massively parallel architectures.
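
    For orientation, the quantity being computed is the discrete Gauss transform G(y_j) = sum_i f_i exp(-||y_j - x_i||^2 / h^2). A direct evaluation, the O(N^2) baseline that the parallel fast algorithm above accelerates, can be sketched as follows; the bandwidth and point counts are arbitrary assumptions.

        import numpy as np

        def direct_gauss_transform(sources, targets, weights, h):
            """O(N*M) reference evaluation of G(y_j) = sum_i f_i * exp(-||y_j - x_i||^2 / h^2)."""
            # pairwise squared distances between targets (rows) and sources (columns)
            d2 = ((targets[:, None, :] - sources[None, :, :]) ** 2).sum(axis=-1)
            return np.exp(-d2 / (h * h)) @ weights

        rng = np.random.default_rng(3)
        x = rng.random((2000, 3))          # source points
        y = rng.random((1500, 3))          # target points
        f = rng.random(2000)               # source weights
        print(direct_gauss_transform(x, y, f, h=0.2).shape)   # (1500,)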

  7. High performance parallel architectures

    SciTech Connect

    Anderson, R.E.

    1989-09-01

    In this paper the author describes current high performance parallel computer architectures. A taxonomy is presented to show computer architecture from the user programmer's point-of-view. The effects of the taxonomy upon the programming model are described. Some current architectures are described with respect to the taxonomy. Finally, some predictions about future systems are presented. 5 refs., 1 fig.

  8. Parallel hierarchical global illumination

    SciTech Connect

    Snell, Q.O.

    1997-10-08

    Solving the global illumination problem is equivalent to determining the intensity of every wavelength of light in all directions at every point in a given scene. The complexity of the problem has led researchers to use approximation methods for solving the problem on serial computers. Rather than using an approximation method, such as backward ray tracing or radiosity, the authors have chosen to solve the Rendering Equation by direct simulation of light transport from the light sources. This paper presents an algorithm that solves the Rendering Equation to any desired accuracy, and can be run in parallel on distributed memory or shared memory computer systems with excellent scaling properties. It appears superior in both speed and physical correctness to recent published methods involving bidirectional ray tracing or hybrid treatments of diffuse and specular surfaces. Like progressive radiosity methods, it dynamically refines the geometry decomposition where required, but does so without the excessive storage requirements for ray histories. The algorithm, called Photon, produces a scene which converges to the global illumination solution. This amounts to a huge task for a 1997-vintage serial computer, but using the power of a parallel supercomputer significantly reduces the time required to generate a solution. Currently, Photon can be run on most parallel environments from a shared memory multiprocessor to a parallel supercomputer, as well as on clusters of heterogeneous workstations.

  9. Parallel hierarchical radiosity rendering

    SciTech Connect

    Carter, M.

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  10. Parallel programming with PCN

    SciTech Connect

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  11. Parallel Total Energy

    Energy Science and Technology Software Center (ESTSC)

    2004-10-21

    This is a total energy electronic structure code using the Local Density Approximation (LDA) of density functional theory. It uses plane waves as the wave function basis set. It can use both norm-conserving pseudopotentials and ultrasoft pseudopotentials. It can relax the atomic positions according to the total energy. It is a parallel code using MPI.

  12. Parallel Multigrid Equation Solver

    Energy Science and Technology Software Center (ESTSC)

    2001-09-07

    Prometheus is a fully parallel multigrid equation solver for matrices that arise in unstructured grid finite element applications. It includes a geometric and an algebraic multigrid method and has solved problems of up to 76 million degrees of freedom, problems in linear elasticity on the ASCI Blue Pacific and ASCI Red machines.

  13. Massively parallel signature sequencing.

    PubMed

    Zhou, Daixing; Rao, Mahendra S; Walker, Roger; Khrebtukova, Irina; Haudenschild, Christian D; Miura, Takumi; Decola, Shannon; Vermaas, Eric; Moon, Keith; Vasicek, Thomas J

    2006-01-01

    Massively parallel signature sequencing is an ultra-high throughput sequencing technology. It can simultaneously sequence millions of sequence tags, and, therefore, is ideal for whole genome analysis. When applied to expression profiling, it reveals almost every transcript in the sample and provides its accurate expression level. This chapter describes the technology and its application in establishing stem cell transcriptome databases. PMID:16881523

  14. Parallel simulated annealing algorithms for cell placement on hypercube multiprocessors

    NASA Technical Reports Server (NTRS)

    Banerjee, Prithviraj; Jones, Mark Howard; Sargent, Jeff S.

    1990-01-01

    Two parallel algorithms for standard cell placement using simulated annealing are developed to run on distributed-memory message-passing hypercube multiprocessors. The cells can be mapped in a two-dimensional area of a chip onto processors in an n-dimensional hypercube in two ways, such that both small and large cell exchange and displacement moves can be applied. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support the parallel cost evaluation. A novel tree broadcasting strategy is used extensively for updating cell locations in the parallel environment. A dynamic parallel annealing schedule estimates the errors due to interacting parallel moves and adapts the rate of synchronization automatically. Two novel approaches in controlling error in parallel algorithms are described: heuristic cell coloring and adaptive sequence control.
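
    A highly simplified sketch of the serial kernel being parallelized here: cells are exchanged between locations and each move is accepted with the Metropolis criterion. In the algorithm described above these moves are distributed over the hypercube nodes with error control, which this sketch does not attempt; the toy half-perimeter wire-length cost and cooling schedule are assumptions.

        import math
        import random

        random.seed(0)
        n_cells, grid = 20, 10
        pos = {c: (random.randrange(grid), random.randrange(grid)) for c in range(n_cells)}
        nets = [random.sample(range(n_cells), 3) for _ in range(30)]     # toy 3-pin nets

        def wirelength():
            """Half-perimeter wire length summed over all nets (a common placement cost)."""
            total = 0
            for net in nets:
                xs = [pos[c][0] for c in net]; ys = [pos[c][1] for c in net]
                total += (max(xs) - min(xs)) + (max(ys) - min(ys))
            return total

        T, cost = 10.0, wirelength()
        while T > 0.05:
            for _ in range(200):
                a, b = random.sample(range(n_cells), 2)
                pos[a], pos[b] = pos[b], pos[a]                  # propose a cell exchange
                new = wirelength()
                if new <= cost or random.random() < math.exp((cost - new) / T):
                    cost = new                                   # accept the move
                else:
                    pos[a], pos[b] = pos[b], pos[a]              # reject: undo the swap
            T *= 0.9                                             # cooling schedule
        print("final wire length:", cost)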

  15. Programming parallel architectures - The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1989-01-01

    This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.

  16. A massively asynchronous, parallel brain

    PubMed Central

    Zeki, Semir

    2015-01-01

    Whether the visual brain uses a parallel or a serial, hierarchical, strategy to process visual signals, the end result appears to be that different attributes of the visual scene are perceived asynchronously—with colour leading form (orientation) by 40 ms and direction of motion by about 80 ms. Whatever the neural root of this asynchrony, it creates a problem that has not been properly addressed, namely how visual attributes that are perceived asynchronously over brief time windows after stimulus onset are bound together in the longer term to give us a unified experience of the visual world, in which all attributes are apparently seen in perfect registration. In this review, I suggest that there is no central neural clock in the (visual) brain that synchronizes the activity of different processing systems. More likely, activity in each of the parallel processing-perceptual systems of the visual brain is reset independently, making of the brain a massively asynchronous organ, just like the new generation of more efficient computers promise to be. Given the asynchronous operations of the brain, it is likely that the results of activities in the different processing-perceptual systems are not bound by physiological interactions between cells in the specialized visual areas, but post-perceptually, outside the visual brain. PMID:25823871

  17. Self-testing in parallel

    NASA Astrophysics Data System (ADS)

    McKague, Matthew

    2016-04-01

    Self-testing allows us to determine, through classical interaction only, whether some players in a non-local game share particular quantum states. Most work on self-testing has concentrated on developing tests for small states like one pair of maximally entangled qubits, or on tests where there is a separate player for each qubit, as in a graph state. Here we consider the case of testing many maximally entangled pairs of qubits shared between two players. Previously such a test was shown where testing is sequential, i.e., one pair is tested at a time. Here we consider the parallel case where all pairs are tested simultaneously, giving considerably more power to dishonest players. We derive sufficient conditions for a self-test for many maximally entangled pairs of qubits shared between two players and also two constructions for self-tests where all pairs are tested simultaneously.

  18. Parallel grid population

    DOEpatents

    Wald, Ingo; Ize, Santiago

    2015-07-28

    Parallel population of a grid with a plurality of objects using a plurality of processors. One example embodiment is a method for parallel population of a grid with a plurality of objects using a plurality of processors. The method includes a first act of dividing a grid into n distinct grid portions, where n is the number of processors available for populating the grid. The method also includes acts of dividing a plurality of objects into n distinct sets of objects, assigning a distinct set of objects to each processor such that each processor determines by which distinct grid portion(s) each object in its distinct set of objects is at least partially bounded, and assigning a distinct grid portion to each processor such that each processor populates its distinct grid portion with any objects that were previously determined to be at least partially bounded by its distinct grid portion.
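
    A compact sketch of the two-phase scheme the claim describes, assuming one-dimensional intervals as the objects and threads standing in for the n processors; the helper names and grid layout are illustrative assumptions, not taken from the patent.

        from concurrent.futures import ThreadPoolExecutor

        N_PROC = 4
        GRID_CELLS = 16                    # 1-D grid of unit cells, split into N_PROC contiguous portions
        objects = [(i * 0.9, i * 0.9 + 2.5) for i in range(40)]     # intervals (lo, hi)

        def portion_of(cell):
            return cell * N_PROC // GRID_CELLS

        def phase1(obj_subset):
            """Each processor decides which grid portions bound each object in its distinct set."""
            hits = []
            for lo, hi in obj_subset:
                portions = {portion_of(c)
                            for c in range(max(0, int(lo)), min(GRID_CELLS, int(hi) + 1))}
                hits.extend((p, (lo, hi)) for p in portions)
            return hits

        def phase2(portion, pairs):
            """Each processor populates only its own distinct grid portion."""
            return [obj for p, obj in pairs if p == portion]

        with ThreadPoolExecutor(N_PROC) as pool:
            subsets = [objects[i::N_PROC] for i in range(N_PROC)]            # distinct object sets
            pairs = [hit for hits in pool.map(phase1, subsets) for hit in hits]
            populated = list(pool.map(phase2, range(N_PROC), [pairs] * N_PROC))

        print([len(p) for p in populated])    # number of objects assigned to each grid portion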

  19. Parallel Anisotropic Tetrahedral Adaptation

    NASA Technical Reports Server (NTRS)

    Park, Michael A.; Darmofal, David L.

    2008-01-01

    An adaptive method that robustly produces high aspect ratio tetrahedra to a general 3D metric specification without introducing hybrid semi-structured regions is presented. The elemental operators and higher-level logic are described with their respective domain-decomposed parallelizations. An anisotropic tetrahedral grid adaptation scheme is demonstrated for 1000:1 stretching for a simple cube geometry. This form of adaptation is applicable to more complex domain boundaries via a cut-cell approach as demonstrated by a parallel 3D supersonic simulation of a complex fighter aircraft. To avoid the assumptions and approximations required to form a metric to specify adaptation, an approach is introduced that directly evaluates interpolation error. The grid is adapted to reduce and equidistribute this interpolation error calculation without the use of an intervening anisotropic metric. Direct interpolation error adaptation is illustrated for 1D and 3D domains.

  20. Ultrascalable petaflop parallel supercomputer

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Chiu, George; Cipolla, Thomas M.; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Hall, Shawn; Haring, Rudolf A.; Heidelberger, Philip; Kopcsay, Gerard V.; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan; Takken, Todd

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  1. Parallel Subconvolution Filtering Architectures

    NASA Technical Reports Server (NTRS)

    Gray, Andrew A.

    2003-01-01

    These architectures are based on methods of vector processing and the discrete-Fourier-transform/inverse-discrete-Fourier-transform (DFT-IDFT) overlap-and-save method, combined with time-block separation of digital filters into frequency-domain subfilters implemented by use of sub-convolutions. The parallel-processing method implemented in these architectures enables the use of relatively small DFT-IDFT pairs, while filter tap lengths are theoretically unlimited. The size of a DFT-IDFT pair is determined by the desired reduction in processing rate, rather than by the order of the filter that one seeks to implement. The emphasis in this report is on those aspects of the underlying theory and design rules that promote computational efficiency, parallel processing at reduced data rates, and simplification of the designs of very-large-scale integrated (VLSI) circuits needed to implement high-order filters and correlators.
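
    A small single-channel sketch of the DFT-IDFT overlap-and-save step on which these architectures are built, written with NumPy for clarity; the block length is an arbitrary assumption, and the parallel subfilter decomposition in the report is not reproduced here.

        import numpy as np

        def overlap_save(x, h, nfft=256):
            """Block FIR filtering by the DFT/IDFT overlap-and-save method."""
            m = len(h)
            hop = nfft - m + 1                              # new input samples consumed per block
            H = np.fft.rfft(h, nfft)
            xp = np.concatenate([np.zeros(m - 1), x])       # prepend history for the first block
            out = []
            for start in range(0, len(x), hop):
                block = xp[start:start + nfft]
                block = np.pad(block, (0, nfft - len(block)))
                y = np.fft.irfft(np.fft.rfft(block, nfft) * H, nfft)
                out.append(y[m - 1:m - 1 + hop])            # discard the circularly wrapped samples
            return np.concatenate(out)[:len(x)]

        rng = np.random.default_rng(4)
        x = rng.standard_normal(5000)
        h = rng.standard_normal(63)                          # FIR filter taps
        ref = np.convolve(x, h)[:len(x)]                     # direct linear convolution, same alignment
        print(np.allclose(overlap_save(x, h), ref))          # True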

  2. Parallel multilevel preconditioners

    SciTech Connect

    Bramble, J.H.; Pasciak, J.E.; Xu, Jinchao.

    1989-01-01

    In this paper, we shall report on some techniques for the development of preconditioners for the discrete systems which arise in the approximation of solutions to elliptic boundary value problems. Here we shall only state the resulting theorems. It has been demonstrated that preconditioned iteration techniques often lead to the most computationally effective algorithms for the solution of the large algebraic systems corresponding to boundary value problems in two and three dimensional Euclidean space. The use of preconditioned iteration will become even more important on computers with parallel architecture. This paper discusses an approach for developing completely parallel multilevel preconditioners. In order to illustrate the resulting algorithms, we shall describe the simplest application of the technique to a model elliptic problem.

  3. PCLIPS: Parallel CLIPS

    NASA Technical Reports Server (NTRS)

    Gryphon, Coranth D.; Miller, Mark D.

    1991-01-01

    PCLIPS (Parallel CLIPS) is a set of extensions to the C Language Integrated Production System (CLIPS) expert system language. PCLIPS is intended to provide an environment for the development of more complex, extensive expert systems. Multiple CLIPS expert systems are now capable of running simultaneously on separate processors, or separate machines, thus dramatically increasing the scope of solvable tasks within the expert systems. As a tool for parallel processing, PCLIPS allows for an expert system to add to its fact-base information generated by other expert systems, thus allowing systems to assist each other in solving a complex problem. This allows individual expert systems to be more compact and efficient, and thus run faster or on smaller machines.

  4. Aerodynamic, aeroacoustic, and aeroelastic investigations of airfoil-vortex interaction using large-eddy simulation

    NASA Astrophysics Data System (ADS)

    Ilie, Marcel

    In helicopters, vortices (generated at the tip of the rotor blades) interact with the next advancing blades during certain flight and manoeuvring conditions, generating undesirable levels of acoustic noise and vibration. These Blade-Vortex Interactions (BVIs), which may cause the most disturbing acoustic noise, normally occur in descent or high-speed forward flight. Acoustic noise characterization (and potential reduction) is one of the areas generating intensive research interest in the rotorcraft industry. Since experimental investigations of BVI are extremely costly, some insights into the BVI or AVI (2-D Airfoil-Vortex Interaction) can be gained using Computational Fluid Dynamics (CFD) numerical simulations. Numerical simulation of BVI or AVI has been of interest to CFD for many years. There are still difficulties concerning an accurate numerical prediction of BVI. One of the main issues is the inherent dissipation of CFD turbulence models, which severely affects the preservation of the vortex characteristics. Moreover, this is an issue not only for aerodynamic and aeroacoustic analysis but also for aeroelastic investigations, especially when the strong (two-way) aeroelastic coupling is of interest. The present investigation concentrates mainly on AVI simulations. The simulations are performed for Mach number, Ma = 0.3, resulting in a Reynolds number, Re = 1.3 x 10^6, which is based on the chord, c, of the airfoil (NACA0012). An extensive literature search has indicated that the present work represents the first comprehensive investigation of AVI using the LES numerical approach, in the rotorcraft research community. The major factor affecting the aerodynamic coefficients and aeroacoustic field as a result of airfoil-vortex interaction is observed to be the unsteady pressure generated at the location of the interaction. The present numerical results show that the aerodynamic coefficients (lift, moment, and drag) and aeroacoustic field are strongly dependent on the airfoil-vortex vertical miss-distance, airfoil angle of attack, vortex characteristics, and aeroelastic response of the airfoil to airfoil-vortex interaction. A decay of airfoil-vortex interactions with the increase of vertical miss-distance and angle of attack was observed. Also, a decay of airfoil-vortex interactions is observed for the case of a flexible structure when compared with the case of a rigid structure. The decay of vortex core size produces a decrease in the aerodynamic coefficients.

  5. Parallel sphere rendering

    SciTech Connect

    Krogh, M.; Painter, J.; Hansen, C.

    1996-10-01

    Sphere rendering is an important method for visualizing molecular dynamics data. This paper presents a parallel algorithm that is almost 90 times faster than current graphics workstations. To render extremely large data sets and large images, the algorithm uses the MIMD features of the supercomputers to divide up the data, render independent partial images, and then finally composite the multiple partial images using an optimal method. The algorithm and performance results are presented for the CM-5 and the M.

  6. ASSEMBLY OF PARALLEL PLATES

    DOEpatents

    Groh, E.F.; Lennox, D.H.

    1963-04-23

    This invention is concerned with a rigid assembly of parallel plates in which keyways are stamped out along the edges of the plates and a self-retaining key is inserted into aligned keyways. Spacers having similar keyways are included between adjacent plates. The entire assembly is locked into a rigid structure by fastening only the outermost plates to the ends of the keys. (AEC)

  7. Xyce parallel electronic simulator.

    SciTech Connect

    Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.

    2010-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is (to the extent possible) exhaustively list device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.

  8. Trajectory optimization using parallel shooting method on parallel computer

    SciTech Connect

    Wirthman, D.J.; Park, S.Y.; Vadali, S.R.

    1995-03-01

    The efficiency of a parallel shooting method on a parallel computer for solving a variety of optimal control guidance problems is studied. Several examples are considered to demonstrate that a speedup of nearly 7 to 1 is achieved with the use of 16 processors. It is suggested that further improvements in performance can be achieved by parallelizing in the state domain. 10 refs.

  9. New Computational Methods for the Prediction and Analysis of Helicopter Noise

    NASA Technical Reports Server (NTRS)

    Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak

    1996-01-01

    This paper describes several new methods to predict and analyze rotorcraft noise. These methods are: 1) a combined computational fluid dynamics and Kirchhoff scheme for far-field noise predictions, 2) parallel computer implementation of the Kirchhoff integrations, 3) audio and visual rendering of the computed acoustic predictions over large far-field regions, and 4) acoustic tracebacks to the Kirchhoff surface to pinpoint the sources of the rotor noise. The paper describes each method and presents sample results for three test cases. The first case consists of in-plane high-speed impulsive noise and the other two cases show idealized parallel and oblique blade-vortex interactions. The computed results show good agreement with available experimental data but convey much more information about the far-field noise propagation. When taken together, these new analysis methods exploit the power of new computer technologies and offer the potential to significantly improve our prediction and understanding of rotorcraft noise.

  10. The Galley Parallel File System

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. The interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. We discuss Galley's file structure and application interface, as well as an application that has been implemented using that interface.

  11. Resistor Combinations for Parallel Circuits.

    ERIC Educational Resources Information Center

    McTernan, James P.

    1978-01-01

    To help simplify both teaching and learning of parallel circuits, a high school electricity/electronics teacher presents and illustrates the use of tables of values for parallel resistive circuits in which total resistances are whole numbers. (MF)
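
    A short sketch generating the kind of table the article describes: pairs of whole-number resistors whose parallel combination is itself a whole number, using 1/R_total = 1/R_1 + 1/R_2. The search range is an arbitrary assumption.

        from fractions import Fraction

        def parallel(*rs):
            """Total resistance of resistors in parallel: 1/R_t = sum(1/R_i)."""
            return 1 / sum(Fraction(1, r) for r in rs)

        # pairs (R1 <= R2 <= 60 ohms) whose parallel combination is a whole number
        table = [(r1, r2, int(parallel(r1, r2)))
                 for r1 in range(1, 61)
                 for r2 in range(r1, 61)
                 if parallel(r1, r2).denominator == 1]

        print(table[:5])     # e.g. (2, 2, 1), (3, 6, 2), (4, 4, 2), (4, 12, 3), ...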

  12. Parallel software support for computational structural mechanics

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.

    1987-01-01

    The application of the parallel programming methodology known as the Force was conducted. Two application issues were addressed. The first involves the efficiency of the implementation and its completeness in terms of satisfying the needs of other researchers implementing parallel algorithms. Support for, and interaction with, other Computational Structural Mechanics (CSM) researchers using the Force was the main issue, but some independent investigation of the Barrier construct, which is extremely important to overall performance, was also undertaken. Another efficiency issue which was addressed was that of relaxing the strong synchronization condition imposed on the self-scheduled parallel DO loop. The Force was extended by the addition of logical conditions to the cases of a parallel case construct and by the inclusion of a self-scheduled version of this construct. The second issue involved applying the Force to the parallelization of finite element codes such as those found in the NICE/SPAR testbed system. One of the more difficult problems encountered is the determination of what information in COMMON blocks is actually used outside of a subroutine and when a subroutine uses a COMMON block merely as scratch storage for internal temporary results.

  13. Highly parallel computation

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.; Tichy, Walter F.

    1990-01-01

    Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines, and current research focuses on which architectures, designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD), have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.

  14. Parallel sphere rendering

    SciTech Connect

    Krogh, M.; Hansen, C.; Painter, J.; de Verdiere, G.C.

    1995-05-01

    Sphere rendering is an important method for visualizing molecular dynamics data. This paper presents a parallel divide-and-conquer algorithm that is almost 90 times faster than current graphics workstations. To render extremely large data sets and large images, the algorithm uses the MIMD features of the supercomputers to divide up the data, render independent partial images, and then finally composite the multiple partial images using an optimal method. The algorithm and performance results are presented for the CM-5 and the T3D.

  15. Parallel Eclipse Project Checkout

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas M.; Joswig, Joseph C.; Shams, Khawaja S.; Powell, Mark W.; Bachmann, Andrew G.

    2011-01-01

    Parallel Eclipse Project Checkout (PEPC) is a program written to leverage parallelism and to automate the checkout process of plug-ins created in Eclipse RCP (Rich Client Platform). Eclipse plug-ins can be aggregated in a feature project. This innovation digests a feature description (xml file) and automatically checks out all of the plug-ins listed in the feature. This resolves the issue of manually checking out each plug-in required to work on the project. To minimize the amount of time necessary to check out the plug-ins, this program makes the plug-in checkouts parallel. After parsing the feature, a request to check out each plug-in in the feature is inserted. These requests are handled by a thread pool with a configurable number of threads. By checking out the plug-ins in parallel, the checkout process is streamlined before getting started on the project. For instance, projects that took 30 minutes to check out now take less than 5 minutes. The effect is especially clear on a Mac, which has a network monitor displaying the bandwidth use. When running the client from a developer's home, the checkout process now saturates the bandwidth in order to get all the plug-ins checked out as fast as possible. For comparison, a checkout process that ranged from 8-200 Kbps from a developer's home is now able to saturate a pipe of 1.3 Mbps, resulting in significantly faster checkouts. Eclipse IDE (integrated development environment) tries to build a project as soon as it is downloaded. As part of another optimization, this innovation programmatically tells Eclipse to stop building while checkouts are happening, which dramatically reduces lock contention and enables plug-ins to continue downloading until all of them finish. Furthermore, the software re-enables automatic building, and forces Eclipse to do a clean build once it finishes checking out all of the plug-ins. This software is fully generic and does not contain any NASA-specific code. It can be applied to any Eclipse-based repository with a similar structure. It also can apply build parameters and preferences automatically at the end of the checkout.
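
    The core pattern is a bounded thread pool issuing one checkout per plug-in; a generic sketch is shown below. The feature-file layout, the version-control command, and the worker count are placeholders for illustration, not PEPC's actual implementation.

        import subprocess
        import xml.etree.ElementTree as ET
        from concurrent.futures import ThreadPoolExecutor, as_completed

        def plugins_in_feature(feature_xml):
            """Collect plug-in ids listed in a feature description (layout assumed for illustration)."""
            root = ET.parse(feature_xml).getroot()
            return [p.get("id") for p in root.iter("plugin")]

        def checkout(plugin_id, repo_url):
            # placeholder version-control command; replace with the real checkout invocation
            cmd = ["svn", "checkout", f"{repo_url}/{plugin_id}", plugin_id]
            return plugin_id, subprocess.run(cmd, capture_output=True).returncode

        def parallel_checkout(feature_xml, repo_url, workers=8):
            plugins = plugins_in_feature(feature_xml)
            with ThreadPoolExecutor(max_workers=workers) as pool:      # configurable thread pool
                futures = [pool.submit(checkout, p, repo_url) for p in plugins]
                for fut in as_completed(futures):
                    plugin_id, rc = fut.result()
                    print(f"{plugin_id}: {'ok' if rc == 0 else 'failed'}")

        # parallel_checkout("feature.xml", "https://example.org/repo", workers=8)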

  16. Adaptive multigrid in parallel

    SciTech Connect

    Stals, L.

    1995-12-01

    Early experiments with parallel multigrid used square domains and uniform grids. Recently, several authors have considered problems with more complicated domains. However these methods still use structured grids. The aim of this paper is to show that it is possible to write efficient multigrid programs using MIMD architectures. By allowing unstructured grids we can solve problems on any polygonal region and use adaptive refinement methods. The program is written in a mixture of C++ and PVM. It is designed to solve elliptic partial differential equations using the finite element method. We use newest node bisection to refine the grid and the Kernighan-Lin method to rebalance the load.

  17. Fastpath Speculative Parallelization

    NASA Astrophysics Data System (ADS)

    Spear, Michael F.; Kelsey, Kirk; Bai, Tongxin; Dalessandro, Luke; Scott, Michael L.; Ding, Chen; Wu, Peng

    We describe Fastpath, a system for speculative parallelization of sequential programs on conventional multicore processors. Our system distinguishes between the lead thread, which executes at almost-native speed, and speculative threads, which execute somewhat slower. This allows us to achieve nontrivial speedup, even on two-core machines. We present a mathematical model of potential speedup, parameterized by application characteristics and implementation constants. We also present preliminary results gleaned from two different Fastpath implementations, each derived from an implementation of software transactional memory.

  18. Parallel Pascal - An extended Pascal for parallel computers

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.

    1984-01-01

    Parallel Pascal is an extended version of the conventional serial Pascal programming language which includes a convenient syntax for specifying array operations. It is upward compatible with standard Pascal and involves only a small number of carefully chosen new features. Parallel Pascal was developed to reduce the semantic gap between standard Pascal and a large range of highly parallel computers. Two important design goals of Parallel Pascal were efficiency and portability. Portability is particularly difficult to achieve since different parallel computers frequently have very different capabilities.

  19. Roo: A parallel theorem prover

    SciTech Connect

    Lusk, E.L.; McCune, W.W.; Slaney, J.K.

    1991-11-01

    We describe a parallel theorem prover based on the Argonne theorem-proving system OTTER. The parallel system, called Roo, runs on shared-memory multiprocessors such as the Sequent Symmetry. We explain the parallel algorithm used and give performance results that demonstrate near-linear speedups on large problems.

  20. CSM parallel structural methods research

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.

    1989-01-01

    Parallel structural methods, research team activities, advanced architecture computers for parallel computational structural mechanics (CSM) research, the FLEX/32 multicomputer, a parallel structural analyses testbed, blade-stiffened aluminum panel with a circular cutout and the dynamic characteristics of a 60 meter, 54-bay, 3-longeron deployable truss beam are among the topics discussed.

  1. Programming models for parallel systems

    SciTech Connect

    Williams, S.A.

    1990-01-01

    This book focuses on parallel processing systems and the programming models that are necessary to accomplish this task. The book covers the categories of parallel programming models including sequential, array, pipeline, and shared memory processing, message passing, and functional, logic, and object-oriented programming. It examines transformation techniques. A final chapter summarizes the previous discussions and explores the future potential of parallel processing.

  2. Parallelized direct execution simulation of message-passing parallel programs

    NASA Technical Reports Server (NTRS)

    Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

    1994-01-01

    As massively parallel computers proliferate, there is growing interest in finding ways by which the performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution in which one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine, such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.

  3. Tolerant (parallel) Programming

    NASA Technical Reports Server (NTRS)

    DiNucci, David C.; Bailey, David H. (Technical Monitor)

    1997-01-01

    In order to be truly portable, a program must be tolerant of a wide range of development and execution environments, and a parallel program is just one which must be tolerant of a very wide range. This paper first defines the term "tolerant programming", then describes many layers of tools to accomplish it. The primary focus is on F-Nets, a formal model for expressing computation as a folded partial-ordering of operations, thereby providing an architecture-independent expression of tolerant parallel algorithms. For implementing F-Nets, Cooperative Data Sharing (CDS) is a subroutine package for implementing communication efficiently in a large number of environments (e.g. shared memory and message passing). Software Cabling (SC), a very-high-level graphical programming language for building large F-Nets, possesses many of the features normally expected from today's computer languages (e.g. data abstraction, array operations). Finally, L2³ is a CASE tool which facilitates the construction, compilation, execution, and debugging of SC programs.

  4. Parallel ptychographic reconstruction

    PubMed Central

    Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris

    2014-01-01

    Ptychography is an imaging method whereby a coherent beam is scanned across an object, and an image is obtained by iterative phasing of the set of diffraction patterns. It can be used to image extended objects at a resolution limited by the scattering strength of the object and the detector geometry, rather than by an optics-imposed limit. As technical advances allow larger fields to be imaged, computational challenges arise for reconstructing the correspondingly larger data volumes, yet at the same time there is also a need to deliver reconstructed images immediately so that one can evaluate the next steps to take in an experiment. Here we present a parallel method for real-time ptychographic phase retrieval. It uses a hybrid parallel strategy to divide the computation between multiple graphics processing units (GPUs) and then employs novel techniques to merge sub-datasets into a single complex phase and amplitude image. Results are shown on a simulated specimen and a real dataset from an X-ray experiment conducted at a synchrotron light source. PMID:25607174

  5. Benchmarking massively parallel architectures

    SciTech Connect

    Lubeck, O.; Moore, J.; Simmons, M.; Wasserman, H.

    1993-07-01

    The purpose of this paper is to summarize some initial experiences related to measuring the performance of massively parallel processors (MPPs) at Los Alamos National Laboratory (LANL). Actually, the range of MPP architectures the authors have used is rather limited, being confined mostly to the Thinking Machines Corporation (TMC) Connection Machine CM-2 and CM-5. Some very preliminary work has been carried out on the Kendall Square KSR-1, and efforts related to other machines, such as the Intel Paragon and the soon-to-be-released CRAY T3D, are planned. This paper will concentrate more on methodology rather than discuss specific architectural strengths and weaknesses; the latter is expected to be the subject of future reports. MPP benchmarking is a field in critical need of structure and definition. As the authors have stated previously, such machines have enormous potential, and there is certainly a dire need for orders of magnitude more computational power than current supercomputers provide. However, performance reports for MPPs must emphasize actual sustainable performance from real applications in a careful, responsible manner. Such has not always been the case. A recent paper has described in some detail the problem of potentially misleading performance reporting in the parallel scientific computing field. Thus, in this paper, the authors briefly offer a few general ideas on MPP performance analysis.

  6. Benchmarking massively parallel architectures

    SciTech Connect

    Lubeck, O.; Moore, J.; Simmons, M.; Wasserman, H.

    1993-01-01

    The purpose of this paper is to summarize some initial experiences related to measuring the performance of massively parallel processors (MPPs) at Los Alamos National Laboratory (LANL). Actually, the range of MPP architectures the authors have used is rather limited, being confined mostly to the Thinking Machines Corporation (TMC) Connection Machine CM-2 and CM-5. Some very preliminary work has been carried out on the Kendall Square KSR-1, and efforts related to other machines, such as the Intel Paragon and the soon-to-be-released CRAY T3D, are planned. This paper will concentrate more on methodology rather than discuss specific architectural strengths and weaknesses; the latter is expected to be the subject of future reports. MPP benchmarking is a field in critical need of structure and definition. As the authors have stated previously, such machines have enormous potential, and there is certainly a dire need for orders of magnitude more computational power than current supercomputers provide. However, performance reports for MPPs must emphasize actual sustainable performance from real applications in a careful, responsible manner. Such has not always been the case. A recent paper has described in some detail the problem of potentially misleading performance reporting in the parallel scientific computing field. Thus, in this paper, the authors briefly offer a few general ideas on MPP performance analysis.

  7. A parallel world in the dark

    SciTech Connect

    Higaki, Tetsutaro; Jeong, Kwang Sik; Takahashi, Fuminobu E-mail: ksjeong@tuhep.phys.tohoku.ac.jp

    2013-08-01

    The baryon-dark matter coincidence is a long-standing issue. Interestingly, the recent observations suggest the presence of dark radiation, which, if confirmed, would pose another coincidence problem of why the density of dark radiation is comparable to that of photons. These striking coincidences may be traced back to the dark sector with particle contents and interactions that are quite similar, if not identical, to the standard model: a dark parallel world. It naturally solves the coincidence problems of dark matter and dark radiation, and predicts a sterile neutrino(s) with mass of O(0.1−1) eV, as well as self-interacting dark matter made of the counterpart of ordinary baryons. We find a robust prediction for the relation between the abundance of dark radiation and the sterile neutrino, which can serve as the smoking-gun evidence of the dark parallel world.

  8. Time sharing massively parallel machines. Draft

    SciTech Connect

    Gorda, B.; Wolski, R.

    1995-03-01

    As part of the Massively Parallel Computing Initiative (MPCI) at the Lawrence Livermore National Laboratory, the authors have developed a simple, effective and portable time sharing mechanism by scheduling gangs of processes on tightly coupled parallel machines. By time-sharing the resources, the system interleaves production and interactive jobs. Immediate priority is given to interactive use, maintaining good response time. Production jobs are scheduled during idle periods, making use of the otherwise unused resources. In this paper the authors discuss their experience with gang scheduling over the 3 year life-time of the project. In section 2, they motivate the project and discuss some of its details. Section 3.0 describes the general scheduling problem and how gang scheduling addresses it. In section 4.0, they describe the implementation. Section 8.0 presents results culled over the lifetime of the project. They conclude this paper with some observations and possible future directions.

  9. A parallel world in the dark

    NASA Astrophysics Data System (ADS)

    Higaki, Tetsutaro; Jeong, Kwang Sik; Takahashi, Fuminobu

    2013-08-01

    The baryon-dark matter coincidence is a long-standing issue. Interestingly, the recent observations suggest the presence of dark radiation, which, if confirmed, would pose another coincidence problem of why the density of dark radiation is comparable to that of photons. These striking coincidences may be traced back to the dark sector with particle contents and interactions that are quite similar, if not identical, to the standard model: a dark parallel world. It naturally solves the coincidence problems of dark matter and dark radiation, and predicts a sterile neutrino(s) with mass of O(0.1-1) eV, as well as self-interacting dark matter made of the counterpart of ordinary baryons. We find a robust prediction for the relation between the abundance of dark radiation and the sterile neutrino, which can serve as the smoking-gun evidence of the dark parallel world.

  10. A parallel, portable and versatile treecode

    SciTech Connect

    Warren, M.S.; Salmon, J.K. |

    1994-10-01

    Portability and versatility are important characteristics of a computer program which is meant to be generally useful. We describe how we have developed a parallel N-body treecode to meet these goals. A variety of applications to which the code can be applied are mentioned. Performance of the program is also measured on several machines. A 512 processor Intel Paragon can solve for the forces on 10 million gravitationally interacting particles to 0.5% rms accuracy in 28.6 seconds.

  11. A systolic array parallelizing compiler

    SciTech Connect

    Tseng, P.S. )

    1990-01-01

    This book presents a completely new approach to the problem of building a systolic array parallelizing compiler. It describes the AL parallelizing compiler for the Warp systolic array, the first working systolic array parallelizing compiler that can generate efficient parallel code for complete LINPACK routines. This book begins by analyzing the architectural strength of the Warp systolic array. It proposes a model for mapping programs onto the machine and introduces the notion of data relations for optimizing the program mapping. Also presented are successful applications of the AL compiler in matrix computation and image processing. A complete listing of the source program and the compiler-generated parallel code is given to clarify the overall picture of the compiler. The book concludes that a systolic array parallelizing compiler can produce efficient parallel code, almost identical to what the user would have written by hand.

  12. Parallel Computing in SCALE

    SciTech Connect

    DeHart, Mark D; Williams, Mark L; Bowman, Stephen M

    2010-01-01

    The SCALE computational architecture has remained basically the same since its inception 30 years ago, although constituent modules and capabilities have changed significantly. This SCALE concept was intended to provide a framework whereby independent codes can be linked to provide a more comprehensive capability than possible with the individual programs - allowing flexibility to address a wide variety of applications. However, the current system was designed originally for mainframe computers with a single CPU and with significantly less memory than today's personal computers. It has been recognized that the present SCALE computation system could be restructured to take advantage of modern hardware and software capabilities, while retaining many of the modular features of the present system. Preliminary work is being done to define specifications and capabilities for a more advanced computational architecture. This paper describes the state of current SCALE development activities and plans for future development. With the release of SCALE 6.1 in 2010, a new phase of evolutionary development will be available to SCALE users within the TRITON and NEWT modules. The SCALE (Standardized Computer Analyses for Licensing Evaluation) code system developed by Oak Ridge National Laboratory (ORNL) provides a comprehensive and integrated package of codes and nuclear data for a wide range of applications in criticality safety, reactor physics, shielding, isotopic depletion and decay, and sensitivity/uncertainty (S/U) analysis. Over the last three years, since the release of version 5.1 in 2006, several important new codes have been introduced within SCALE, and significant advances applied to existing codes. Many of these new features became available with the release of SCALE 6.0 in early 2009. However, beginning with SCALE 6.1, a first generation of parallel computing is being introduced. In addition to near-term improvements, a plan for longer term SCALE enhancement activities has been developed to provide an integrated framework for future methods development. Some of the major components of the SCALE parallel computing development plan are parallelization and multithreading of computationally intensive modules and redesign of the fundamental SCALE computational architecture.

  13. Unified Parallel Software

    SciTech Connect

    McKay, Mike

    2003-12-01

    UPS (Unified Parallel Software) is a collection of software tools (libraries, scripts, executables) that assist in parallel programming. This consists of: o libups.a C/Fortran callable routines for message passing (utilities written on top of MPI) and file IO (utilities written on top of HDF). o libuserd-HDF.so EnSight user-defined reader for visualizing data files written with UPS File IO. o ups_libuserd_query, ups_libuserd_prep.pl, ups_libuserd_script.pl Executables/scripts to get information from data files and to simplify the use of EnSight on those data files. o ups_io_rm/ups_io_cp Manipulate data files written with UPS File IO. These tools are portable to a wide variety of Unix platforms.

  14. Unified Parallel Software

    Energy Science and Technology Software Center (ESTSC)

    2003-12-01

    UPS (Unified Parallel Software) is a collection of software tools (libraries, scripts, executables) that assist in parallel programming. This consists of: o libups.a C/Fortran callable routines for message passing (utilities written on top of MPI) and file IO (utilities written on top of HDF). o libuserd-HDF.so EnSight user-defined reader for visualizing data files written with UPS File IO. o ups_libuserd_query, ups_libuserd_prep.pl, ups_libuserd_script.pl Executables/scripts to get information from data files and to simplify the use of EnSight on those data files. o ups_io_rm/ups_io_cp Manipulate data files written with UPS File IO. These tools are portable to a wide variety of Unix platforms.

  15. Toward Parallel Document Clustering

    SciTech Connect

    Mogill, Jace A.; Haglin, David J.

    2011-09-01

    A key challenge to automated clustering of documents in large text corpora is the high cost of comparing documents in a multimillion-dimensional document space. The Anchors Hierarchy is a fast data structure and algorithm for localizing data based on a distance metric that obeys the triangle inequality; the algorithm strives to minimize the number of distance calculations needed to cluster the documents into “anchors” around reference documents called “pivots”. We extend the original algorithm to increase the amount of available parallelism and consider two implementations: a complex data structure which affords efficient searching, and a simple data structure which requires repeated sorting. The sorting implementation is integrated with a text-corpus “Bag of Words” program, and initial performance results for an end-to-end document processing workflow are reported.

  16. Parallel Polarization State Generation.

    PubMed

    She, Alan; Capasso, Federico

    2016-01-01

    The control of polarization, an essential property of light, is of wide scientific and technological interest. The general problem of generating arbitrary time-varying states of polarization (SOP) has always been mathematically formulated by a series of linear transformations, i.e. a product of matrices, imposing a serial architecture. Here we show a parallel architecture described by a sum of matrices. The theory is experimentally demonstrated by modulating spatially separated polarization components of a laser with a digital micromirror device and subsequently beam-combining them. This method greatly expands the parameter space for engineering devices that control polarization. Consequently, performance characteristics, such as speed, stability, and spectral range, are entirely dictated by the technologies of optical intensity modulation, including absorption, reflection, emission, and scattering. This opens up important prospects for polarization state generation (PSG) with unique performance characteristics with applications in spectroscopic ellipsometry, spectropolarimetry, communications, imaging, and security. PMID:27184813
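
    The contrast between the serial and parallel architectures can be made concrete with a toy linear-algebra example. The sketch below is purely illustrative and is not the device or formalism of the paper: the Jones matrices, weights, and input state are arbitrary choices, and Python with NumPy is assumed.

      # Toy contrast between the two architectures: a product of transformation
      # matrices (optical elements traversed in series) versus a weighted sum of
      # matrices (independently modulated parallel paths that are beam-combined).
      # The matrices and weights are arbitrary examples.
      import numpy as np

      polarizer_h = np.array([[1, 0], [0, 0]], dtype=complex)     # horizontal polarizer
      quarter_wave = np.array([[1, 0], [0, 1j]], dtype=complex)   # quarter-wave plate

      incoming = np.array([1, 1], dtype=complex) / np.sqrt(2)     # 45-degree linear input

      # Serial architecture: the beam passes through the elements one after another.
      serial_out = quarter_wave @ polarizer_h @ incoming

      # Parallel architecture: split the beam, weight each path's intensity
      # independently, then combine; the effective operator is a sum of matrices.
      w1, w2 = 0.7, 0.3
      parallel_out = (w1 * polarizer_h + w2 * quarter_wave) @ incoming

      print("serial:", serial_out, "parallel:", parallel_out)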

  17. Parallel Imaging Microfluidic Cytometer

    PubMed Central

    Ehrlich, Daniel J.; McKenna, Brian K.; Evans, James G.; Belkina, Anna C.; Denis, Gerald V.; Sherr, David; Cheung, Man Ching

    2011-01-01

    By adding an additional degree of freedom from multichannel flow, the parallel microfluidic cytometer (PMC) combines some of the best features of flow cytometry (FACS) and microscope-based high-content screening (HCS). The PMC (i) lends itself to fast processing of large numbers of samples, (ii) adds a 1-D imaging capability for intracellular localization assays (HCS), (iii) has a high rare-cell sensitivity, and (iv) has an unusual capability for time-synchronized sampling. An inability to practically handle large sample numbers has restricted applications of conventional flow cytometers and microscopes in combinatorial cell assays, network biology, and drug discovery. The PMC promises to relieve a bottleneck in these previously constrained applications. The PMC may also be a powerful tool for finding rare primary cells in the clinic. The multichannel architecture of current PMC prototypes allows 384 unique samples for a cell-based screen to be read out in approximately 6–10 minutes, about 30 times the speed of most current FACS systems. In 1-D intracellular imaging, the PMC can obtain protein localization using HCS marker strategies at many times the sample throughput of CCD-based microscopes or CCD-based single-channel flow cytometers. The PMC also permits the signal integration time to be varied over a larger range than is practical in conventional flow cytometers. The signal-to-noise advantages are useful, for example, in counting rare positive cells in the most difficult early stages of genome-wide screening. We review the status of parallel microfluidic cytometry and discuss some of the directions the new technology may take. PMID:21704835

  18. Parallel Detection of Cathodoluminescence.

    NASA Astrophysics Data System (ADS)

    Day, John C. C.

    Available from UMI in association with The British Library. A GEC P8600 charge-coupled device has been used in the design and fabrication of a parallel detection system or optical multichannel analyser for the analysis of cathodoluminescence spectra. The P8600, whilst designed for video applications, is used as a linear array by merging entire rows of pixels together on the on-board output amplifier. A dual slope integration method of correlated double sampling has been used for noise reduction. An analysis of the performance of this system is given and the achieved noise level of 22 electrons is found to be in good agreement with that theoretically possible. A complete description of the circuits is given together with details of its use with a "Link 860" computer/analyser and a "Philips 400" electron microscope. To demonstrate the system, a study of the cathodoluminescent properties of cadmium telluride grown by molecular beam epitaxy has been made. In particular the effect of dislocations, stacking faults and twins on luminescence has been studied. Dislocations are seen to cause a quenching of excitonic emission with no corresponding increase in any other emission. The effect of stacking faults was seen to vary between different samples, with an enhancement of long-wavelength emission seen in poor quality samples. This supports the premise that the faults are nucleated by surface impurities which are also responsible for the enhanced emission. Some twin defects have been found to cause enhanced excitonic emission. This is compatible with the existence of natural quantum wells at twin faults proposed by other workers. The speed with which the parallel detection system can acquire spectra makes it a valuable tool in the study of beam-sensitive materials. To demonstrate this, measurements were made of the decay rates of the weak cathodoluminescence from the organic crystal coronene. These rates were seen to have time constants less than two minutes, and such measurements would not have been practicable with conventional methods.

  19. Combinatorial parallel and scientific computing.

    SciTech Connect

    Pinar, Ali; Hendrickson, Bruce Alan

    2005-04-01

    Combinatorial algorithms have long played a pivotal enabling role in many applications of parallel computing. Graph algorithms in particular arise in load balancing, scheduling, mapping and many other aspects of the parallelization of irregular applications. These are still active research areas, mostly due to evolving computational techniques and rapidly changing computational platforms. But the relationship between parallel computing and discrete algorithms is much richer than the mere use of graph algorithms to support the parallelization of traditional scientific computations. Important, emerging areas of science are fundamentally discrete, and they are increasingly reliant on the power of parallel computing. Examples include computational biology, scientific data mining, and network analysis. These applications are changing the relationship between discrete algorithms and parallel computing. In addition to their traditional role as enablers of high performance, combinatorial algorithms are now customers for parallel computing. New parallelization techniques for combinatorial algorithms need to be developed to support these nontraditional scientific approaches. This chapter will describe some of the many areas of intersection between discrete algorithms and parallel scientific computing. Due to space limitations, this chapter is not a comprehensive survey, but rather an introduction to a diverse set of techniques and applications with a particular emphasis on work presented at the Eleventh SIAM Conference on Parallel Processing for Scientific Computing. Some topics highly relevant to this chapter (e.g. load balancing) are addressed elsewhere in this book, and so we will not discuss them here.

  20. "Serial" effects in parallel models of reading.

    PubMed

    Chang, Ya-Ning; Furber, Steve; Welbourne, Stephen

    2012-06-01

    There is now considerable evidence showing that the time to read a word out loud is influenced by an interaction between orthographic length and lexicality. Given that length effects are interpreted by advocates of dual-route models as evidence of serial processing, this would seem to pose a serious challenge to models of single word reading which postulate a common parallel processing mechanism for reading both words and nonwords (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Rastle, Havelka, Wydell, Coltheart, & Besner, 2009). However, an alternative explanation of these data is that visual processes outside the scope of existing parallel models are responsible for generating the word-length related phenomena (Seidenberg & Plaut, 1998). Here we demonstrate that a parallel model of single word reading can account for the differential word-length effects found in the naming latencies of words and nonwords, provided that it includes a mapping from visual to orthographic representations, and that the nature of those orthographic representations is not preconstrained. The model can also simulate other supposedly "serial" effects. The overall findings were consistent with the view that visual processing contributes substantially to the word-length effects in normal reading and provided evidence to support the single-route theory which assumes words and nonwords are processed in parallel by a common mechanism. PMID:22343366

  1. Toolkit for parallel image processing

    NASA Astrophysics Data System (ADS)

    Squyres, Jeffery M.; Lumsdaine, Andrew; Stevenson, Robert L.

    1998-09-01

    In this paper, we present the design and implementation of a parallel image processing software library (the Parallel Image Processing Toolkit). The Toolkit not only supplies a rich set of image processing routines, it is designed principally as an extensible framework containing generalized parallel computational kernels to support image processing. Users can easily add their own image processing routines without knowledge or explicit use of the underlying data distribution mechanisms or parallel computing model. Shared memory and multi-level memory hierarchies are exploited to achieve high performance on each node, thereby minimizing overall parallel execution time. Multiple load balancing schemes have been implemented within the parallel framework that transparently distribute the computational load evenly on a distributed memory computing environment. Inside the Toolkit, a message-passing model of parallelism is designed around the Message Passing Interface standard. Experimental results are presented to demonstrate the parallel speedup obtained with the Parallel Image Processing Toolkit in a typical workstation cluster with some common image processing tasks.
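
    The data-distribution pattern that such a toolkit hides from the user can be sketched in a few lines. This is not the Parallel Image Processing Toolkit's API: Python's multiprocessing stands in for the MPI-based workers, the 3x3 mean filter is a stand-in for any local image operation, and the halo exchange needed for exact block-boundary results is deliberately omitted to keep the sketch short.

      # Sketch of the pattern a parallel image-processing framework automates:
      # split the image into row blocks, filter each block in a worker process,
      # and reassemble the result.  Boundary (halo) handling is omitted.
      import numpy as np
      from multiprocessing import Pool

      def smooth_block(block):
          """3x3 mean filter on one block (edge rows handled crudely)."""
          padded = np.pad(block, 1, mode="edge")
          out = np.zeros_like(block, dtype=float)
          for dy in (-1, 0, 1):
              for dx in (-1, 0, 1):
                  out += padded[1 + dy:1 + dy + block.shape[0],
                                1 + dx:1 + dx + block.shape[1]]
          return out / 9.0

      def parallel_filter(image, workers=4):
          blocks = np.array_split(image, workers, axis=0)   # data distribution
          with Pool(workers) as pool:
              filtered = pool.map(smooth_block, blocks)     # parallel compute
          return np.vstack(filtered)                        # gather

      if __name__ == "__main__":
          img = np.random.rand(512, 512)
          print(parallel_filter(img).shape)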

  2. Parallel processing and expert systems

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Lau, Sonie

    1991-01-01

    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 90's cannot enjoy an increased level of autonomy without the efficient use of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real time demands are met for large expert systems. Speed-up via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial labs in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems was surveyed. The survey is divided into three major sections: (1) multiprocessors for parallel expert systems; (2) parallel languages for symbolic computations; and (3) measurements of parallelism of expert system. Results to date indicate that the parallelism achieved for these systems is small. In order to obtain greater speed-ups, data parallelism and application parallelism must be exploited.

  3. Parallel processor engine model program

    NASA Technical Reports Server (NTRS)

    Mclaughlin, P.

    1984-01-01

    The Parallel Processor Engine Model Program is a generalized engineering tool intended to aid in the design of parallel processing real-time simulations of turbofan engines. It is written in the FORTRAN programming language and executes as a subset of the SOAPP simulation system. Input/output and execution control are provided by SOAPP; however, the analysis, emulation and simulation functions are completely self-contained. A framework in which a wide variety of parallel processing architectures could be evaluated and tools with which the parallel implementation of a real-time simulation technique could be assessed are provided.

  4. Parallel computation with the force

    NASA Technical Reports Server (NTRS)

    Jordan, H. F.

    1985-01-01

    A methodology, called the force, supports the construction of programs to be executed in parallel by a force of processes. The number of processes in the force is unspecified, but potentially very large. The force idea is embodied in a set of macros which produce multiprocessor FORTRAN code and has been studied on two shared memory multiprocessors of fairly different character. The method has simplified the writing of highly parallel programs within a limited class of parallel algorithms and is being extended to cover a broader class. The individual parallel constructs which comprise the force methodology are discussed. Of central concern are their semantics, implementation on different architectures and performance implications.

  5. Parallel Programming in the Age of Ubiquitous Parallelism

    NASA Astrophysics Data System (ADS)

    Pingali, Keshav

    2014-04-01

    Multicore and manycore processors are now ubiquitous, but parallel programming remains as difficult as it was 30-40 years ago. During this time, our community has explored many promising approaches including functional and dataflow languages, logic programming, and automatic parallelization using program analysis and restructuring, but none of these approaches has succeeded except in a few niche application areas. In this talk, I will argue that these problems arise largely from the computation-centric foundations and abstractions that we currently use to think about parallelism. In their place, I will propose a novel data-centric foundation for parallel programming called the operator formulation in which algorithms are described in terms of actions on data. The operator formulation shows that a generalized form of data-parallelism called amorphous data-parallelism is ubiquitous even in complex, irregular graph applications such as mesh generation/refinement/partitioning and SAT solvers. Regular algorithms emerge as a special case of irregular ones, and many application-specific optimization techniques can be generalized to a broader context. The operator formulation also leads to a structural analysis of algorithms called TAO-analysis that provides implementation guidelines for exploiting parallelism efficiently. Finally, I will describe a system called Galois based on these ideas for exploiting amorphous data-parallelism on multicores and GPUs.

  6. Using Motivational Interviewing Techniques to Address Parallel Process in Supervision

    ERIC Educational Resources Information Center

    Giordano, Amanda; Clarke, Philip; Borders, L. DiAnne

    2013-01-01

    Supervision offers a distinct opportunity to experience the interconnection of counselor-client and counselor-supervisor interactions. One product of this network of interactions is parallel process, a phenomenon by which counselors unconsciously identify with their clients and subsequently present to their supervisors in a similar fashion…

  7. Parallel-distributed mobile robot simulator

    NASA Astrophysics Data System (ADS)

    Okada, Hiroyuki; Sekiguchi, Minoru; Watanabe, Nobuo

    1996-06-01

    The aim of this project is to achieve an autonomous learning and growth function based on active interaction with the real world. The system should also be able to autonomously acquire knowledge about the context in which jobs take place, and how the jobs are executed. This article describes a parallel distributed movable robot system simulator with an autonomous learning and growth function. The autonomous learning and growth function which we are proposing is characterized by its ability to learn and grow through interaction with the real world. When the movable robot interacts with the real world, the system compares the virtual environment simulation with the interaction result in the real world. The system then improves the virtual environment to match the real-world result more closely. In this way the system learns and grows. It is very important that such a simulation is time-realistic. The parallel distributed movable robot simulator was developed to simulate the space of a movable robot system with an autonomous learning and growth function. The simulator constructs a virtual space faithful to the real world and also integrates the interfaces between the user, the actual movable robot and the virtual movable robot. Using an ultrafast CG (computer graphics) system (FUJITSU AG series), time-realistic 3D CG is displayed.

  8. Parallelization of irregularly coupled regular meshes

    NASA Technical Reports Server (NTRS)

    Chase, Craig; Crowley, Kay; Saltz, Joel; Reeves, Anthony

    1992-01-01

    Regular meshes are frequently used for modeling physical phenomena on both serial and parallel computers. One advantage of regular meshes is that efficient discretization schemes can be implemented in a straightforward manner. However, geometrically-complex objects, such as aircraft, cannot be easily described using a single regular mesh. Multiple interacting regular meshes are frequently used to describe complex geometries. Each mesh models a subregion of the physical domain. The meshes, or subdomains, can be processed in parallel, with periodic updates carried out to move information between the coupled meshes. In many cases, there are a relatively small number (one to a few dozen) subdomains, so that each subdomain may also be partitioned among several processors. We outline a composite run-time/compile-time approach for supporting these problems efficiently on distributed-memory machines. These methods are described in the context of a multiblock fluid dynamics problem developed at LaRC.

  9. Reordering computations for parallel execution

    NASA Technical Reports Server (NTRS)

    Adams, L.

    1985-01-01

    The computations in the SOR algorithm are reordered so as to maintain the same asymptotic rate of convergence as the rowwise ordering while obtaining parallelism at different levels. A parallel program is written to illustrate these ideas, and actual machines for implementation of this program are discussed.
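
    The paper's particular reordering is not reproduced here, but the classic example of the idea is red/black (checkerboard) ordering, sketched below in Python with NumPy (an assumption of this illustration). Points of one colour depend only on points of the other colour, so each half-sweep is a fully data-parallel update, while for the model Poisson problem the asymptotic convergence rate matches the usual rowwise ordering.

      # Red/black SOR for the 2-D Laplace equation: an illustration of reordering
      # computations for parallelism (not the paper's specific scheme).  All points
      # of one colour can be updated simultaneously because they depend only on
      # neighbours of the other colour.
      import numpy as np

      def red_black_sor(u, omega=1.8, sweeps=200):
          i, j = np.indices(u.shape)
          interior = np.zeros(u.shape, dtype=bool)
          interior[1:-1, 1:-1] = True
          masks = [interior & ((i + j) % 2 == colour) for colour in (0, 1)]
          for _ in range(sweeps):
              for mask in masks:                      # two parallel half-sweeps
                  neighbours = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                                       np.roll(u, 1, 1) + np.roll(u, -1, 1))
                  u[mask] += omega * (neighbours[mask] - u[mask])
          return u

      grid = np.zeros((65, 65))
      grid[0, :] = 1.0                                # Dirichlet boundary on one edge
      print(red_black_sor(grid)[32, 32])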

  10. Parallel execution model for Prolog

    SciTech Connect

    Fagin, B.S.

    1987-01-01

    One candidate language for parallel symbolic computing is Prolog. Numerous ways for executing Prolog in parallel have been proposed, but current efforts suffer from several deficiencies. Many cannot support fundamental types of concurrency in Prolog. Other models are of purely theoretical interest, ignoring implementation costs. Detailed simulation studies of execution models are scarce; at present little is known about the costs and benefits of executing Prolog in parallel. In this thesis, a new parallel execution model for Prolog is presented: the PPP model or Parallel Prolog Processor. The PPP supports AND-parallelism, OR-parallelism, and intelligent backtracking. An implementation of the PPP is described, through the extension of an existing Prolog abstract machine architecture. Several examples of PPP execution are presented, and compilation to the PPP abstract instruction set is discussed. The performance effects of this model are reported, based on a simulation of a large benchmark set. The implications of these results for parallel Prolog systems are discussed, and directions for future work are indicated.

  11. Parallel contingency statistics with Titan.

    SciTech Connect

    Thompson, David C.; Pebay, Philippe Pierre

    2009-09-01

    This report summarizes existing statistical engines in VTK/Titan and presents the recently parallelized contingency statistics engine. It is a sequel to [PT08] and [BPRT09], which studied the parallel descriptive, correlative, multi-correlative, and principal component analysis engines. The ease of use of this new parallel engine is illustrated by means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; however, the very nature of contingency tables prevents this new engine from exhibiting optimal parallel speed-up as the aforementioned engines do. This report therefore discusses the design trade-offs we made and studies performance with up to 200 processors.
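
    The Titan/VTK engine itself is not shown here; the generic parallel pattern behind contingency statistics is that each process tabulates a local table of value-pair counts for its share of the records, and the partial tables are then reduced into one global table. A minimal sketch in Python (an assumption of this illustration, with multiprocessing standing in for the distributed processes):

      # Generic sketch of parallel contingency-table computation: tabulate local
      # pair counts per partition, then merge (reduce) the partial tables.
      # This illustrates the pattern only; it is not the VTK/Titan API.
      from collections import Counter
      from multiprocessing import Pool

      def local_table(records):
          """Count (x, y) value pairs in one partition of the data."""
          return Counter((x, y) for x, y in records)

      def contingency_table(records, workers=4):
          chunk = max(1, len(records) // workers)
          partitions = [records[k:k + chunk] for k in range(0, len(records), chunk)]
          with Pool(workers) as pool:
              partials = pool.map(local_table, partitions)
          total = Counter()
          for part in partials:                 # the reduction step
              total.update(part)
          return total

      if __name__ == "__main__":
          data = [("a", 0), ("a", 1), ("b", 0), ("a", 0)] * 1000
          print(contingency_table(data).most_common(3))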

  12. The Galley Parallel File System

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley's file structure and application interface, as well as the performance advantages offered by that interface.

  13. Parallel Grid Manipulations in Earth Science Calculations

    NASA Technical Reports Server (NTRS)

    Sawyer, W.; Lucchesi, R.; daSilva, A.; Takacs, L. L.

    1999-01-01

    The National Aeronautics and Space Administration (NASA) Data Assimilation Office (DAO) at the Goddard Space Flight Center is moving its data assimilation system to massively parallel computing platforms. This parallel implementation of GEOS DAS will be used in the DAO's normal activities, which include reanalysis of data and operational support for flight missions. Key components of GEOS DAS, including the gridpoint-based general circulation model and a data analysis system, are currently being parallelized. The parallelization of GEOS DAS is also one of the HPCC Grand Challenge Projects. The GEOS-DAS software employs several distinct grids. Some examples are: an observation grid, an unstructured grid of points with which observed or measured physical quantities from instruments or satellites are associated; a highly structured latitude-longitude grid of points spanning the earth at given latitude-longitude coordinates, at which prognostic quantities are determined; and a computational lat-lon grid in which the pole has been moved to a different location to avoid computational instabilities. Each of these grids has a different structure and number of constituent points. In spite of that, there are numerous interactions between the grids, e.g., values on one grid must be interpolated to another, or, in other cases, grids need to be redistributed on the underlying parallel platform. The DAO has designed a parallel integrated library for grid manipulations (PILGRIM) to support the needed grid interactions with maximum efficiency. It offers a flexible interface to generate new grids, define transformations between grids and apply them. Basic communication is currently MPI; however, the interfaces defined here could conceivably be implemented with other message-passing libraries, e.g., Cray SHMEM, or with shared-memory constructs. The library is written in Fortran 90. First performance results indicate that even difficult problems, such as the above-mentioned pole rotation (a sparse interpolation with little data locality between the physical lat-lon grid and a pole-rotated computational grid), can be solved efficiently and at the GFlop/s rates needed to solve tomorrow's high resolution earth science models. In the subsequent presentation we will discuss the design and implementation of PILGRIM as well as a number of the problems it is required to solve. Some conclusions will be drawn about the potential performance of the overall earth science models on the supercomputer platforms foreseen for these problems.

  14. Programming parallel architectures: The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1988-01-01

    Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.

  15. Tile-based Level of Detail for the Parallel Age

    SciTech Connect

    Niski, K; Cohen, J D

    2007-08-15

    Today's PCs incorporate multiple CPUs and GPUs and are easily arranged in clusters for high-performance, interactive graphics. We present an approach to parallelizing rendering with level of detail, based on hierarchical, screen-space tiles. Adapt tiles, render tiles, and machine tiles are associated with CPUs, GPUs, and PCs, respectively, to efficiently parallelize the workload with good resource utilization. Adaptive tile sizes provide load balancing while our level of detail system allows total and independent management of the load on CPUs and GPUs. We demonstrate our approach on parallel configurations consisting of both single PCs and a cluster of PCs.

  16. AZTEC. Parallel Iterative method Software for Solving Linear Systems

    SciTech Connect

    Hutchinson, S.; Shadid, J.; Tuminaro, R.

    1995-07-01

    AZTEC is an iterative library that greatly simplifies the parallelization process when solving the linear system of equations Ax=b, where A is a user-supplied n x n sparse matrix, b is a user-supplied vector of length n, and x is a vector of length n to be computed. AZTEC is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools are provided that allow for easy creation of distributed sparse unstructured matrices for parallel solution.
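
    AZTEC's own C interface and data-distribution tools are not reproduced here. Purely to illustrate the problem class it targets, the sketch below solves a sparse system Ax=b with a Krylov iterative method in serial Python using SciPy, which is an assumption of this illustration rather than anything provided by AZTEC.

      # Serial illustration of the problem class AZTEC addresses: iterative
      # solution of a large sparse linear system Ax = b.  AZTEC distributes the
      # sparse matrix across processors; SciPy's conjugate gradient is used here
      # only to keep the sketch self-contained.
      import numpy as np
      import scipy.sparse as sp
      from scipy.sparse.linalg import cg

      n = 10_000
      # 1-D Poisson matrix: symmetric positive definite, tridiagonal.
      A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
      b = np.ones(n)

      x, info = cg(A, b)
      print("cg info:", info, "residual:", np.linalg.norm(A @ x - b))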

  17. Parallel processing and expert systems

    NASA Technical Reports Server (NTRS)

    Lau, Sonie; Yan, Jerry C.

    1991-01-01

    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited.

  18. Is Monte Carlo embarrassingly parallel?

    SciTech Connect

    Hoogenboom, J. E.

    2012-07-01

    Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup, and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turns out to be the rendezvous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Also other time losses in the parallel calculation are identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)
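
    The cycle structure that creates these rendezvous points can be shown abstractly. The toy sketch below is not the paper's criticality code: the "transport" and "tally" are placeholder arithmetic, and Python's multiprocessing pool stands in for MPI. What it does reproduce is the pattern in which every cycle ends with a global gather of new source sites and tallies before population control and the next cycle can start.

      # Toy illustration of the per-cycle rendezvous in a parallel Monte Carlo
      # criticality calculation.  Each worker processes part of the source bank;
      # the master must gather all new source sites and tallies (a global
      # synchronisation) before population control and the next cycle.
      # The physics here is a placeholder random walk, not a real transport code.
      import random
      from multiprocessing import Pool

      def run_histories(source_sites):
          rng = random.Random()
          new_sites, tally = [], 0.0
          for x in source_sites:
              x += rng.gauss(0.0, 1.0)                 # placeholder transport step
              tally += abs(x)                          # placeholder tally contribution
              for _ in range(rng.choice((0, 1, 2))):   # placeholder fission yield
                  new_sites.append(x)
          return new_sites, tally

      def run_cycles(n_cycles=10, population=4000, workers=4):
          bank = [0.0] * population
          with Pool(workers) as pool:
              for cycle in range(n_cycles):
                  chunk = (len(bank) + workers - 1) // workers
                  parts = [bank[k:k + chunk] for k in range(0, len(bank), chunk)]
                  results = pool.map(run_histories, parts)      # rendezvous point
                  bank = [x for sites, _ in results for x in sites]
                  total_tally = sum(t for _, t in results)
                  bank = random.choices(bank, k=population)     # population control
                  print(f"cycle {cycle}: tally {total_tally:.1f}")

      if __name__ == "__main__":
          run_cycles()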

  19. Parallel NPARC: Implementation and Performance

    NASA Technical Reports Server (NTRS)

    Townsend, S. E.

    1996-01-01

    Version 3 of the NPARC Navier-Stokes code includes support for large-grain (block level) parallelism using explicit message passing between a heterogeneous collection of computers. This capability has the potential for significant performance gains, depending upon the block data distribution. The parallel implementation uses a master/worker arrangement of processes. The master process assigns blocks to workers, controls worker actions, and provides remote file access for the workers. The processes communicate via explicit message passing using an interface library which provides portability to a number of message passing libraries, such as PVM (Parallel Virtual Machine). A Bourne shell script is used to simplify the task of selecting hosts, starting processes, retrieving remote files, and terminating a computation. This script also provides a simple form of fault tolerance. An analysis of the computational performance of NPARC is presented, using data sets from an F/A-18 inlet study and a Rocket Based Combined Cycle Engine analysis. Parallel speedup and overall computational efficiency were obtained for various NPARC run parameters on a cluster of IBM RS6000 workstations. The data show that although NPARC performance compares favorably with the estimated potential parallelism, typical data sets used with previous versions of NPARC will often need to be reblocked for optimum parallel performance. In one of the cases studied, reblocking increased peak parallel speedup from 3.2 to 11.8.

  20. Parallel integer sorting with medium and fine-scale parallelism

    NASA Technical Reports Server (NTRS)

    Dagum, Leonardo

    1993-01-01

    Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort is designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
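
    Barrel-sort's message-passing details are not reproduced here, but its core idea, partitioning the key range into one "barrel" per processor so that each processor sorts a contiguous range, can be sketched as below. Python's multiprocessing pool stands in for the message-passing machine, which is an assumption of this illustration.

      # Sketch of the idea behind barrel-sort: route each key to the bucket
      # ("barrel") covering its value range, sort the barrels independently,
      # and concatenate.  This illustrates the concept, not the paper's algorithm.
      import random
      from multiprocessing import Pool

      def sort_bucket(bucket):
          return sorted(bucket)

      def barrel_sort(keys, workers=4):
          lo, hi = min(keys), max(keys)
          width = (hi - lo) // workers + 1
          buckets = [[] for _ in range(workers)]
          for k in keys:                          # the routing phase
              buckets[(k - lo) // width].append(k)
          with Pool(workers) as pool:             # each "node" sorts its own barrel
              sorted_buckets = pool.map(sort_bucket, buckets)
          return [k for b in sorted_buckets for k in b]

      if __name__ == "__main__":
          data = [random.randrange(1_000_000) for _ in range(100_000)]
          assert barrel_sort(data) == sorted(data)
          print("barrel-sort sketch agrees with sorted()")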

  1. EFFICIENT SCHEDULING OF PARALLEL JOBS ON MASSIVELY PARALLEL SYSTEMS

    SciTech Connect

    F. PETRINI; W. FENG

    1999-09-01

    We present buffered coscheduling, a new methodology to multitask parallel jobs in a message-passing environment and to develop parallel programs that can pave the way to the efficient implementation of a distributed operating system. Buffered coscheduling is based on three innovative techniques: communication buffering, strobing, and non-blocking communication. By leveraging these techniques, we can perform effective optimizations based on the global status of the parallel machine rather than on the limited knowledge available locally to each processor. The advantages of buffered coscheduling include higher resource utilization, reduced communication overhead, efficient implementation of flow-control strategies and fault-tolerant protocols, accurate performance modeling, and a simplified yet still expressive parallel programming model. Preliminary experimental results show that buffered coscheduling is very effective in increasing the overall performance in the presence of load imbalance and communication-intensive workloads.

  2. Template based parallel checkpointing in a massively parallel computer system

    DOEpatents

    Archer, Charles Jens; Inglett, Todd Alan

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel supercomputer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
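
    The template-comparison step can be pictured with a small sketch: split a node's checkpoint data into fixed-size blocks, checksum each block, and keep only the blocks whose checksums differ from the template's. The block size, the choice of SHA-256, and the in-memory byte strings below are illustrative assumptions, not the patented implementation.

      # Toy sketch of template-based checkpointing: compare per-block checksums of
      # the current checkpoint against a previously stored template and keep only
      # the blocks that changed.  Block size and hash choice are arbitrary.
      import hashlib

      BLOCK = 64 * 1024

      def block_checksums(data):
          return [hashlib.sha256(data[i:i + BLOCK]).digest()
                  for i in range(0, len(data), BLOCK)]

      def delta_checkpoint(current, template):
          """Return {block_index: block_bytes} for blocks differing from the template."""
          template_sums = block_checksums(template)
          delta = {}
          for idx in range(0, len(current), BLOCK):
              block = current[idx:idx + BLOCK]
              b = idx // BLOCK
              if b >= len(template_sums) or hashlib.sha256(block).digest() != template_sums[b]:
                  delta[b] = block
          return delta

      if __name__ == "__main__":
          template = bytes(1_000_000)
          current = bytearray(template)
          current[300_000:300_016] = b"x" * 16          # one region of node state changed
          changed = delta_checkpoint(bytes(current), template)
          print(f"{len(changed)} of {len(block_checksums(template))} blocks changed")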

  3. Parallel Architecture For Robotics Computation

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Bejczy, Antal K.

    1990-01-01

    Universal Real-Time Robotic Controller and Simulator (URRCS) is highly parallel computing architecture for control and simulation of robot motion. Result of extensive algorithmic study of different kinematic and dynamic computational problems arising in control and simulation of robot motion. Study led to development of class of efficient parallel algorithms for these problems. Represents algorithmically specialized architecture, in sense capable of exploiting common properties of this class of parallel algorithms. System with both MIMD and SIMD capabilities. Regarded as processor attached to bus of external host processor, as part of bus memory.

  4. Experimental Parallel-Processing Computer

    NASA Technical Reports Server (NTRS)

    Mcgregor, J. W.; Salama, M. A.

    1986-01-01

    Master processor supervises slave processors, each with its own memory. Computer with parallel processing serves as inexpensive tool for experimentation with parallel mathematical algorithms. Speed enhancement obtained depends on both nature of problem and structure of algorithm used. In parallel-processing architecture, "bank select" and control signals determine which one, if any, of N slave processor memories accessible to master processor at any given moment. When so selected, slave memory operates as part of master computer memory. When not selected, slave memory operates independently of main memory. Slave processors communicate with each other via input/output bus.

  5. Parallel inverse iteration with reorthogonalization

    SciTech Connect

    Fann, G.I.; Littlefield, R.J.

    1993-03-01

    A parallel method for finding orthogonal eigenvectors of real symmetric tridiagonal matrices is described. The method uses inverse iteration with repeated Modified Gram-Schmidt (MGS) reorthogonalization of the unconverged iterates for clustered eigenvalues. This approach is more parallelizable than reorthogonalizing against fully converged eigenvectors, as is done by LAPACK's current DSTEIN routine. The new method is found to provide accuracy and speed comparable to DSTEIN's and to have good parallel scalability even for matrices with large clusters of eigenvalues. We present results for residual and orthogonality tests, plus timings on IBM RS/6000 (sequential) and Intel Touchstone DELTA (parallel) computers.
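
    A serial sketch of the basic kernel may help: inverse iteration for one eigenvalue of a symmetric tridiagonal matrix, with a modified Gram-Schmidt step against eigenvectors already computed for nearby eigenvalues. The NumPy code below is an illustration only; it forms the matrix densely, uses an arbitrary small shift and iteration count, and omits the parallel distribution of clusters that the paper is about.

      # Serial sketch: inverse iteration for an eigenvector of a symmetric
      # tridiagonal matrix, with modified Gram-Schmidt reorthogonalisation against
      # previously computed eigenvectors of a cluster.  Dense algebra for brevity.
      import numpy as np

      def inverse_iteration(diag, off, eigenvalue, prev_vectors, iters=5):
          n = len(diag)
          A = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
          shifted = A - (eigenvalue + 1e-10) * np.eye(n)   # tiny shift avoids exact singularity
          v = np.random.default_rng(0).standard_normal(n)
          for _ in range(iters):
              v = np.linalg.solve(shifted, v)
              for q in prev_vectors:                       # modified Gram-Schmidt step
                  v -= np.dot(q, v) * q
              v /= np.linalg.norm(v)
          return v

      if __name__ == "__main__":
          diag, off = np.full(100, 2.0), np.full(99, -1.0)
          w = np.linalg.eigvalsh(np.diag(diag) + np.diag(off, 1) + np.diag(off, -1))
          v0 = inverse_iteration(diag, off, w[0], [])
          v1 = inverse_iteration(diag, off, w[1], [v0])
          print("dot(v0, v1) =", abs(np.dot(v0, v1)))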

  6. Parallel inverse iteration with reorthogonalization

    SciTech Connect

    Fann, G.I.; Littlefield, R.J.

    1993-03-01

    A parallel method for finding orthogonal eigenvectors of real symmetric tridiagonal matrices is described. The method uses inverse iteration with repeated Modified Gram-Schmidt (MGS) reorthogonalization of the unconverged iterates for clustered eigenvalues. This approach is more parallelizable than reorthogonalizing against fully converged eigenvectors, as is done by LAPACK's current DSTEIN routine. The new method is found to provide accuracy and speed comparable to DSTEIN's and to have good parallel scalability even for matrices with large clusters of eigenvalues. We present results for residual and orthogonality tests, plus timings on IBM RS/6000 (sequential) and Intel Touchstone DELTA (parallel) computers.

  7. Adaptive, multiresolution visualization of large data sets using parallel octrees.

    SciTech Connect

    Freitag, L. A.; Loy, R. M.

    1999-06-10

    The interactive visualization and exploration of large scientific data sets is a challenging and difficult task; their size often far exceeds the performance and memory capacity of even the most powerful graphics workstations. To address this problem, we have created a technique that combines hierarchical data reduction methods with parallel computing to allow interactive exploration of large data sets while retaining full-resolution capability. The hierarchical representation is built in parallel by strategically inserting field data into an octree data structure. We provide functionality that allows the user to interactively adapt the resolution of the reduced data sets so that resolution is increased in regions of interest without sacrificing local graphics performance. We describe the creation of the reduced data sets using a parallel octree, the software architecture of the system, and the performance of this system on the data from a Rayleigh-Taylor instability simulation.

  8. Parallel node placement method by bubble simulation

    NASA Astrophysics Data System (ADS)

    Nie, Yufeng; Zhang, Weiwei; Qi, Nan; Li, Yiqiang

    2014-03-01

    An efficient Parallel Node Placement method by Bubble Simulation (PNPBS), employing METIS-based domain decomposition (DD) for an arbitrary number of processors, is introduced. In accordance with the desired nodal density and Newton's Second Law of Motion, automatic generation of node sets by bubble simulation has been demonstrated in previous work. Since the interaction force between nodes is short-range, the positions and velocities of two distant nodes can be updated simultaneously and independently during the dynamic simulation; this inherent parallelism makes the method well suited to parallel computing. In the PNPBS method, the METIS-based DD scheme has been investigated for uniform and non-uniform node sets, and dynamic load balancing is obtained by evenly distributing work among the processors. Nodes near the common interface of two neighboring subdomains need no special treatment after the dynamic simulation: they have good geometrical properties and a smooth density distribution, which is desirable in the numerical solution of partial differential equations (PDEs). The results of numerical examples show that quasi-linear speedup in the number of processors and high efficiency are achieved.

  9. Parallel Strategies for Crash and Impact Simulations

    SciTech Connect

    Attaway, S.; Brown, K.; Hendrickson, B.; Plimpton, S.

    1998-12-07

    We describe a general strategy we have found effective for parallelizing solid mechanics simulations. Such simulations often have several computationally intensive parts, including finite element integration, detection of material contacts, and particle interaction if smoothed particle hydrodynamics is used to model highly deforming materials. The need to balance all of these computations simultaneously is a difficult challenge that has kept many commercial and government codes from being used effectively on parallel supercomputers with hundreds or thousands of processors. Our strategy is to load-balance each of the significant computations independently with whatever balancing technique is most appropriate. The chief benefit is that each computation can be scalably parallelized. The drawback is the data exchange between processors and extra coding that must be written to maintain multiple decompositions in a single code. We discuss these trade-offs and give performance results showing this strategy has led to a parallel implementation of a widely-used solid mechanics code that can now be run efficiently on thousands of processors of the Pentium-based Sandia/Intel TFLOPS machine. We illustrate with several examples the kinds of high-resolution, million-element models that can now be simulated routinely. We also look to the future and discuss what possibilities this new capability promises, as well as the new set of challenges it poses in material models, computational techniques, and computing infrastructure.

  10. Parallel algorithms for message decomposition

    SciTech Connect

    Teng, S.H.; Wang, B.

    1987-06-01

    The authors consider the deterministic and random parallel complexity (time and processor) of message decoding: an essential problem in communications systems and translation systems. They present an optimal parallel algorithm to decompose prefix-coded messages and uniquely decipherable-coded messages in O(n/P) time, using O(P) processors (for all P: 1 ≤ P ≤ n/log n) deterministically as well as randomly on the weakest version of parallel random access machines in which concurrent read and concurrent write to a cell in the common memory are not allowed. This is done by reducing decoding to parallel finite-state automata simulation and the prefix sums.

  11. Parallel programming of industrial applications

    SciTech Connect

    Heroux, M; Koniges, A; Simon, H

    1998-07-21

    In the introductory material, we overview the typical MPP environment for real application computing and the special tools available such as parallel debuggers and performance analyzers. Next, we draw from a series of real applications codes and discuss the specific challenges and problems that are encountered in parallelizing these individual applications. The application areas drawn from include biomedical sciences, materials processing and design, plasma and fluid dynamics, and others. We show how it was possible to get a particular application to run efficiently and what steps were necessary. Finally we end with a summary of the lessons learned from these applications and predictions for the future of industrial parallel computing. This tutorial is based on material from a forthcoming book entitled: "Industrial Strength Parallel Computing" to be published by Morgan Kaufmann Publishers (ISBN 1-55860-54).

  12. "Feeling" Series and Parallel Resistances.

    ERIC Educational Resources Information Center

    Morse, Robert A.

    1993-01-01

    Equipped with drinking straws and stirring straws, a teacher can help students understand how resistances in electric circuits combine in series and in parallel. Follow-up suggestions are provided. (ZWH)
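
    For reference, the combination rules the activity conveys (a tiny illustrative helper, not part of the article):

    ```python
    def series(*resistances):
        # resistances in series simply add
        return sum(resistances)

    def parallel(*resistances):
        # reciprocals add for resistances in parallel
        return 1.0 / sum(1.0 / r for r in resistances)

    # e.g. two 100-ohm resistors: 200 ohms in series, 50 ohms in parallel
    assert series(100, 100) == 200
    assert parallel(100, 100) == 50
    ```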

  13. Predicting performance of parallel computations

    NASA Technical Reports Server (NTRS)

    Mak, Victor W.; Lundstrom, Stephen F.

    1990-01-01

    An accurate and computationally efficient method for predicting the performance of a class of parallel computations running on concurrent systems is described. A parallel computation is modeled as a task system with precedence relationships expressed as a series-parallel directed acyclic graph. Resources in a concurrent system are modeled as service centers in a queuing network model. Using these two models as inputs, the method outputs predictions of expected execution time of the parallel computation and the concurrent system utilization. The method is validated against both detailed simulation and actual execution on a commercial multiprocessor. Using 100 test cases, the average error of the prediction when compared to simulation statistics is 1.7 percent, with a standard deviation of 1.5 percent; the maximum error is about 10 percent.

  14. Demonstrating Forces between Parallel Wires.

    ERIC Educational Resources Information Center

    Baker, Blane

    2000-01-01

    Describes a physics demonstration that dramatically illustrates the mutual repulsion (attraction) between parallel conductors using insulated copper wire, wooden dowels, a high direct current power supply, electrical tape, and an overhead projector. (WRM)
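
    The demonstration rests on the standard force-per-unit-length relation between long parallel conductors; the short calculation below is illustrative only (the currents and spacing are made-up values, not from the article):

    ```python
    import math

    MU_0 = 4 * math.pi * 1e-7  # vacuum permeability, T*m/A

    def force_per_length(i1, i2, d):
        """Magnitude of the force per unit length (N/m) between two long parallel
        wires carrying currents i1, i2 (A) separated by distance d (m).
        Parallel currents attract; antiparallel currents repel."""
        return MU_0 * i1 * i2 / (2 * math.pi * d)

    # e.g. 10 A in each wire, 5 mm apart -> about 4e-3 N per metre of wire
    print(force_per_length(10, 10, 0.005))
    ```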

  15. Evaluation of the Interactions between Water Extractable Soil Organic Matter and Metal Cations (Cu(II), Eu(III)) Using Excitation-Emission Matrix Combined with Parallel Factor Analysis

    PubMed Central

    Wei, Jing; Han, Lu; Song, Jing; Chen, Mengfang

    2015-01-01

    The objectives of this study were to evaluate the binding behavior of Cu(II) and Eu(III) with water extractable organic matter (WEOM) in soil, and assess the competitive effect of the cations. Excitation-emission matrix (EEM) fluorescence spectrometry was used in combination with parallel factor analysis (PARAFAC) to obtain four WEOM components: fulvic-like, humic-like, microbial degraded humic-like, and protein-like substances. Fluorescence titration experiments were performed to obtain the binding parameters of PARAFAC-derived components with Cu(II) and Eu(III). The conditional complexation stability constants (logKM) of Cu(II) with the four components ranged from 5.49 to 5.94, and the Eu(III) logKM values were between 5.26 and 5.81. The component-specific binding parameters obtained from competitive binding experiments revealed that Cu(II) and Eu(III) competed for the same binding sites on the WEOM components. These results would help understand the molecular binding mechanisms of Cu(II) and Eu(III) with WEOM in the soil environment. PMID:26121300

  16. Appendix E: Parallel Pascal development system

    NASA Technical Reports Server (NTRS)

    1985-01-01

    The Parallel Pascal Development System enables Parallel Pascal programs to be developed and tested on a conventional computer. It consists of several system programs, including a Parallel Pascal to standard Pascal translator, and a library of Parallel Pascal subprograms. The library includes subprograms for using Parallel Pascal on a parallel system with a fixed degree of parallelism, such as the Massively Parallel Processor, to conveniently manipulate arrays whose dimensions exceed those of the hardware. Programs can be conveniently tested with small-sized arrays on the conventional computer before attempting to run on a parallel system.

  17. HEATR project: ATR algorithm parallelization

    NASA Astrophysics Data System (ADS)

    Deardorf, Catherine E.

    1998-09-01

    High Performance Computing (HPC) Embedded Application for Target Recognition (HEATR) is a project funded by the High Performance Computing Modernization Office through the Common HPC Software Support Initiative (CHSSI). The goal of CHSSI is to produce portable, parallel, multi-purpose, freely distributable, support software to exploit emerging parallel computing technologies and enable application of scalable HPC's for various critical DoD applications. Specifically, the CHSSI goal for HEATR is to provide portable, parallel versions of several existing ATR detection and classification algorithms to the ATR-user community to achieve near real-time capability. The HEATR project will create parallel versions of existing automatic target recognition (ATR) detection and classification algorithms and generate reusable code that will support porting and software development process for ATR HPC software. The HEATR Team has selected detection/classification algorithms from both the model- based and training-based (template-based) arena in order to consider the parallelization requirements for detection/classification algorithms across ATR technology. This would allow the Team to assess the impact that parallelization would have on detection/classification performance across ATR technology. A field demo is included in this project. Finally, any parallel tools produced to support the project will be refined and returned to the ATR user community along with the parallel ATR algorithms. This paper will review: (1) HPCMP structure as it relates to HEATR, (2) Overall structure of the HEATR project, (3) Preliminary results for the first algorithm Alpha Test, (4) CHSSI requirements for HEATR, and (5) Project management issues and lessons learned.

  18. Graphics applications utilizing parallel processing

    NASA Technical Reports Server (NTRS)

    Rice, John R.

    1990-01-01

    The results of research conducted to develop a parallel graphics application algorithm to depict the numerical solution of the 1-D wave equation, the vibrating string, are presented. The research was conducted on a Flexible Flex/32 multiprocessor and a Sequent Balance 21000 multiprocessor. The wave equation is implemented using the finite difference method. The synchronization issues that arose from the parallel implementation and the strategies used to alleviate the effects of the synchronization overhead are discussed.
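
    A serial sketch of the computation being parallelized, assuming an illustrative grid, time step, and initial shape (the Flex/32 and Balance implementations and their synchronization logic are not shown):

    ```python
    import numpy as np

    def vibrating_string(n=101, c=1.0, dx=0.01, dt=0.005, steps=400):
        """Explicit finite-difference solution of the 1-D wave equation
        u_tt = c^2 u_xx with fixed ends. In a parallel version each processor
        would own a contiguous block of grid points and exchange boundary values."""
        r2 = (c * dt / dx) ** 2                 # Courant number squared (must be <= 1)
        x = np.linspace(0.0, (n - 1) * dx, n)
        u_prev = np.sin(np.pi * x / x[-1])      # initial displacement of the string
        u = u_prev.copy()                       # zero initial velocity
        for _ in range(steps):
            u_next = np.empty_like(u)
            u_next[1:-1] = (2 * u[1:-1] - u_prev[1:-1]
                            + r2 * (u[2:] - 2 * u[1:-1] + u[:-2]))
            u_next[0] = u_next[-1] = 0.0        # fixed string ends
            u_prev, u = u, u_next
        return x, u
    ```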

  19. Parallel architectures for problem solving

    SciTech Connect

    Kale, L.V.

    1985-01-01

    The problem of exploiting a large amount of hardware in parallel is one of the biggest challenges facing computer science today. The problem of designing parallel architectures and execution methods for solving large combinatorially explosive problems is studied here. Such problems typically do not have a regular structure that can be readily exploited for parallel execution. Prolog is chosen as a language to specify computation because it is seen as a language that is conceptually simple as well as amenable to parallel interpretation. A tree representation of Prolog computation called the REDUCE-OR tree is described as an alternative to the AND-OR tree representation. A process model based on this representation is developed; it captures more parallelism than most other proposed models. A class of bus architectures is proposed to implement the process model. A general model of parallel Prolog systems is developed and the proposed architectures examined in its framework. One of the important features of the proposed architectures is that they limit contracting of work to a close neighborhood. Various interconnection networks are analyzed, and a new one called the lattice-mesh is proposed. The lattice-mesh improves on the square grid of buses, while retaining its linear-area property. An extensive simulation framework was built. Results of some of the experiments conducted on the simulation system are given.

  20. Architectures for reasoning in parallel

    NASA Technical Reports Server (NTRS)

    Hall, Lawrence O.

    1989-01-01

    The research conducted has dealt with rule-based expert systems and the algorithms that may lead to their effective parallelization. Both the forward- and backward-chained control paradigms were investigated in the course of this work, as was the best computer architecture for the algorithms developed. Two experimental vehicles were built to facilitate this research: Backpac, a parallel backward-chained rule-based reasoning system, and Datapac, a parallel forward-chained rule-based reasoning system. Both systems are written in Multilisp, a version of Lisp which contains the parallel construct future; applying future to a function call causes that call to be evaluated as a task running in parallel with the spawning task. Additionally, Backpac and Datapac have been run on several disparate parallel processors: an Encore Multimax with 10 processors, the Concert Multiprocessor with 64 processors, and a 32-processor BBN GP1000. Both the Concert and the GP1000 are switch-based machines, while the Multimax has all its processors hung off a common bus. All are shared-memory machines, but they have different schemes for sharing the memory and different locales for the shared memory. The main results of the investigations come from experiments on the 10-processor Encore and on the Concert with partitions of 32 or fewer processors. Additionally, experiments have been run with a stripped-down version of EMYCIN.
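
    Multilisp's future is not reproducible here, but the idea can be sketched with Python's concurrent.futures: submitting a call immediately returns a placeholder while the call runs as a parallel task. The rule-evaluation function below is a hypothetical stand-in, not code from Backpac or Datapac:

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def evaluate_rule(rule, facts):
        # hypothetical stand-in for firing one rule against the working memory
        return [f for f in facts if rule(f)]

    def parallel_match(rules, facts):
        # each submit() plays the role of Multilisp's future: the call becomes a
        # task running in parallel with the spawning task, and .result() waits on it
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(evaluate_rule, r, facts) for r in rules]
            return [f.result() for f in futures]

    matches = parallel_match([lambda f: f % 2 == 0, lambda f: f > 10], range(20))
    ```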

  1. Efficiency of parallel direct optimization

    NASA Technical Reports Server (NTRS)

    Janies, D. A.; Wheeler, W. C.

    2001-01-01

    Tremendous progress has been made at the level of sequential computation in phylogenetics. However, little attention has been paid to parallel computation. Parallel computing is particularly suited to phylogenetics because of the many ways large computational problems can be broken into parts that can be analyzed concurrently. In this paper, we investigate the scaling factors and efficiency of random addition and tree refinement strategies using the direct optimization software, POY, on a small (10 slave processors) and a large (256 slave processors) cluster of networked PCs running LINUX. These algorithms were tested on several data sets composed of DNA and morphology ranging from 40 to 500 taxa. Various algorithms in POY show fundamentally different properties within and between clusters. All algorithms are efficient on the small cluster for the 40-taxon data set. On the large cluster, multibuilding exhibits excellent parallel efficiency, whereas parallel building is inefficient. These results are independent of data set size. Branch swapping in parallel shows excellent speed-up for 16 slave processors on the large cluster. However, there is no appreciable speed-up for branch swapping with the further addition of slave processors (>16). This result is independent of data set size. Ratcheting in parallel is efficient with the addition of up to 32 processors in the large cluster. This result is independent of data set size. c2001 The Willi Hennig Society.

  2. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J; Blocksome, Michael E; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  3. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-08-12

    Endpoint-based parallel data processing in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  4. Parallel transport of long mean-free-path plasma along open magnetic field lines: Parallel heat flux

    SciTech Connect

    Guo Zehua; Tang Xianzhu

    2012-06-15

    In a long mean-free-path plasma where temperature anisotropy can be sustained, the parallel heat flux has two components with one associated with the parallel thermal energy and the other the perpendicular thermal energy. Due to the large deviation of the distribution function from local Maxwellian in an open field line plasma with low collisionality, the conventional perturbative calculation of the parallel heat flux closure in its local or non-local form is no longer applicable. Here, a non-perturbative calculation is presented for a collisionless plasma in a two-dimensional flux expander bounded by absorbing walls. Specifically, closures of previously unfamiliar form are obtained for ions and electrons, which relate two distinct components of the species parallel heat flux to the lower order fluid moments such as density, parallel flow, parallel and perpendicular temperatures, and the field quantities such as the magnetic field strength and the electrostatic potential. The plasma source and boundary condition at the absorbing wall enter explicitly in the closure calculation. Although the closure calculation does not take into account wave-particle interactions, the results based on passing orbits from steady-state collisionless drift-kinetic equation show remarkable agreement with fully kinetic-Maxwell simulations. As an example of the physical implications of the theory, the parallel heat flux closures are found to predict a surprising observation in the kinetic-Maxwell simulation of the 2D magnetic flux expander problem, where the parallel heat flux of the parallel thermal energy flows from low to high parallel temperature region.

  5. The economics of parallel trade.

    PubMed

    Danzon, P M

    1998-03-01

    The potential for parallel trade in the European Union (EU) has grown with the accession of low price countries and the harmonisation of registration requirements. Parallel trade implies a conflict between the principle of autonomy of member states to set their own pharmaceutical prices, the principle of free trade and the industrial policy goal of promoting innovative research and development (R&D). Parallel trade in pharmaceuticals does not yield the normal efficiency gains from trade because countries achieve low pharmaceutical prices by aggressive regulation, not through superior efficiency. In fact, parallel trade reduces economic welfare by undermining price differentials between markets. Pharmaceutical R&D is a global joint cost of serving all consumers worldwide; it accounts for roughly 30% of total costs. Optimal (welfare maximising) pricing to cover joint costs (Ramsey pricing) requires setting different prices in different markets, based on inverse demand elasticities. By contrast, parallel trade and regulation based on international price comparisons tend to force price convergence across markets. In response, manufacturers attempt to set a uniform 'euro' price. The primary losers from 'euro' pricing will be consumers in low income countries who will face higher prices or loss of access to new drugs. In the long run, even higher income countries are likely to be worse off with uniform prices, because fewer drugs will be developed. One policy option to preserve price differentials is to exempt on-patent products from parallel trade. An alternative is confidential contracting between individual manufacturers and governments to provide country-specific ex post discounts from the single 'euro' wholesale price, similar to rebates used by managed care in the US. This would preserve differentials in transactions prices even if parallel trade forces convergence of wholesale prices. PMID:10178655

  6. Parallel Implicit Algorithms for CFD

    NASA Technical Reports Server (NTRS)

    Keyes, David E.

    1998-01-01

    The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD). "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in the Message Passing Interface (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSC library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSC during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSC framework.
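
    As a serial illustration of the Newton-Krylov portion of NKS (the Schwarz preconditioning and the domain decomposition are omitted), SciPy's newton_krylov solves a nonlinear system using only residual evaluations; the small nonlinear boundary-value problem below is an illustrative assumption, not one of the project's Euler/Navier-Stokes codes:

    ```python
    import numpy as np
    from scipy.optimize import newton_krylov

    def residual(u):
        """Nonlinear residual of -u'' + u**3 = 1 on a uniform grid, u = 0 at both ends."""
        n = u.size
        h = 1.0 / (n + 1)
        lap = np.zeros_like(u)
        lap[1:-1] = (u[:-2] - 2 * u[1:-1] + u[2:]) / h**2
        lap[0] = (-2 * u[0] + u[1]) / h**2
        lap[-1] = (u[-2] - 2 * u[-1]) / h**2
        return -lap + u**3 - 1.0

    # Newton outer iteration, Krylov (GMRES-type) inner iteration, Jacobian-free:
    # the Jacobian is only ever touched through matrix-vector-product approximations
    u0 = np.zeros(100)
    u = newton_krylov(residual, u0, method='lgmres', f_tol=1e-8)
    ```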

  7. Parallel stochastic systems biology in the cloud.

    PubMed

    Aldinucci, Marco; Torquati, Massimo; Spampinato, Concetto; Drocco, Maurizio; Misale, Claudia; Calcagno, Cristina; Coppo, Mario

    2014-09-01

    The stochastic modelling of biological systems, coupled with Monte Carlo simulation of models, is an increasingly popular technique in bioinformatics. The simulation-analysis workflow may become computationally expensive, reducing the interactivity required for model tuning. In this work, we advocate high-level software design as a vehicle for building efficient and portable parallel simulators for the cloud. In particular, the Calculus of Wrapped Components (CWC) simulator for systems biology, which is designed according to the FastFlow pattern-based approach, is presented and discussed. Thanks to the FastFlow framework, the CWC simulator is designed as a high-level workflow that can simulate CWC models, merge simulation results and statistically analyse them in a single parallel workflow in the cloud. To improve interactivity, successive phases are pipelined in such a way that the workflow begins to output a stream of analysis results immediately after simulation is started. Performance and effectiveness of the CWC simulator are validated on the Amazon Elastic Compute Cloud. PMID:23780997
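
    The pipelining idea can be sketched with the standard library rather than FastFlow/CWC: results are streamed to the analysis phase as soon as individual runs complete, so output begins before the whole batch finishes. The toy stochastic model below is an illustrative assumption:

    ```python
    import random
    from multiprocessing import Pool

    def simulate(seed):
        # toy stochastic trajectory standing in for one Monte Carlo simulation run
        rng = random.Random(seed)
        x = 0.0
        for _ in range(10_000):
            x += rng.gauss(0.0, 1.0)
        return x

    if __name__ == "__main__":
        running_sum, n = 0.0, 0
        with Pool() as pool:
            # imap_unordered streams results back as each simulation finishes,
            # so the analysis phase overlaps with the remaining simulations
            for x in pool.imap_unordered(simulate, range(200)):
                n += 1
                running_sum += x
                if n % 50 == 0:
                    print(f"after {n} runs: mean endpoint = {running_sum / n:.3f}")
    ```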

  8. Bounded Parallel-Batch Scheduling on Unrelated Parallel Machines

    NASA Astrophysics Data System (ADS)

    Miao, Cuixia; Zhang, Yuzhong; Wang, Chengfei

    In this paper, we consider the bounded parallel-batch scheduling problem on unrelated parallel machines. Problems R_m|B|F are NP-hard for any objective function F. For this reason, we discuss the special case with p_ij = p_i for i = 1, 2, ..., m and j = 1, 2, ..., n. We give optimal algorithms for the general scheduling problem to minimize total weighted completion time, makespan and the number of tardy jobs. We also design pseudo-polynomial time algorithms for the case with rejection penalty to minimize the makespan and the total weighted completion time plus the total penalty of the rejected jobs, respectively.

  9. A parallel Jacobson-Oksman optimization algorithm. [parallel processing (computers)

    NASA Technical Reports Server (NTRS)

    Straeter, T. A.; Markos, A. T.

    1975-01-01

    A gradient-dependent optimization technique which exploits the vector-streaming or parallel-computing capabilities of some modern computers is presented. The algorithm, derived by assuming that the function to be minimized is homogeneous, is a modification of the Jacobson-Oksman serial minimization method. In addition to describing the algorithm, conditions insuring the convergence of the iterates of the algorithm and the results of numerical experiments on a group of sample test functions are presented. The results of these experiments indicate that this algorithm will solve optimization problems in less computing time than conventional serial methods on machines having vector-streaming or parallel-computing capabilities.

  10. Parallel plasma fluid turbulence calculations

    NASA Astrophysics Data System (ADS)

    Leboeuf, J. N.; Carreras, B. A.; Charlton, L. A.; Drake, J. B.; Lynch, V. E.; Newman, D. E.; Sidikman, K. L.; Spong, D. A.

    The study of plasma turbulence and transport is a complex problem of critical importance for fusion-relevant plasmas. To this day, the fluid treatment of plasma dynamics is the best approach to realistic physics at the high resolution required for certain experimentally relevant calculations. Core and edge turbulence in a magnetic fusion device have been modeled using state-of-the-art, nonlinear, three-dimensional, initial-value fluid and gyrofluid codes. Parallel implementation of these models on diverse platforms--vector parallel (National Energy Research Supercomputer Center's CRAY Y-MP C90), massively parallel (Intel Paragon XP/S 35), and serial parallel (clusters of high-performance workstations using the Parallel Virtual Machine protocol)--offers a variety of paths to high resolution and significant improvements in real-time efficiency, each with its own advantages. The largest and most efficient calculations have been performed at the 200 Mword memory limit on the C90 in dedicated mode, where an overlap of 12 to 13 out of a maximum of 16 processors has been achieved with a gyrofluid model of core fluctuations. The richness of the physics captured by these calculations is commensurate with the increased resolution and efficiency and is limited only by the ingenuity brought to the analysis of the massive amounts of data generated.

  11. Computing contingency statistics in parallel.

    SciTech Connect

    Bennett, Janine Camille; Thompson, David; Pebay, Philippe Pierre

    2010-09-01

    Statistical analysis is typically used to reduce the dimensionality of and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. Many statistical techniques, e.g., descriptive statistics or principal component analysis, are based on moments and co-moments and, using robust online update formulas, can be computed in an embarrassingly parallel manner, amenable to a map-reduce style implementation. In this paper we focus on contingency tables, through which numerous derived statistics such as joint and marginal probability, point-wise mutual information, information entropy, and χ² independence statistics can be directly obtained. However, contingency tables can become large as data size increases, requiring a correspondingly large amount of communication between processors. This potential increase in communication prevents optimal parallel speedup and is the main difference with moment-based statistics where the amount of inter-processor communication is independent of data size. Here we present the design trade-offs which we made to implement the computation of contingency tables in parallel. We also study the parallel speedup and scalability properties of our open source implementation. In particular, we observe optimal speed-up and scalability when the contingency statistics are used in their appropriate context, namely, when the data input is not quasi-diffuse.
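
    A minimal sketch of the map-reduce flavour of the computation (not the authors' implementation): each worker tabulates a contingency table for its chunk of the data, the partial tables are merged by addition, and the χ² statistic is computed from the merged table.

    ```python
    from collections import Counter
    from multiprocessing import Pool

    def local_table(pairs):
        # map step: contingency counts for one chunk of (x, y) observations
        return Counter(pairs)

    def chi_square(table):
        # chi-square independence statistic from the merged contingency table
        total = sum(table.values())
        row, col = Counter(), Counter()
        for (x, y), c in table.items():
            row[x] += c
            col[y] += c
        return sum((table.get((x, y), 0) - row[x] * col[y] / total) ** 2
                   / (row[x] * col[y] / total)
                   for x in row for y in col)

    if __name__ == "__main__":
        data = [("a", 0), ("a", 1), ("b", 1), ("b", 1), ("a", 0), ("b", 0)] * 1000
        chunks = [data[i::4] for i in range(4)]        # distribute the observations
        with Pool(4) as pool:
            partials = pool.map(local_table, chunks)   # embarrassingly parallel map
        merged = sum(partials, Counter())              # reduce: add the partial tables
        print("chi-square:", chi_square(merged))
    ```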

  12. Parallelizing Timed Petri Net simulations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1993-01-01

    The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold; it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast, were developed. However, very much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included.

  13. Parallel computation and computers for artificial intelligence

    SciTech Connect

    Kowalik, J.S. )

    1988-01-01

    This book discusses Parallel Processing in Artificial Intelligence; Parallel Computing using Multilisp; Execution of Common Lisp in a Parallel Environment; Qlisp; Restricted AND-Parallel Execution of Logic Programs; PARLOG: Parallel Programming in Logic; and Data-driven Processing of Semantic Nets. Attention is also given to: Application of the Butterfly Parallel Processor in Artificial Intelligence; On the Range of Applicability of an Artificial Intelligence Machine; Low-level Vision on Warp and the Apply Programming Model; AHR: A Parallel Computer for Pure Lisp; FAIM-1: An Architecture for Symbolic Multi-processing; and Overview of AI Application Oriented Parallel Processing Research in Japan.

  14. Massively Parallel MRI Detector Arrays

    PubMed Central

    Keil, Boris; Wald, Lawrence L

    2013-01-01

    Originally proposed as a method to increase sensitivity by extending the locally high-sensitivity of small surface coil elements to larger areas, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called “ultimate” SNR and g-factor. We also review the methods for optimally combining array data and changes in RF methodology needed to construct massively parallel MRI detector arrays and show some examples of state-of-the-art for highly accelerated imaging with the resulting highly parallel arrays. PMID:23453758

  15. Fast data parallel polygon rendering

    SciTech Connect

    Ortega, F.A.; Hansen, C.D.

    1993-09-01

    This paper describes a parallel method for polygonal rendering on a massively parallel SIMD machine. This method, based on a simple shading model, is targeted for applications which require very fast polygon rendering for extremely large sets of polygons such as is found in many scientific visualization applications. The algorithms described in this paper are incorporated into a library of 3D graphics routines written for the Connection Machine. The routines are implemented on both the CM-200 and the CM-5. This library enables scientists to display 3D shaded polygons directly from a parallel machine without the need to transmit huge amounts of data to a post-processing rendering system.

  16. Parallel integrated frame synchronizer chip

    NASA Technical Reports Server (NTRS)

    Ghuman, Parminder Singh (Inventor); Solomon, Jeffrey Michael (Inventor); Bennett, Toby Dennis (Inventor)

    2000-01-01

    A parallel integrated frame synchronizer which implements a sequential pipeline process wherein serial data in the form of telemetry data or weather satellite data enters the synchronizer by means of a front-end subsystem and passes to a parallel correlator subsystem or a weather satellite data processing subsystem. When in a CCSDS mode, data from the parallel correlator subsystem passes through a window subsystem, then to a data alignment subsystem and then to a bit transition density (BTD)/cyclical redundancy check (CRC) decoding subsystem. Data from the BTD/CRC decoding subsystem or data from the weather satellite data processing subsystem is then fed to an output subsystem where it is output from a data output port.

  17. Parallel Adaptive Mesh Refinement Library

    NASA Technical Reports Server (NTRS)

    Mac-Neice, Peter; Olson, Kevin

    2005-01-01

    Parallel Adaptive Mesh Refinement Library (PARAMESH) is a package of Fortran 90 subroutines designed to provide a computer programmer with an easy route to extension of (1) a previously written serial code that uses a logically Cartesian structured mesh into (2) a parallel code with adaptive mesh refinement (AMR). Alternatively, in its simplest use, and with minimal effort, PARAMESH can operate as a domain-decomposition tool for users who want to parallelize their serial codes but who do not wish to utilize adaptivity. The package builds a hierarchy of sub-grids to cover the computational domain of a given application program, with spatial resolution varying to satisfy the demands of the application. The sub-grid blocks form the nodes of a tree data structure (a quad-tree in two or an oct-tree in three dimensions). Each grid block has a logically Cartesian mesh. The package supports one-, two- and three-dimensional models.

  18. Visualizing Parallel Computer System Performance

    NASA Technical Reports Server (NTRS)

    Malony, Allen D.; Reed, Daniel A.

    1988-01-01

    Parallel computer systems are among the most complex of man's creations, making satisfactory performance characterization difficult. Despite this complexity, there are strong, indeed, almost irresistible, incentives to quantify parallel system performance using a single metric. The fallacy lies in succumbing to such temptations. A complete performance characterization requires not only an analysis of the system's constituent levels, it also requires both static and dynamic characterizations. Static or average behavior analysis may mask transients that dramatically alter system performance. Although the human visual system is remarkably adept at interpreting and identifying anomalies in false color data, the importance of dynamic, visual scientific data presentation has only recently been recognized. Large, complex parallel systems pose equally vexing performance interpretation problems. Data from hardware and software performance monitors must be presented in ways that emphasize important events while eliding irrelevant details. Design approaches and tools for performance visualization are the subject of this paper.

  19. PARAVT: Parallel Voronoi Tessellation code

    NASA Astrophysics Data System (ADS)

    Gonzalez, Roberto E.

    2016-01-01

    We present a new open source code for massive parallel computation of Voronoi tessellations (VT hereafter) in large data sets. The code is aimed at astrophysical applications, where VT densities and neighbors are widely used. There are several serial Voronoi tessellation codes; however, no open source, parallel implementation is available to handle the large number of particles/galaxies in current N-body simulations and sky surveys. Parallelization is implemented under MPI, and the VT is computed using the Qhull library. The domain decomposition takes into account consistent boundary computation between tasks and supports periodic conditions. In addition, the code computes neighbor lists, the Voronoi density and the Voronoi cell volume for each particle, and can compute the density on a regular grid.
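
    On a single task, the neighbor-list and cell-volume part of such a computation can be sketched with SciPy's Qhull wrapper; PARAVT's MPI domain decomposition and boundary handling are omitted, and the helper below is illustrative rather than PARAVT code:

    ```python
    import numpy as np
    from scipy.spatial import Voronoi, ConvexHull

    def voronoi_neighbors_and_volumes(points):
        """Per-particle Voronoi neighbor lists and cell volumes (bounded cells only)."""
        vor = Voronoi(points)
        neighbors = {i: set() for i in range(len(points))}
        for i, j in vor.ridge_points:            # each ridge separates two particles
            neighbors[i].add(j)
            neighbors[j].add(i)
        volumes = np.full(len(points), np.nan)
        for i, region_index in enumerate(vor.point_region):
            region = vor.regions[region_index]
            if region and -1 not in region:      # -1 marks an unbounded cell
                volumes[i] = ConvexHull(vor.vertices[region]).volume
        return neighbors, volumes

    rng = np.random.default_rng(1)
    neigh, vols = voronoi_neighbors_and_volumes(rng.random((500, 3)))
    ```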

  20. Two portable parallel tridiagonal solvers

    SciTech Connect

    Eltgroth, P.G.

    1994-07-15

    Many scientific computer codes involve linear systems of equations which are coupled only between nearest neighbors in a single dimension. The most common situation can be formulated as a tridiagonal matrix relating source terms and unknowns. This system of equations is commonly solved using simple forward and back substitution. The usual algorithm is spectacularly ill suited for parallel processing with distributed data, since information must be sequentially communicated across all domains. Two new tridiagonal algorithms have been implemented in FORTRAN 77. The two algorithms differ only in the form of the unknown which is to be found. The first and simplest algorithm solves for a scalar quantity evaluated at each point along the single dimension being considered. The second algorithm solves for a vector quantity evaluated at each point. The solution method is related to other recently published approaches, such as that of Bondeli. An alternative parallel tridiagonal solver, used as part of an Alternating Direction Implicit (ADI) scheme, has recently been developed at LLNL by Lambert. For a discussion of useful parallel tridiagonal solvers, see the work of Mattor, et al. Previous work appears to be concerned only with scalar unknowns. This paper presents a new technique which treats both scalar and vector unknowns. There is no restriction upon the sizes of the subdomains. Even though the usual tridiagonal formulation may not be theoretically optimal when used iteratively, it is used in so many computer codes that it appears reasonable to write a direct substitute for it. The new tridiagonal code can be used on parallel machines with a minimum of disruption to pre-existing programming. As tested on various parallel computers, the parallel code shows efficiency greater than 50% (that is, more than half of the available computer operations are used to advance the calculation) when each processor is given at least 100 unknowns for which to solve.
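
    For reference, the serial forward and back substitution that the report's parallel solvers replace is the Thomas algorithm; a minimal scalar version is sketched below (the report's distributed-data algorithms partition the unknowns across processors instead):

    ```python
    def thomas_solve(a, b, c, d):
        """Solve a tridiagonal system: a = sub-diagonal (len n-1), b = diagonal (len n),
        c = super-diagonal (len n-1), d = right-hand side (len n)."""
        n = len(b)
        cp, dp = [0.0] * n, [0.0] * n
        cp[0] = c[0] / b[0]
        dp[0] = d[0] / b[0]
        # forward elimination sweeps sequentially across the whole domain ...
        for i in range(1, n):
            m = b[i] - a[i - 1] * cp[i - 1]
            cp[i] = c[i] / m if i < n - 1 else 0.0
            dp[i] = (d[i] - a[i - 1] * dp[i - 1]) / m
        # ... and back substitution sweeps sequentially in the other direction,
        # which is why distributed-data parallelization needs a different algorithm
        x = [0.0] * n
        x[-1] = dp[-1]
        for i in range(n - 2, -1, -1):
            x[i] = dp[i] - cp[i] * x[i + 1]
        return x

    # 4x4 example with diagonal 2 and off-diagonals -1; solution is [1, 1, 1, 1]
    print(thomas_solve([-1, -1, -1], [2, 2, 2, 2], [-1, -1, -1], [1, 0, 0, 1]))
    ```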

  1. Parallel algorithms for mapping pipelined and parallel computations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1988-01-01

    Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm^3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm^2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.

  2. ITER LHe Plants Parallel Operation

    NASA Astrophysics Data System (ADS)

    Fauve, E.; Bonneton, M.; Chalifour, M.; Chang, H.-S.; Chodimella, C.; Monneret, E.; Vincent, G.; Flavien, G.; Fabre, Y.; Grillot, D.

    The ITER Cryogenic System includes three identical liquid helium (LHe) plants, with a total average cooling capacity equivalent to 75 kW at 4.5 K. The LHe plants provide the 4.5 K cooling power to the magnets and cryopumps. They are designed to operate in parallel and to handle heavy load variations. In this proceeding we will describe the present status of the ITER LHe plants with emphasis on i) the project schedule, ii) the plants' characteristics/layout and iii) the basic principles and control strategies for a stable operation of the three LHe plants in parallel.

  3. Gang scheduling a parallel machine

    SciTech Connect

    Gorda, B.C.; Brooks, E.D. III.

    1991-03-01

    Program development on parallel machines can be a nightmare of scheduling headaches. We have developed a portable time sharing mechanism to handle the problem of scheduling gangs of processors. User programs and their gangs of processors are put to sleep and awakened by the gang scheduler to provide a time sharing environment. Time quantums are adjusted according to priority queues and a system of fair share accounting. The initial platform for this software is the 128 processor BBN TC2000 in use in the Massively Parallel Computing Initiative at the Lawrence Livermore National Laboratory. 2 refs., 1 fig.

  4. Parallel optical coherence tomography system.

    PubMed

    Luo, Yuan; Arauz, Lina J; Castillo, Jose E; Barton, Jennifer K; Kostuk, Raymond K

    2007-12-01

    We present the design and procedures for implementing a parallel optical coherence tomography (POCT) imaging system that can be adapted to an endoscopic format. The POCT system consists of a single mode fiber (SMF) array with multiple reduced diameter (15 microm) SMFs in the sample arm with 15 microm center spacing between fibers. The size of the array determines the size of the transverse imaging field. Electronic scanning eliminates the need for mechanically scanning in the lateral direction. Experimental image data obtained with this system show the capability for parallel axial scan acquisition with lateral resolution comparable to mechanically scanned optical coherence tomography systems. PMID:18059671

  5. The AIS-5000 parallel processor

    SciTech Connect

    Schmitt, L.A.; Wilson, S.S.

    1988-05-01

    The AIS-5000 is a commercially available massively parallel processor which has been designed to operate in an industrial environment. It has fine-grained parallelism with up to 1024 processing elements arranged in a single-instruction multiple-data (SIMD) architecture. The processing elements are arranged in a one-dimensional chain that, for computer vision applications, can be as wide as the image itself. This architecture has cost/performance characteristics superior to those of two-dimensional mesh-connected systems. The design of the processing elements and their interconnections as well as the software used to program the system allow a wide variety of algorithms and applications to be implemented. In this paper, the overall architecture of the system is described. Various components of the system are discussed, including details of the processing elements, data I/O pathways and parallel memory organization. A virtual two-dimensional model for programming image-based algorithms for the system is presented. This model is supported by the AIS-5000 hardware and software and allows the system to be treated as a full-image-size, two-dimensional, mesh-connected parallel processor. Performance benchmarks are given for certain simple and complex functions.

  6. Computational chemistry on parallel computers

    SciTech Connect

    Harrison, R.J.; Shepard, R.; Wagner, A.F.

    1994-03-01

    The recent successful adaptation of mainline computational chemistry codes to parallel computers introduces a new era of cost-effective, computer-intensive chemistry applications and paves the way for future applications on massively parallel centralized computers being developed under the High Performance Computer and Communications Initiative. Parallel computer architecture offers the promise of inexpensive supercomputing for the price of effort in algorithm adaptations to parallelism. In Chemical Sciences-supported work at Argonne, beginning efforts at algorithm changes in computational chemistry codes have resulted in program performance on the Group's 12-processor Alliant computer superior to that on one-processor Cray X-MP or Y-MP computers. The effort so far has focused on sophisticated and highly accurate electronic structure production codes for determining the forces between atoms and molecules responsible for chemical structure, spectra, and reactivity. Some effort has also been invested in trajectory simulations of molecular dynamics. The American-made Alliant computer (model FX/2812) is one of the latest generation of shared-memory group- or division-size computers that generally cost about an order of magnitude less than the laboratory- or university-size computers such as Crays.

  7. Matpar: Parallel Extensions for MATLAB

    NASA Technical Reports Server (NTRS)

    Springer, P. L.

    1998-01-01

    Matpar is a set of client/server software that allows a MATLAB user to take advantage of a parallel computer for very large problems. The user can replace calls to certain built-in MATLAB functions with calls to Matpar functions.

  8. Parallel, Distributed Scripting with Python

    SciTech Connect

    Miller, P J

    2002-05-24

    Parallel computers used to be, for the most part, one-of-a-kind systems which were extremely difficult to program portably. With SMP architectures, the advent of the POSIX thread API and OpenMP gave developers ways to portably exploit on-the-box shared memory parallelism. Since these architectures didn't scale cost-effectively, distributed memory clusters were developed. The associated MPI message passing libraries gave these systems a portable paradigm too. Having programmers effectively use this paradigm is a somewhat different question. Distributed data has to be explicitly transported via the messaging system in order for it to be useful. In high level languages, the MPI library gives access to data distribution routines in C, C++, and FORTRAN. But we need more than that. Many reasonable and common tasks are best done in (or as extensions to) scripting languages. Consider sysadm tools such as password crackers, file purgers, etc ... These are simple to write in a scripting language such as Python (an open source, portable, and freely available interpreter). But these tasks beg to be done in parallel. Consider a password checker that checks an encrypted password against a 25,000-word dictionary. This can take around 10 seconds in Python (6 seconds in C). It is trivial to parallelize if you can distribute the information and co-ordinate the work.
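
    A sketch of the kind of task described above, using only the standard library; the platform's crypt routine is replaced by hashlib so the example stays portable, and the word list and target hash are illustrative assumptions:

    ```python
    import hashlib
    from multiprocessing import Pool

    TARGET = hashlib.sha256(b"hunter2").hexdigest()   # the "encrypted" password

    def check_chunk(words):
        # each worker hashes its share of the dictionary and reports any match
        return [w for w in words if hashlib.sha256(w.encode()).hexdigest() == TARGET]

    if __name__ == "__main__":
        dictionary = ["password", "letmein", "hunter2", "qwerty"] * 6250  # ~25,000 words
        nworkers = 4
        chunks = [dictionary[i::nworkers] for i in range(nworkers)]       # distribute the data
        with Pool(nworkers) as pool:
            hits = [w for part in pool.map(check_chunk, chunks) for w in part]
        print("matches:", set(hits))
    ```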

  9. File concepts for parallel I/O

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1989-01-01

    The subject of input/output (I/O) was often neglected in the design of parallel computer systems, although for many problems I/O rates will limit the speedup attainable. The I/O problem is addressed by considering the role of files in parallel systems. The notion of parallel files is introduced. Parallel files provide for concurrent access by multiple processes, and utilize parallelism in the I/O system to improve performance. Parallel files can also be used conventionally by sequential programs. A set of standard parallel file organizations is proposed, and implementations using multiple storage devices are suggested. Problem areas are also identified and discussed.

  10. Rotary wing aerodynamically generated noise

    NASA Technical Reports Server (NTRS)

    Schmitz, F. J.; Morse, H. A.

    1982-01-01

    The history and methodology of aerodynamic noise reduction in rotary wing aircraft are presented. Thickness noise during hover tests and blade vortex interaction noise are determined and predicted through the use of a variety of computer codes. The use of test facilities and scale models for data acquisition are discussed.

  11. Parallel volume rendering using the BSP model

    NASA Astrophysics Data System (ADS)

    Xie, Hong; Li, Wanqing

    1997-09-01

    We present a new parallel volume rendering algorithm based on the split-light model for rendering and the bulk synchronous parallel (BSP) model for parallelization. The BSP model provides a simple and architecture-independent approach to structure the parallel program. This parallel program has been tested on a shared memory SGI PowerChallenge machine, a distributed memory IBM SP2 machine and a network of UNIX workstations.

  12. Rochester checkers player: Multi-model parallel programming for animate vision. Technical report

    SciTech Connect

    Marsh, B.D.; Brown, C.M.; LeBlanc, T.J.; Scott, M.L.; Becker, T.G.

    1991-06-01

    Animate vision systems couple computer vision and robotics to achieve robust and accurate vision, as well as other complex behavior. These systems combine low-level sensory processing and effector output with high-level cognitive planning - all computationally intensive tasks that can benefit from parallel processing. No single model of parallel programming is likely to serve for all tasks, however. Early vision algorithms are intensely data parallel, often utilizing fine-grain parallel computations that share an image, while cognition algorithms decompose naturally by function, often consisting of loosely-coupled, coarse-grain parallel units. A typical animate vision application will likely consist of many tasks, each of which may require a different parallel programming model, and all of which must cooperate to achieve the desired behavior. These multi-model programs require an underlying software system that not only supports several different models of parallel computation simultaneously, but which also allows tasks implemented in different models to interact.

  13. Parallel-processing techniques for production systems

    SciTech Connect

    da Mota Tenorio, M.F.

    1987-01-01

    The static and dynamic characteristics of production systems are modeled with the use of graph grammars, in order to create means to increase the processing efficiency and the use of parallel computation through compile-time analysis. The model is used to explicate rule interaction, so that proofs of equivalence between knowledge bases can be attempted. Solely relying on program static characteristics shown by the model, a series of observations are made to determine the system dynamic characteristics, and modifications to the original knowledge base are suggested as a means of increasing efficiency and decreasing overall search and computational effort. Dependencies between the rules are analyzed and different approaches for automatic detection are presented. From rule dependences, tools for programming environments, logical evaluation of search spaces and Petri net models of production systems are shown. An algorithm for the allocation and partitioning of a production system into a multiprocessor system is also shown, and addresses the problems of communication and execution of these systems in parallel. Finally, the results of a simulator constructed to test several strategies, networks, and algorithms are presented.

  14. A polymorphic reconfigurable emulator for parallel simulation

    NASA Technical Reports Server (NTRS)

    Parrish, E. A., Jr.; Mcvey, E. S.; Cook, G.

    1980-01-01

    Microprocessor and arithmetic support chip technology was applied to the design of a reconfigurable emulator for real time flight simulation. The system developed consists of a master control system to perform all man machine interactions and to configure the hardware to emulate a given aircraft, and numerous slave compute modules (SCM) which comprise the parallel computational units. It is shown that all parts of the state equations can be worked on simultaneously but that the algebraic equations cannot (unless they are slowly varying). Attempts to obtain algorithms that will allow parallel updates are reported. The word length and step size to be used in the SCM's are determined and the architecture of the hardware and software is described.

  15. Parallel processing for view-dependent polygonal virtual environments

    NASA Astrophysics Data System (ADS)

    El-Sana, Jihad; Varshney, Amitabh

    1999-03-01

    This paper presents a parallel algorithm for preprocessing as well as real-time navigation of view-dependent virtual environments on shared memory multiprocessors. The algorithm proceeds by hierarchical spatial subdivision of the input dataset by an octree. The parallel algorithm is robust and does not generate any artifacts such as degenerate triangles and mesh foldovers. The algorithm performance scales linearly with increase in the number of processors as well as increase in the input dataset complexity. The resulting visualization performance is fast enough to enable interleaved acquisition and modification with interactive visualization.

  16. Parallel aeroelastic computations for wing and wing-body configurations

    NASA Technical Reports Server (NTRS)

    Byun, Chansup

    1994-01-01

    The objective of this research is to develop computationally efficient methods for solving fluid-structural interaction problems by directly coupling finite difference Euler/Navier-Stokes equations for fluids and finite element dynamics equations for structures on parallel computers. This capability will significantly impact many aerospace projects of national importance such as Advanced Subsonic Civil Transport (ASCT), where the structural stability margin becomes very critical at the transonic region. This research effort will have direct impact on the High Performance Computing and Communication (HPCC) Program of NASA in the area of parallel computing.

  17. A brief parallel I/O tutorial.

    SciTech Connect

    Ward, H. Lee

    2010-03-01

    This document provides common best practices for the efficient utilization of parallel file systems for analysts and application developers. A multi-program, parallel supercomputer provides effective compute power by aggregating a host of lower-power processors using a network. The idea, in general, is that one either constructs the application to distribute parts to the different nodes and processors available and then collects the result (a parallel application), or one launches a large number of small jobs, each doing similar work on different subsets (a campaign). The I/O system on these machines is usually implemented as a tightly coupled, parallel application itself. It provides the concept of a 'file' to the host applications: an addressable store of bytes whose address space is global in nature. In essence, it provides a global address space. Beyond the simple reality that the I/O system is normally composed of a smaller, less capable collection of hardware, that global address space will cause problems if not used very carefully. How severe the problems are and how they manifest will differ, but that the approach is problem prone has been well established. Worse, the file system is a shared resource on the machine - a system service. What an application does when it uses the file system impacts all users. No portion of the available resource is reserved for a given job; instead, the I/O system responds to requests by scheduling and queuing based on instantaneous demand. Using the system well contributes to the overall throughput on the machine. From a purely self-centered perspective, using it well reduces the time that the application or campaign is exposed to interference from others. The developer's goal should be to accomplish I/O in a way that minimizes interaction with the I/O system, maximizes the amount of data moved per call, and provides the I/O system the most information about the I/O transfer per request.
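
    As one concrete illustration of the "fewer, larger, better-described requests" advice, the sketch below uses mpi4py's MPI-IO bindings to have each rank issue a single large collective write at a precomputed offset rather than many small independent writes. The file name, array size, and data layout are assumptions made for the example; this is not code from the tutorial itself.

      from mpi4py import MPI
      import numpy as np

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()

      # Each rank owns one contiguous chunk of a global array (assumed layout).
      local = np.full(1_000_000, rank, dtype=np.float64)

      # Open the file collectively and issue one large, collective write per rank
      # instead of many small, independent writes.
      fh = MPI.File.Open(comm, "output.dat", MPI.MODE_WRONLY | MPI.MODE_CREATE)
      offset = rank * local.nbytes      # byte offset of this rank's chunk
      fh.Write_at_all(offset, local)    # collective call: maximal data per request
      fh.Close()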

  18. Parallel supercomputing with commodity components

    NASA Technical Reports Server (NTRS)

    Warren, M. S.; Goda, M. P.; Becker, D. J.

    1997-01-01

    We have implemented a parallel computer architecture based entirely upon commodity personal computer components. Using 16 Intel Pentium Pro microprocessors and switched fast Ethernet as a communication fabric, we have obtained sustained performance on scientific applications in excess of one gigaflop. During one production astrophysics treecode simulation, we performed 1.2 x 10^15 floating point operations (1.2 petaflop) over a three-week period, with one phase of that simulation running continuously for two weeks without interruption. We report on a variety of disk, memory and network benchmarks. We also present results from the NAS parallel benchmark suite, which indicate that this architecture is competitive with current commercial architectures. In addition, we describe some software written to support efficient message passing, as well as a Linux device driver interface to the Pentium hardware performance monitoring registers.

  19. Parallel multiplex laser feedback interferometry

    SciTech Connect

    Zhang, Song; Tan, Yidong; Zhang, Shulian

    2013-12-15

    We present a parallel multiplex laser feedback interferometer based on spatial multiplexing, which avoids the signal crosstalk present in earlier feedback interferometers. The interferometer outputs two closely spaced parallel laser beams, whose frequencies are both shifted by 2Ω by two acousto-optic modulators. A static reference mirror is inserted into one of the optical paths to serve as the reference path; the other beam impinges on the target and serves as the measurement path. Phase variations of the two feedback laser beams are measured simultaneously through heterodyne demodulation with two different detectors, and their subtraction accurately reflects the target displacement. Under typical room conditions, experimental results show a resolution of 1.6 nm and an accuracy of 7.8 nm within a range of 100 μm.

  20. Merlin - Massively parallel heterogeneous computing

    NASA Technical Reports Server (NTRS)

    Wittie, Larry; Maples, Creve

    1989-01-01

    Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.

  1. Parallel processing spacecraft communication system

    NASA Technical Reports Server (NTRS)

    Bolotin, Gary S. (Inventor); Donaldson, James A. (Inventor); Luong, Huy H. (Inventor); Wood, Steven H. (Inventor)

    1998-01-01

    An uplink controlling assembly speeds data processing using a special parallel codeblock technique. A correct start sequence initiates processing of a frame. Two possible start sequences can be used, and the one received determines whether data polarity is inverted or non-inverted. Processing continues until uncorrectable errors are found. The frame ends by intentionally sending a block with an uncorrectable error. Each of the codeblocks in the frame has a channel ID, and each channel ID can be processed separately in parallel. This obviates the problem of waiting for error correction processing. If the channel number is zero, however, it indicates that the frame of data represents a critical command only; that data is handled in a special way, independent of the software. Otherwise, the processed data is further handled using special double-buffering techniques to avoid problems from overrun. When overrun does occur, the system takes action to lose only the oldest data.
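
    The double-buffering idea mentioned above can be sketched generically (this is not the patented design): a producer fills one buffer while a consumer drains the other, so neither side stalls and brief overruns are absorbed. The buffer size, frame source, and processing step below are invented for illustration.

      import threading, queue

      BUF_SIZE = 1024
      free_bufs = queue.Queue()
      full_bufs = queue.Queue()
      for _ in range(2):                      # exactly two buffers: "double" buffering
          free_bufs.put(bytearray(BUF_SIZE))

      def process(data):
          print(f"processed {len(data)} bytes")   # placeholder for downstream handling

      def producer(frames):
          for frame in frames:
              buf = free_bufs.get()               # grab the currently free buffer
              buf[:len(frame)] = frame            # fill it while the other is drained
              full_bufs.put((buf, len(frame)))
          full_bufs.put(None)                     # sentinel: no more frames

      def consumer():
          while (item := full_bufs.get()) is not None:
              buf, n = item
              process(bytes(buf[:n]))
              free_bufs.put(buf)                  # recycle the drained buffer

      t = threading.Thread(target=consumer)
      t.start()
      producer([b"x" * 100 for _ in range(5)])
      t.join()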

  2. Parallel Processing in Combustion Analysis

    NASA Technical Reports Server (NTRS)

    Schunk, Richard Gregory; Chung, T. J.

    2000-01-01

    The objective of this research is to demonstrate the application of the Flow-field Dependent Variation (FDV) method to a problem of current interest in supersonic chemical combustion. Due in part to the stiffness of the chemical reactions, the solution of such problems on unstructured three dimensional grids often dictates the use of parallel computers. Preliminary results for the injection of a supersonic hydrogen stream into vitiated air are presented.

  3. Efficient, massively parallel eigenvalue computation

    NASA Technical Reports Server (NTRS)

    Huo, Yan; Schreiber, Robert

    1993-01-01

    In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the Maspar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.
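
    A minimal serial sketch of the underlying numerical task (not the MasPar implementation): build a real symmetric random Hamiltonian with diagonal disorder on a one-dimensional lattice and diagonalize it with a dense symmetric eigensolver. The lattice size and disorder strength are illustrative assumptions.

      import numpy as np

      rng = np.random.default_rng(0)
      n = 512                                   # lattice sites (illustrative)
      W = 2.0                                   # disorder strength (illustrative)

      # Random on-site potential plus nearest-neighbor hopping terms.
      H = np.diag(rng.uniform(-W / 2, W / 2, n))
      H += np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)

      eigvals, eigvecs = np.linalg.eigh(H)      # dense real-symmetric eigensolver
      print(eigvals[:5])                        # a few of the lowest eigenvalues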

  4. Wakefield calculations on parallel computers

    SciTech Connect

    Schoessow, P.

    1990-01-01

    The use of parallelism in the solution of wakefield problems is illustrated for two different computer architectures (SIMD and MIMD). Results are given for finite difference codes which have been implemented on a Connection Machine and an Alliant FX/8 and which are used to compute wakefields in dielectric loaded structures. Benchmarks on code performance are presented for both cases. 4 refs., 3 figs., 2 tabs.

  5. Parallel Power Grid Simulation Toolkit

    Energy Science and Technology Software Center (ESTSC)

    2015-09-14

    ParGrid is a 'wrapper' that integrates a coupled Power Grid Simulation toolkit consisting of a library to manage the synchronization and communication of independent simulations. The included library code in ParGrid, named FSKIT, is intended to support the coupling of multiple continuous and discrete-event parallel simulations. The code is designed using modern object-oriented C++ methods utilizing C++11 and current Boost libraries to ensure compatibility with multiple operating systems and environments.

  6. Task parallelism and high-performance languages

    SciTech Connect

    Foster, I.

    1996-03-01

    The definition of High Performance Fortran (HPF) is a significant event in the maturation of parallel computing: it represents the first parallel language that has gained widespread support from vendors and users. The subject of this paper is to incorporate support for task parallelism. The term task parallelism refers to the explicit creation of multiple threads of control, or tasks, which synchronize and communicate under programmer control. Task and data parallelism are complementary rather than competing programming models. While task parallelism is more general and can be used to implement algorithms that are not amenable to data-parallel solutions, many problems can benefit from a mixed approach, with for example a task-parallel coordination layer integrating multiple data-parallel computations. Other problems admit to both data- and task-parallel solutions, with the better solution depending on machine characteristics, compiler performance, or personal taste. For these reasons, we believe that a general-purpose high-performance language should integrate both task- and data-parallel constructs. The challenge is to do so in a way that provides the expressivity needed for applications, while preserving the flexibility and portability of a high-level language. In this paper, we examine and illustrate the considerations that motivate the use of task parallelism. We also describe one particular approach to task parallelism in Fortran, namely the Fortran M extensions. Finally, we contrast Fortran M with other proposed approaches and discuss the implications of this work for task parallelism and high-performance languages.
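
    The mixed model described above can be illustrated with a small Python sketch (the paper's own examples are in Fortran M, so this is only an analogy): a task-parallel coordination layer of threads drives several data-parallel array computations. The kernel, array sizes, and thread count are invented for illustration.

      from concurrent.futures import ThreadPoolExecutor
      import numpy as np

      def data_parallel_kernel(block: np.ndarray) -> float:
          # The vectorized elementwise operations are the data-parallel part.
          return float(np.sum(np.sin(block) ** 2))

      blocks = [np.random.default_rng(i).random(100_000) for i in range(4)]

      # The executor supplies the task-parallel part: independent threads of
      # control that communicate only when results are gathered.
      with ThreadPoolExecutor(max_workers=4) as pool:
          partial_sums = list(pool.map(data_parallel_kernel, blocks))

      print(sum(partial_sums))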

  7. Highly parallel sparse Cholesky factorization

    NASA Technical Reports Server (NTRS)

    Gilbert, John R.; Schreiber, Robert

    1990-01-01

    Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.
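
    The key observation, that the inner loop can perform several dense factorizations simultaneously, can be sketched as follows. The dense blocks here are invented, and a process pool stands in for the Connection Machine's 2-D processor grid; this is only an illustration of the idea, not the paper's algorithm.

      from multiprocessing import Pool
      import numpy as np

      def make_spd(n, seed):
          a = np.random.default_rng(seed).random((n, n))
          return a @ a.T + n * np.eye(n)        # symmetric positive definite block

      def factor(block):
          return np.linalg.cholesky(block)      # dense kernel reused by the sparse code

      if __name__ == "__main__":
          blocks = [make_spd(200, s) for s in range(8)]   # independent dense subproblems
          with Pool(processes=4) as pool:
              factors = pool.map(factor, blocks)          # factored simultaneously
          print([L.shape for L in factors])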

  8. Resource-efficient parallel algorithms

    SciTech Connect

    Hochschild, P.H.

    1986-01-01

    This thesis concerns the problem of exploiting the possibilities of parallel computation. To that end, several paradigms for the construction of efficient parallel algorithms were developed. These paradigms are effective in designing algorithms for solving a variety of combinatorial and pattern-matching problems. The resulting space- and time-efficient programs operate on simple and regular parallel architectures that are suitable for VLSI implementation. Many of the algorithms owe their efficiency to filtration. A filter is a device used to rapidly discard irrelevant input data. Filtration reduces the storage, time, and communication requirements of a wide variety of problems. Filter construction demands balancing two opposing goals. On the one hand, a filter must operate quickly enough to avoid becoming a bottleneck. On the other hand, it must be thorough enough to discard a significant portion of the data. Thus, in general, a filter performs a kind of approximation to the desired computation. This approximation is later refined to yield the correct result. By implementing the paradigm of cascaded filtration, using the funnelled pipeline architecture, efficient solutions to several classical problems were developed.
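
    The filter-then-refine idea can be sketched generically for a pattern-matching task (this toy example is not drawn from the thesis): a cheap test discards most candidate positions, and only the survivors receive the exact, more expensive comparison.

      def find_pattern(text: str, pattern: str) -> list[int]:
          m, first = len(pattern), pattern[0]
          matches = []
          for i in range(len(text) - m + 1):
              # Stage 1 (filter): a cheap approximate test discards most positions.
              if text[i] != first:
                  continue
              # Stage 2 (refine): the exact comparison, run only on survivors.
              if text[i:i + m] == pattern:
                  matches.append(i)
          return matches

      print(find_pattern("abracadabra", "abra"))   # -> [0, 7]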

  9. Web based parallel/distributed medical data mining using software agents

    SciTech Connect

    Kargupta, H.; Stafford, B.; Hamzaoglu, I.

    1997-12-31

    This paper describes an experimental parallel/distributed data mining system PADMA (PArallel Data Mining Agents) that uses software agents for local data accessing and analysis and a web based interface for interactive data visualization. It also presents the results of applying PADMA for detecting patterns in unstructured texts of postmortem reports and laboratory test data for Hepatitis C patients.

  10. NWChem: scalable parallel computational chemistry

    SciTech Connect

    van Dam, Hubertus JJ; De Jong, Wibe A.; Bylaska, Eric J.; Govind, Niranjan; Kowalski, Karol; Straatsma, TP; Valiev, Marat

    2011-11-01

    NWChem is a general purpose computational chemistry code specifically designed to run on distributed memory parallel computers. The core functionality of the code focuses on molecular dynamics, Hartree-Fock and density functional theory methods for both plane-wave and Gaussian basis sets, tensor contraction engine based coupled cluster capabilities, and combined quantum mechanics/molecular mechanics descriptions. It was realized from the beginning that scalable implementations of these methods required a programming paradigm inherently different from what message passing approaches could offer. In response, a global address space library, the Global Array Toolkit, was developed. The programming model it offers is based on using predominantly one-sided communication. This model underpins most of the functionality in NWChem, and its power is exemplified by the fact that the code scales to tens of thousands of processors. In this paper the core capabilities of NWChem are described, as well as their implementation to achieve an efficient computational chemistry code with high parallel scalability. NWChem is a modern, open source computational chemistry code specifically designed for large scale parallel applications. To meet the challenges of developing efficient, scalable and portable programs of this nature, a particular code design was adopted, involving two main features. First, the code is built up in a modular fashion so that a large variety of functionality can be integrated easily. Second, to facilitate writing complex parallel algorithms, the Global Array toolkit was developed. This toolkit allows one to write parallel applications in a shared-memory-like approach, but offers additional mechanisms to exploit data locality to lower communication overheads. This framework has proven to be very successful in computational chemistry but is applicable to any engineering domain. Within the context created by the features above, NWChem has grown into a general purpose computational chemistry code that supports a wide variety of energy expressions and capabilities to calculate properties based upon them. The main energy expressions are classical mechanics force fields, Hartree-Fock and DFT both for finite systems and condensed phase systems, coupled cluster, as well as QM/MM. For most energy expressions single point calculations, geometry optimizations, excited states, and other properties are available. Below we briefly discuss each of the main energy expressions and the critical points involved in scalable implementations thereof.

  11. Parallel micromanipulation method for microassembly

    NASA Astrophysics Data System (ADS)

    Sin, Jeongsik; Stephanou, Harry E.

    2001-09-01

    Microassembly deals with micron or millimeter scale objects where the tolerance requirements are in the micron range. Typical applications include electronics components (silicon fabricated circuits), optoelectronics components (photo detectors, emitters, amplifiers, optical fibers, microlenses, etc.), and MEMS (Micro-Electro-Mechanical-System) dies. The assembly processes generally require not only high precision but also high throughput at low manufacturing cost. While conventional macroscale assembly methods have been utilized in scaled down versions for microassembly applications, they exhibit limitations on throughput and cost due to the inherently serialized process. Since the assembly process depends heavily on the manipulation performance, an efficient manipulation method for small parts will have a significant impact on the manufacturing of miniaturized products. The objective of this study on 'parallel micromanipulation' is to achieve these three requirements through the handling of multiple small parts simultaneously (in parallel) with high precision (micromanipulation). As a step toward this objective, a new manipulation method is introduced. The method uses a distributed actuation array for gripper-free and parallel manipulation, and a centralized, shared actuator for simplified controls. The method has been implemented on a testbed 'Piezo Active Surface (PAS)' in which an actively generated friction force field is the driving force for part manipulation. Basic motion primitives, such as translation and rotation of objects, are made possible with the proposed method. This study discusses the design of the proposed manipulation method PAS, and the corresponding manipulation mechanism. The PAS consists of two piezoelectric actuators for X and Y motion, two linear motion guides, two sets of nozzle arrays, and solenoid valves to switch the pneumatic suction force on and off in individual nozzles. One array of nozzles is fixed relative to the surface on which the objects are placed, while the other set is actuated by the actuator relative to this surface. The combination of piezoactuation and pneumatic force generates a friction force that can manipulate multiple objects simultaneously, without grippers. We model the manipulation as quasistatic motion using a limit-surface approximation. An experiment was also carried out to validate the proposed idea and the design of the prototype. The object manipulated in the experiments was a small piece of silicon wafer (1 mm x 4 mm), driven by the piezoelectric actuation system at 10 Hz with a 10 micrometer stroke. The method is being extended to the parallel manipulation of small objects such as v-groove fiber assemblies and MEMS dies. The combined precision of piezoelectric actuation and speed of parallel manipulation is expected to yield a low cost microassembly method.

  12. Parallel ecological networks in ecosystems

    PubMed Central

    Olff, Han; Alonso, David; Berg, Matty P.; Eriksson, B. Klemens; Loreau, Michel; Piersma, Theunis; Rooney, Neil

    2009-01-01

    In ecosystems, species interact with other species directly and through abiotic factors in multiple ways, often forming complex networks of various types of ecological interaction. Out of this suite of interactions, predator–prey interactions have received most attention. The resulting food webs, however, will always operate simultaneously with networks based on other types of ecological interaction, such as through the activities of ecosystem engineers or mutualistic interactions. Little is known about how to classify, organize and quantify these other ecological networks and their mutual interplay. The aim of this paper is to provide new and testable ideas on how to understand and model ecosystems in which many different types of ecological interaction operate simultaneously. We approach this problem by first identifying six main types of interaction that operate within ecosystems, of which food web interactions are one. Then, we propose that food webs are structured along two main axes of organization: a vertical (classic) axis representing trophic position and a new horizontal ‘ecological stoichiometry’ axis representing decreasing palatability of plant parts and detritus for herbivores and detritivores, and slower turnover times. The usefulness of these new ideas is then explored with three very different ecosystems as test cases: temperate intertidal mudflats; temperate short grass prairie; and tropical savannah. PMID:19451126

  13. Application of the hypercube parallel processor to a large-scale moment method code

    NASA Technical Reports Server (NTRS)

    Manshadi, Farzin; Liewer, Paulet C.; Patterson, Jean E.

    1988-01-01

    The applicability of a parallel computing architecture to the solution of a large-scale moment-method code is investigated. Specifically, the NEC (Numerical Electromagnetics Code) method-of-moments scattering program is implemented on a hypercube parallel processor. The accuracy and the increase in the speed of execution on this parallel architecture are demonstrated. The results show a very large reduction in execution time for large problems. The great potential of this parallel processor is shown for interactive solution of large NEC problems as well as other moment-method techniques such as the finite-element method.

  14. Global Arrays Parallel Programming Toolkit

    SciTech Connect

    Nieplocha, Jaroslaw; Krishnan, Manoj Kumar; Palmer, Bruce J.; Tipparaju, Vinod; Harrison, Robert J.; Chavarría-Miranda, Daniel

    2011-01-01

    The two predominant classes of programming models for parallel computing are distributed memory and shared memory. Both shared memory and distributed memory models have advantages and shortcomings. Shared memory model is much easier to use but it ignores data locality/placement. Given the hierarchical nature of the memory subsystems in modern computers this characteristic can have a negative impact on performance and scalability. Careful code restructuring to increase data reuse and replacing fine grain load/stores with block access to shared data can address the problem and yield performance for shared memory that is competitive with message-passing. However, this performance comes at the cost of compromising the ease of use that the shared memory model advertises. Distributed memory models, such as message-passing or one-sided communication, offer performance and scalability but they are difficult to program. The Global Arrays toolkit attempts to offer the best features of both models. It implements a shared-memory programming model in which data locality is managed by the programmer. This management is achieved by calls to functions that transfer data between a global address space (a distributed array) and local storage. In this respect, the GA model has similarities to the distributed shared-memory models that provide an explicit acquire/release protocol. However, the GA model acknowledges that remote data is slower to access than local data and allows data locality to be specified by the programmer and hence managed. GA is related to the global address space languages such as UPC, Titanium, and, to a lesser extent, Co-Array Fortran. In addition, by providing a set of data-parallel operations, GA is also related to data-parallel languages such as HPF, ZPL, and Data Parallel C. However, the Global Array programming model is implemented as a library that works with most languages used for technical computing and does not rely on compiler technology for achieving parallel efficiency. It also supports a combination of task- and data-parallelism and is available as an extension of the message passing (MPI) model. The GA model exposes to the programmer the hierarchical memory of modern high-performance computer systems, and by recognizing the communication overhead for remote data transfer, it promotes data reuse and locality of reference. Virtually all the scalable architectures possess non-uniform memory access characteristics that reflect their multi-level memory hierarchies. These hierarchies typically comprise processor registers, multiple levels of cache, local memory, and remote memory. Over time, both the number of levels and the cost (in processor cycles) of accessing deeper levels has been increasing. It is important for any scalable programming model to address memory hierarchy since it is critical to the efficient execution of scalable applications.
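
    The put/get style of the Global Arrays model can be illustrated with MPI one-sided communication via mpi4py; this is only an analogy, not the GA API itself (GA adds distributed-array indexing and locality queries on top of such primitives), and the window size and data are assumptions for the example.

      from mpi4py import MPI
      import numpy as np

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()

      # Each rank exposes a local buffer as one piece of a "global" address space.
      local = np.zeros(4, dtype='d')
      win = MPI.Win.Create(local, comm=comm)

      # One-sided put: rank 0 writes into rank 1's memory; rank 1 posts no receive.
      if rank == 0 and comm.Get_size() > 1:
          data = np.arange(4, dtype='d')
          win.Lock(1)                          # open a passive-target epoch on rank 1
          win.Put([data, MPI.DOUBLE], 1)       # transfer to target rank 1
          win.Unlock(1)                        # transfer is complete on return

      comm.Barrier()                           # make the update visible before reading
      if rank == 1:
          print("rank 1 local buffer:", local)
      win.Free()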

  15. Implementing clips on a parallel computer

    NASA Technical Reports Server (NTRS)

    Riley, Gary

    1987-01-01

    The C language integrated production system (CLIPS) is a forward chaining rule based language to provide training and delivery for expert systems. Conceptually, rule based languages have great potential for benefiting from the inherent parallelism of the algorithms that they employ. During each cycle of execution, a knowledge base of information is compared against a set of rules to determine if any rules are applicable. Parallelism also can be employed for use with multiple cooperating expert systems. To investigate the potential benefits of using a parallel computer to speed up the comparison of facts to rules in expert systems, a parallel version of CLIPS was developed for the FLEX/32, a large grain parallel computer. The FLEX implementation takes a macroscopic approach in achieving parallelism by splitting whole sets of rules among several processors rather than by splitting the components of an individual rule among processors. The parallel CLIPS prototype demonstrates the potential advantages of integrating expert system tools with parallel computers.
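
    The "macroscopic" strategy described above, splitting whole sets of rules across processors rather than parallelizing the match for a single rule, can be sketched generically in Python. This is not the FLEX/32 CLIPS code; the rules, facts, and two-way partition are invented for illustration.

      from multiprocessing import Pool

      # Each rule is a (name, condition) pair; conditions are tested against facts.
      RULES = [
          ("hot",   lambda f: f.get("temp", 0) > 100),
          ("cold",  lambda f: f.get("temp", 0) < 0),
          ("wet",   lambda f: f.get("humidity", 0) > 90),
          ("windy", lambda f: f.get("wind", 0) > 30),
      ]
      FACTS = {"temp": 120, "humidity": 95, "wind": 10}

      def match_partition(indices):
          # One worker matches the shared facts against its own subset of rules.
          return [RULES[i][0] for i in indices if RULES[i][1](FACTS)]

      if __name__ == "__main__":
          # Split whole rules across workers (indices avoid pickling the lambdas).
          partitions = [range(0, len(RULES), 2), range(1, len(RULES), 2)]
          with Pool(processes=2) as pool:
              fired = pool.map(match_partition, partitions)
          print(sorted(name for part in fired for name in part))   # -> ['hot', 'wet']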

  16. Parallel computational fluid dynamics - Implementations and results

    SciTech Connect

    Simon, H.D.

    1992-01-01

    The present volume on parallel CFD discusses implementations on parallel machines, numerical algorithms for parallel CFD, and performance evaluation and computer science issues. Attention is given to a parallel algorithm for compressible flows through rotor-stator combinations, a massively parallel Euler solver for unstructured grids, a fast scheme to analyze 3D disk airflow on a parallel computer, and a block implicit multigrid solution of the Euler equations. Topics addressed include a 3D ADI algorithm on distributed memory multiprocessors, clustered element-by-element computations for fluid flow, hypercube FFT and the Fourier pseudospectral method, and an investigation of parallel iterative algorithms for CFD. Also discussed are fluid dynamics using interface methods on parallel processors, sorting for particle flow simulation on the connection machine, a large grain mapping method, and efforts toward a Teraflops capability for CFD.

  17. Parallelizing alternating direction implicit solver on GPUs

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We present a parallel Alternating Direction Implicit (ADI) solver on GPUs. Our implementation significantly improves existing implementations in two aspects. First, we address the scalability issue of existing Parallel Cyclic Reduction (PCR) implementations by eliminating their hardware resource con...

  18. High Performance Parallel Computational Nanotechnology

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Craw, James M. (Technical Monitor)

    1995-01-01

    At a recent press conference, NASA Administrator Dan Goldin encouraged NASA Ames Research Center to take a lead role in promoting research and development of advanced, high-performance computer technology, including nanotechnology. Manufacturers of leading-edge microprocessors currently perform large-scale simulations in the design and verification of semiconductor devices and microprocessors. Recently, the need for this intensive simulation and modeling analysis has greatly increased, due in part to the ever-increasing complexity of these devices, as well as the lessons of experiences such as the Pentium fiasco. Simulation, modeling, testing, and validation will be even more important for designing molecular computers because of the complex specification of millions of atoms, thousands of assembly steps, as well as the simulation and modeling needed to ensure reliable, robust and efficient fabrication of the molecular devices. The software for this capacity does not exist today, but it can be extrapolated from the software currently used in molecular modeling for other applications: semi-empirical methods, ab initio methods, self-consistent field methods, Hartree-Fock methods, molecular mechanics, and simulation methods for diamondoid structures. Inasmuch as it seems clear that the application of such methods in nanotechnology will require powerful, highly parallel systems, this talk will discuss techniques and issues for performing these types of computations on parallel systems. We will describe system design issues (memory, I/O, mass storage, operating system requirements, special user interface issues, interconnects, bandwidths, and programming languages) involved in parallel methods for scalable classical, semiclassical, quantum, molecular mechanics, and continuum models; molecular nanotechnology computer-aided design (NanoCAD) techniques; visualization using virtual reality techniques of structural models and assembly sequences; software required to control mini robotic manipulators for positional control; and scalable numerical algorithms for reliability, verification and testability. There appears to be no fundamental obstacle to simulating molecular compilers and molecular computers on high performance parallel computers, just as the Boeing 777 was simulated on a computer before manufacturing it.

  19. Method for resource control in parallel environments using program organization and run-time support

    NASA Technical Reports Server (NTRS)

    Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)

    1999-01-01

    A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.

  20. Method for resource control in parallel environments using program organization and run-time support

    NASA Technical Reports Server (NTRS)

    Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)

    2001-01-01

    A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.

  1. Parallel processing of atmospheric chemistry calculations: Preliminary considerations

    SciTech Connect

    Elliott, S.; Jones, P.

    1995-01-01

    Global climate calculations are already saturating the class of modern vector supercomputers, which have only a few central processing units. Increased resolution and inclusion of routines to deal with biogeochemical portions of the terrestrial climate system will soon demand massively parallel approaches. The atmospheric photochemistry ensemble is intimately linked to climate through the trace greenhouse gases ozone and methane, and modules for representing it are being attached to global three dimensional transport and GCM frameworks. Atmospheric kinetics involve dozens of highly interactive tracers and so will accentuate the need for parallel processing of earth system simulations. In the present text we lay some of the groundwork for the addition of atmospheric kinetics packages to GCM and global scale atmospheric models on multiply parallel computers. The discussion is tailored for consumption by the photochemical modelling community. After a review of numerical atmospheric chemistry methods, we examine how kinetics can be implemented on a parallel computer. We concentrate especially on data layout and flexibility and how these can be implemented in various programming models. We conclude that chemistry can be implemented rather easily within existing frameworks of several parallel atmospheric models. However, memory limitations may preclude high resolution studies of global chemistry.

  2. Fault-tolerant parallel processor

    SciTech Connect

    Harper, R.E.; Lala, J.H.

    1991-06-01

    This paper addresses issues central to the design and operation of an ultrareliable, Byzantine resilient parallel computer. Interprocessor connectivity requirements are met by treating connectivity as a resource that is shared among many processing elements, allowing flexibility in their configuration and reducing complexity. Redundant groups are synchronized solely by message transmissions and receptions, which also provide input data consistency and output voting. Reliability analysis results are presented that demonstrate the reduced failure probability of such a system. Performance analysis results are presented that quantify the temporal overhead involved in executing such fault-tolerance-specific operations. Empirical performance measurements of prototypes of the architecture are presented. 30 refs.

  3. The PARTY parallel runtime system

    NASA Technical Reports Server (NTRS)

    Saltz, J. H.; Mirchandaney, Ravi; Smith, R. M.; Crowley, Kay; Nicol, D. M.

    1989-01-01

    In the present automated system for the organization of the data and computational operations entailed by parallel problems, in ways that optimize multiprocessor performance, general heuristics for partitioning program data and control are implemented by capturing and manipulating representations of a computation at run time. These heuristics are directed toward the dynamic identification and allocation of concurrent work in computations with irregular computational patterns. An optimized static-workload partitioning is computed for such repetitive-computation pattern problems as the iterative ones employed in scientific computation.

  4. True Shear Parallel Plate Viscometer

    NASA Technical Reports Server (NTRS)

    Ethridge, Edwin; Kaukler, William

    2010-01-01

    This viscometer (which can also be used as a rheometer) is designed for use with liquids over a large temperature range. The device consists of horizontally disposed, similarly sized, parallel plates with a precisely known gap. The lower plate is driven laterally with a motor to apply shear to the liquid in the gap. The upper plate is freely suspended from a double-arm pendulum with a sufficiently long radius to reduce height variations during the swing to negligible levels. A sensitive load cell measures the shear force applied by the liquid to the upper plate. Viscosity is measured by taking the ratio of shear stress to shear rate.
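
    The measurement principle reduces to Newton's definition of viscosity: with plate area A, gap h, relative plate speed v, and measured shear force F, the viscosity is eta = (F/A) / (v/h). A tiny worked sketch follows; all numbers are invented for illustration, not instrument specifications.

      # Hypothetical plate-viscometer reading, in SI units.
      F = 0.012      # shear force on the upper plate, N
      A = 2.0e-3     # wetted plate area, m^2
      v = 0.05       # lateral speed of the lower plate, m/s
      h = 0.5e-3     # gap between the plates, m

      shear_stress = F / A                    # Pa
      shear_rate = v / h                      # 1/s
      viscosity = shear_stress / shear_rate   # Pa*s
      print(f"{viscosity:.3f} Pa*s")          # -> 0.060 Pa*s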

  5. Parallel Assembly of LIGA Components

    SciTech Connect

    Christenson, T.R.; Feddema, J.T.

    1999-03-04

    In this paper, a prototype robotic workcell for the parallel assembly of LIGA components is described. A Cartesian robot is used to press 386 and 485 micron diameter pins into a LIGA substrate and then place a 3-inch diameter wafer with LIGA gears onto the pins. Upward and downward looking microscopes are used to locate holes in the LIGA substrate, pins to be pressed in the holes, and gears to be placed on the pins. This vision system can locate parts within 3 microns, while the Cartesian manipulator can place the parts within 0.4 microns.

  6. Heart Fibrillation and Parallel Supercomputers

    NASA Technical Reports Server (NTRS)

    Kogan, B. Y.; Karplus, W. J.; Chudin, E. E.

    1997-01-01

    The Luo and Rudy 3 cardiac cell mathematical model is implemented on the parallel supercomputer CRAY T3D. The splitting algorithm, combined with a variable time step and an explicit method of integration, provides reasonable solution times and almost perfect scaling for rectilinear wave propagation. The computer simulation makes it possible to observe new phenomena: the break-up of spiral waves caused by intracellular calcium dynamics and the non-uniformity of the calcium distribution in space during the onset of the spiral wave.
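
    The splitting strategy can be sketched on a generic one-dimensional excitable medium; the stand-in reaction term and all grid parameters below are invented (the actual work uses the far stiffer Luo-Rudy kinetics). The diffusion coupling and the local kinetics are advanced in alternating explicit steps, with the local part taking several substeps per diffusion step.

      import numpy as np

      nx, dx, dt, D = 200, 0.2, 0.01, 1.0     # grid and diffusion coefficient (illustrative)
      u = np.zeros(nx)
      u[:10] = 1.0                            # initial stimulus at one end

      def reaction(u, dt, substeps=4):
          # Local kinetics advanced with smaller explicit substeps (variable-step idea).
          for _ in range(substeps):
              u = u + (dt / substeps) * (u * (1.0 - u) * (u - 0.1))
          return u

      for _ in range(500):
          lap = (np.roll(u, 1) - 2 * u + np.roll(u, -1)) / dx**2
          u = u + dt * D * lap                # step 1: diffusion (spatial coupling)
          u = reaction(u, dt)                 # step 2: reaction (local kinetics)

      print(f"excited points after 500 steps: {int((u > 0.5).sum())}")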

  7. Interacting Parallel Constructions of Knowledge in a CAS Context

    ERIC Educational Resources Information Center

    Kidron, Ivy; Dreyfus, Tommy

    2010-01-01

    We consider the influence of a CAS context on a learner's process of constructing a justification for the bifurcations in a logistic dynamical process. We describe how instrumentation led to cognitive constructions and how the roles of the learner and the CAS intertwine, especially close to the branching and combining of constructing actions. The…

  8. Interacting Parallel Constructions of Knowledge in a CAS Context

    ERIC Educational Resources Information Center

    Kidron, Ivy; Dreyfus, Tommy

    2010-01-01

    We consider the influence of a CAS context on a learner's process of constructing a justification for the bifurcations in a logistic dynamical process. We describe how instrumentation led to cognitive constructions and how the roles of the learner and the CAS intertwine, especially close to the branching and combining of constructing actions. The

  9. Collective Interaction of a Compressible Periodic Parallel Jet Flow

    NASA Technical Reports Server (NTRS)

    Miles, Jeffrey Hilton

    1997-01-01

    A linear instability model for multiple spatially periodic supersonic rectangular jets is solved using Floquet-Bloch theory. The disturbance environment is investigated using a two dimensional perturbation of a mean flow. For all cases, large temporal growth rates are found. This work is motivated by an increase in mixing found in experimental measurements of spatially periodic supersonic rectangular jets with phase-locked screech. The results obtained in this paper suggest that phase-locked screech or edge tones may produce correlated spatially periodic jet flow downstream of the nozzles, creating a large spanwise multi-nozzle region where a disturbance can propagate. The large temporal growth rates for eddies obtained by the model calculations herein are related to the increased mixing, since eddies are the primary mechanism that transfers energy from the mean flow to the large turbulent structures. Calculations of growth rates are presented for a range of Mach numbers and nozzle spacings corresponding to experimental test conditions where screech-synchronized phase locking was observed. The model may be of significant scientific and engineering value in the quest to understand and construct supersonic mixer-ejector nozzles which provide increased mixing and reduced noise.

  10. Parallel machine architecture and compiler design facilities

    NASA Technical Reports Server (NTRS)

    Kuck, David J.; Yew, Pen-Chung; Padua, David; Sameh, Ahmed; Veidenbaum, Alex

    1990-01-01

    The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of the Delta project (whose objective is to provide a facility for rapid prototyping of parallelizing compilers that can target different machine architectures) is summarized. Included are surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.

  11. Parallel multi-computers and artificial intelligence

    SciTech Connect

    Uhr, L.

    1986-01-01

    This book examines the present state and future direction of multicomputer parallel architectures for artificial intelligence research and the development of artificial intelligence applications. The book provides a survey of the large variety of parallel architectures, describing the current state of the art and suggesting promising architectures for producing artificial intelligence systems such as intelligent robots. It integrates the artificial intelligence and parallel processing research areas and discusses parallel processing from the viewpoint of artificial intelligence.

  12. Force user's manual: A portable, parallel FORTRAN

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Arenstorf, Norbert S.; Ramanan, Aruna V.

    1990-01-01

    The use of Force, a parallel, portable FORTRAN for shared memory parallel computers, is described. Force simplifies writing code for parallel computers and, once the parallel code is written, it is easily ported to computers on which Force is installed. Although Force is nearly the same for all computers, specific details are included for the Cray-2, Cray-YMP, Convex 220, Flex/32, Encore, Sequent, and Alliant computers on which it is installed.

  13. The electron signature of parallel electric fields

    NASA Astrophysics Data System (ADS)

    Burch, J. L.; Gurgiolo, C.; Menietti, J. D.

    1990-12-01

    Dynamics Explorer I High-Altitude Plasma Instrument electron data are presented. The electron distribution functions have characteristics expected of a region of parallel electric fields. The data are consistent with previous test-particle simulations for observations within parallel electric field regions which indicate that typical hole, bump, and loss-cone electron distributions, which contain evidence for parallel potential differences both above and below the point of observation, are not expected to occur in regions containing actual parallel electric fields.

  14. Automatic Multilevel Parallelization Using OpenMP

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

    2002-01-01

    In this paper we describe the extension of the CAPO (CAPtools (Computer Aided Parallelization Toolkit) OpenMP) parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report some results for several benchmark codes and one full application that have been parallelized using our system.

  15. A parallel algorithm for implicit depletant simulations

    NASA Astrophysics Data System (ADS)

    Glaser, Jens; Karas, Andrew S.; Glotzer, Sharon C.

    2015-11-01

    We present an algorithm to simulate the many-body depletion interaction between anisotropic colloids in an implicit way, integrating out the degrees of freedom of the depletants, which we treat as an ideal gas. Because the depletant particles are statistically independent and the depletion interaction is short-ranged, depletants are randomly inserted in parallel into the excluded volume surrounding a single translated and/or rotated colloid. A configurational bias scheme is used to enhance the acceptance rate. The method is validated and benchmarked both on multi-core processors and graphics processing units for the case of hard spheres, hemispheres, and discoids. With depletants, we report novel cluster phases in which hemispheres first assemble into spheres, which then form ordered hcp/fcc lattices. The method is significantly faster than any method without cluster moves and that tracks depletants explicitly, for systems of colloid packing fraction ϕc < 0.50, and additionally enables simulation of the fluid-solid transition.

  16. Extensive Parallel Processing on Scale-Free Networks

    NASA Astrophysics Data System (ADS)

    Sollich, Peter; Tantari, Daniele; Annibale, Alessia; Barra, Adriano

    2014-12-01

    We adapt belief-propagation techniques to study the equilibrium behavior of a bipartite spin glass, with interactions between two sets of N and P = αN spins, each having an arbitrary degree, i.e., number of interaction partners in the opposite set. An equivalent view is then of a system of N neurons storing P diluted patterns via Hebbian learning, in the high storage regime. Our method allows analysis of parallel pattern processing on a broad class of graphs, including those with pattern asymmetry and heterogeneous dilution; previous replica approaches assumed homogeneity. We show that in a large part of the parameter space of noise, dilution, and storage load, delimited by a critical surface, the network behaves as an extensive parallel processor, retrieving all P patterns in parallel without falling into spurious states due to pattern cross talk, as would be typical of the structural glassiness built into the network. Parallel extensive retrieval is more robust for homogeneous degree distributions, and is not disrupted by asymmetric pattern distributions. For scale-free pattern degree distributions, Hebbian learning induces modularity in the neural network; thus, our Letter gives the first theoretical description for extensive information processing on modular and scale-free networks.

  17. Parallel Processing at the High School Level.

    ERIC Educational Resources Information Center

    Sheary, Kathryn Anne

    This study investigated the ability of high school students to cognitively understand and implement parallel processing. Data indicates that most parallel processing is being taught at the university level. Instructional modules on C, Linux, and the parallel processing language, P4, were designed to show that high school students are highly…

  18. Parallel Computation Of Forward Dynamics Of Manipulators

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Bejczy, Antal K.

    1993-01-01

    Report presents parallel algorithms and special parallel architecture for computation of forward dynamics of robotics manipulators. Products of effort to find best method of parallel computation to achieve required computational efficiency. Significant speedup of computation anticipated as well as cost reduction.

  19. Inductive Information Retrieval Using Parallel Distributed Computation.

    ERIC Educational Resources Information Center

    Mozer, Michael C.

    This paper reports on an application of parallel models to the area of information retrieval and argues that massively parallel, distributed models of computation, called connectionist, or parallel distributed processing (PDP) models, offer a new approach to the representation and manipulation of knowledge. Although this document focuses on…

  20. Parallel Computing Using Web Servers and "Servlets".

    ERIC Educational Resources Information Center

    Lo, Alfred; Bloor, Chris; Choi, Y. K.

    2000-01-01

    Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common…

  1. Coordination in serial-parallel image processing

    NASA Astrophysics Data System (ADS)

    Wójcik, Waldemar; Dubovoi, Vladymyr M.; Duda, Marina E.; Romaniuk, Ryszard S.; Yesmakhanova, Laura; Kozbakova, Ainur

    2015-12-01

    Serial-parallel systems are used to transform images, and controlling their operation gives rise to a coordination problem. The paper summarizes a model of coordination of resource allocation in relation to the task of synchronizing parallel processes; a genetic algorithm for coordination is developed, and its adequacy is verified against a parallel image-processing workload.

  2. Identifying, Quantifying, Extracting and Enhancing Implicit Parallelism

    ERIC Educational Resources Information Center

    Agarwal, Mayank

    2009-01-01

    The shift of the microprocessor industry towards multicore architectures has placed a huge burden on the programmers by requiring explicit parallelization for performance. Implicit Parallelization is an alternative that could ease the burden on programmers by parallelizing applications "under the covers" while maintaining sequential semantics…

  3. Xyce parallel electronic simulator design.

    SciTech Connect

    Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.

    2010-09-01

    This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians and computer scientists. In addition to diversity of background, it is to be expected on long term projects that there will be a certain amount of staff turnover, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document, in one place, a number of the software quality practices followed by the Xyce team. It is also hoped that this document will be a good source of information for new developers.

  4. Characterizations of parallel complexity classes

    SciTech Connect

    Venkateswaran, H.

    1986-01-01

    A new two-person pebble game that abstracts the control structure of many parallel algorithms is defined and studied. This game extends the two-person pebble game defined by Dymond and Tompa (JCSS, Vol. 30, No. 2, 1985, pp. 149-161) in two ways: (a) the game is played on a Boolean circuit rather than on an unlabeled graph, and takes into consideration the types of the gates in the circuit, and (b) the two players' roles are completely symmetric. The new game is used to study the relationship between two natural parallel complexity classes, namely LOGCFL and AC^1. LOGCFL is the class of languages log-space reducible to context-free languages. AC^1 is the class of languages accepted by an alternating Turing machine in space O(log n) and alternation depth O(log n). LOGCFL is a subclass of AC^1, but it is not known whether the inclusion is proper. For many problems in LOGCFL, the algorithms that show their membership in that class also show their membership in AC^1. However, these algorithms do not use the full power of AC^1 computations. The two-person game defined here provides a model of computation in which this perceived difference can be quantified. This is done by characterizing the two classes using the same measures of resources in the game model.

  5. Parallel computation of electromagnetic fields

    SciTech Connect

    Madsen, N.K.

    1997-05-21

    The DSI3D code is designed to numerically solve electromagnetics problems involving complex objects by solving Maxwell's curl equations in the time domain and in three space dimensions. The code has been designed to run on the new parallel processing computers as well as on conventional serial computers. The DSI3D code is unique for the following reasons: it runs efficiently on a variety of parallel computers; allows the use of unstructured non-orthogonal grids; allows a variety of cell or element types; reduces to the finite difference time domain (FDTD) method when orthogonal grids are used; preserves charge or divergence locally (and globally); is non-dissipative; and is accurate for non-orthogonal grids. This method is derived using a Discrete Surface Integration (DSI) technique. As formulated, the DSI technique can be used with essentially arbitrary unstructured grids composed of convex polyhedral cells. This implementation of the DSI algorithm allows the use of unstructured grids that are composed of combinations of non-orthogonal hexahedrons, tetrahedrons, triangular prisms and pyramids. This algorithm reduces to the conventional FDTD method when applied on a structured orthogonal hexahedral grid.
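
    On a structured orthogonal grid the scheme reduces to the familiar FDTD update; a minimal 1-D Yee-style sketch of that special case is given below. Normalized units, the grid sizes, and the soft Gaussian source are all assumptions for illustration; this is not the DSI3D code.

      import numpy as np

      nx, nt = 400, 300
      dx = 1.0
      dt = 0.5 * dx                 # satisfies the 1-D Courant condition dt <= dx

      ez = np.zeros(nx)             # electric field at integer grid points
      hy = np.zeros(nx - 1)         # magnetic field at half-integer points

      for n in range(nt):
          hy += (dt / dx) * (ez[1:] - ez[:-1])            # update H from the curl of E
          ez[1:-1] += (dt / dx) * (hy[1:] - hy[:-1])      # update E from the curl of H
          ez[nx // 4] += np.exp(-((n - 40) / 12.0) ** 2)  # soft Gaussian source

      print(ez.max())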

  6. Parallels between wind and crowd loading of bridges.

    PubMed

    McRobie, Allan; Morgenthal, Guido; Abrams, Danny; Prendergast, John

    2013-06-28

    Parallels between the dynamic response of flexible bridges under the action of wind and under the forces induced by crowds allow each field to inform the other. Wind-induced behaviour has been traditionally classified into categories such as flutter, galloping, vortex-induced vibration and buffeting. However, computational advances such as the vortex particle method have led to a more general picture where effects may occur simultaneously and interact, such that the simple semantic demarcations break down. Similarly, the modelling of individual pedestrians has progressed the understanding of human-structure interaction, particularly for large-amplitude lateral oscillations under crowd loading. In this paper, guided by the interaction of flutter and vortex-induced vibration in wind engineering, a framework is presented, which allows various human-structure interaction effects to coexist and interact, thereby providing a possible synthesis of previously disparate experimental and theoretical results. PMID:23690640

  7. Efficient parallel global garbage collection on massively parallel computers

    SciTech Connect

    Kamada, Tomio; Matsuoka, Satoshi; Yonezawa, Akinori

    1994-12-31

    On distributed-memory high-performance MPPs where processors are interconnected by an asynchronous network, efficient garbage collection (GC) becomes difficult due to inter-node references and references within pending, unprocessed messages. The parallel global GC algorithm (1) takes advantage of reference locality, (2) efficiently traverses references over nodes, (3) admits minimal pause time of ongoing computations, and (4) has been shown to scale up to 1024-node MPPs. The algorithm employs a global weight counting scheme to substantially reduce message traffic. Two methods for confirming the arrival of pending messages are used: one counts the number of messages and the other uses network 'bulldozing.' Performance evaluation of actual implementations on a multicomputer with 32-1024 nodes, the Fujitsu AP1000, reveals various favorable properties of the algorithm.
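
    The weight-counting idea can be sketched abstractly; this is an illustration of generic weighted reference counting, not the authors' global GC. Each remote reference carries a weight, copying a reference splits its weight locally without messaging the owner, and only dropping a reference sends its weight back to be subtracted from the object's total, which reaches zero when the object is reclaimable. All class and method names are invented.

      class RemoteObject:
          def __init__(self, total=64):
              self.total_weight = total      # sum of all outstanding reference weights

          def return_weight(self, w):        # the only operation that needs a message
              self.total_weight -= w
              if self.total_weight == 0:
                  print("object reclaimable")

      class Ref:
          def __init__(self, obj, weight):
              self.obj, self.weight = obj, weight

          def copy(self):
              # Duplicating a reference splits its weight; no message to the owner.
              half = self.weight // 2
              self.weight -= half
              return Ref(self.obj, half)

          def drop(self):
              self.obj.return_weight(self.weight)   # send the weight back to the owner

      obj = RemoteObject(total=64)
      r1 = Ref(obj, 64)
      r2 = r1.copy()       # weights: r1 = 32, r2 = 32, no owner traffic
      r1.drop()            # owner total: 32
      r2.drop()            # owner total: 0 -> "object reclaimable"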

  8. Parallel multiscale simulations of a brain aneurysm.

    PubMed

    Grinberg, Leopold; Fedosov, Dmitry A; Karniadakis, George Em

    2013-07-01

    Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multi-scale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NεκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NεκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work. PMID:23734066

  9. Parallel multiscale simulations of a brain aneurysm

    NASA Astrophysics Data System (ADS)

    Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em

    2013-07-01

    Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multiscale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NɛκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NɛκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300 K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work.

  10. Parallel multiscale simulations of a brain aneurysm

    SciTech Connect

    Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em

    2013-07-01

    Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multiscale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier–Stokes solver NεκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NεκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300 K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work.

  11. Utilizing parallel optimization in computational fluid dynamics

    NASA Astrophysics Data System (ADS)

    Kokkolaras, Michael

    1998-12-01

    General problems of interest in computational fluid dynamics are investigated by means of optimization. Specifically, in the first part of the dissertation, a method of optimal incremental function approximation is developed for the adaptive solution of differential equations. Various concepts and ideas utilized by numerical techniques employed in computational mechanics and artificial neural networks (e.g. function approximation and error minimization, variational principles and weighted residuals, and adaptive grid optimization) are combined to formulate the proposed method. The basis functions and associated coefficients of a series expansion, representing the solution, are optimally selected by a parallel direct search technique at each step of the algorithm according to appropriate criteria; the solution is built sequentially. In this manner, the proposed method is adaptive in nature, although a grid is neither built nor adapted in the traditional sense using a-posteriori error estimates. Variational principles are utilized for the definition of the objective function to be extremized in the associated optimization problems, ensuring that the problem is well-posed. Complicated data structures and expensive remeshing algorithms and system solvers are avoided. Computational efficiency is increased by using low-order basis functions and concurrent computing. Numerical results and convergence rates are reported for a range of steady-state problems, including linear and nonlinear differential equations associated with general boundary conditions, and illustrate the potential of the proposed method. Fluid dynamics applications are emphasized. Conclusions are drawn by discussing the method's limitations, advantages, and possible extensions. The second part of the dissertation is concerned with the optimization of the viscous-inviscid-interaction (VII) mechanism in an airfoil flow analysis code. The VII mechanism is based on the concept of a transpiration velocity boundary condition, whose convergence to steady state is accelerated. The number of variables in the associated optimization problem is reduced by means of function approximation concepts to ensure a high ratio of available parallel processors to necessary function evaluations. Numerical results are presented for the NACA-0012 and the supercritical RAE-2822 airfoils subject to transonic flow conditions using a parallel direct search technique. They exhibit a satisfactory level of accuracy. Speed-up depends on the number of available computational units and increases for more challenging flow conditions and airfoil geometries. The enhanced code constitutes a useful tool for airfoil flow analysis and design and an acceptable alternative to computationally expensive high fidelity codes.

  12. Kinetic theory of turbulence for parallel propagation revisited: Formal results

    SciTech Connect

    Yoon, Peter H.

    2015-08-15

    In a recent paper, Gaelzer et al. [Phys. Plasmas 22, 032310 (2015)] revisited the second-order nonlinear kinetic theory for turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field. The original formulation was that of Yoon and Fang [Phys. Plasmas 15, 122312 (2008)], but Gaelzer et al. noted that the terms pertaining to discrete-particle effects in Yoon and Fang's theory were not dimensionally consistent. The purpose of Gaelzer et al. was to restore the dimensional consistency associated with such terms. However, Gaelzer et al. considered only the linear wave-particle interaction terms. The present paper completes the analysis by considering the dimensional correction to nonlinear wave-particle interaction terms in the wave kinetic equation.

  13. Implementation and performance of parallelized elegant.

    SciTech Connect

    Wang, Y.; Borland, M.; Accelerator Systems Division

    2008-01-01

    The program elegant is widely used for design and modeling of linacs for free-electron lasers and energy recovery linacs, as well as storage rings and other applications. As part of a multi-year effort, we have parallelized many aspects of the code, including single-particle dynamics, wakefields, and coherent synchrotron radiation. We report on the approach used for gradual parallelization, which proved very beneficial in getting parallel features into the hands of users quickly. We also report details of parallelization of collective effects. Finally, we discuss performance of the parallelized code in various applications.

  14. A parallel execution model for Prolog

    SciTech Connect

    Fagin, B.

    1987-01-01

    In this thesis a new parallel execution model for Prolog is presented: the PPP model, or Parallel Prolog Processor. The PPP supports AND-parallelism, OR-parallelism, and intelligent backtracking. An implementation of the PPP is described, through the extension of an existing Prolog abstract machine architecture. Several examples of PPP execution are presented and compilation to the PPP abstract instruction set is discussed. The performance effects of this model are reported, based on a simulation of a large benchmark set. The implications of these results for parallel Prolog systems are discussed, and directions for future work are indicated.

  15. A CS1 pedagogical approach to parallel thinking

    NASA Astrophysics Data System (ADS)

    Rague, Brian William

    Almost all collegiate programs in Computer Science offer an introductory course in programming primarily devoted to communicating the foundational principles of software design and development. The ACM designates this introduction to computer programming course for first-year students as CS1, during which methodologies for solving problems within a discrete computational context are presented. Logical thinking is highlighted, guided primarily by a sequential approach to algorithm development and made manifest by typically using the latest, commercially successful programming language. In response to the most recent developments in accessible multicore computers, instructors of these introductory classes may wish to include training on how to design workable parallel code. Novel issues arise when programming concurrent applications which can make teaching these concepts to beginning programmers a seemingly formidable task. Student comprehension of design strategies related to parallel systems should be monitored to ensure an effective classroom experience. This research investigated the feasibility of integrating parallel computing concepts into the first-year CS classroom. To quantitatively assess student comprehension of parallel computing, an experimental educational study using a two-factor mixed group design was conducted to evaluate two instructional interventions in addition to a control group: (1) topic lecture only, and (2) topic lecture with laboratory work using a software visualization Parallel Analysis Tool (PAT) specifically designed for this project. A new evaluation instrument developed for this study, the Perceptions of Parallelism Survey (PoPS), was used to measure student learning regarding parallel systems. The results from this educational study show a statistically significant main effect among the repeated measures, implying that student comprehension levels of parallel concepts as measured by the PoPS improve immediately after the delivery of any initial three-week CS1 level module when compared with student comprehension levels just prior to starting the course. Survey results measured during the ninth week of the course reveal that performance levels remained high compared to pre-course performance scores. A second result produced by this study reveals no statistically significant interaction effect between the intervention method and student performance as measured by the evaluation instrument over three separate testing periods. However, visual inspection of survey score trends and the low p-value generated by the interaction analysis (0.062) indicate that further studies may verify improved concept retention levels for the lecture w/PAT group.

  16. GROMACS: A message-passing parallel molecular dynamics implementation

    NASA Astrophysics Data System (ADS)

    Berendsen, H. J. C.; van der Spoel, D.; van Drunen, R.

    1995-09-01

    A parallel message-passing implementation of a molecular dynamics (MD) program that is useful for bio(macro)molecules in aqueous environment is described. The software has been developed for a custom-designed 32-processor ring GROMACS (GROningen MAchine for Chemical Simulation) with communication to and from left and right neighbours, but can run on any parallel system onto which a ring of processors can be mapped and which supports PVM-like block send and receive calls. The GROMACS software consists of a preprocessor, a parallel MD and energy minimization program that can use an arbitrary number of processors (including one), an optional monitor, and several analysis tools. The programs are written in ANSI C and available by ftp (information: gromacs@chem.rug.nl). The functionality is based on the GROMOS (GROningen MOlecular Simulation) package (van Gunsteren and Berendsen, 1987; BIOMOS B.V., Nijenborgh 4, 9747 AG Groningen). Conversion programs between GROMOS and GROMACS formats are included. The MD program can handle rectangular periodic boundary conditions with temperature and pressure scaling. The interactions that can be handled without modification are variable non-bonded pair interactions with Coulomb and Lennard-Jones or Buckingham potentials, using a twin-range cut-off based on charge groups, and fixed bonded interactions of either harmonic or constraint type for bonds and bond angles and either periodic or cosine power series interactions for dihedral angles. Special forces can be added to groups of particles (for non-equilibrium dynamics or for position restraining) or between particles (for distance restraints). The parallelism is based on particle decomposition. Interprocessor communication is largely limited to position and force distribution over the ring once per time step.
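
    The ring-based particle decomposition can be illustrated with a small mpi4py sketch: each rank owns a slice of the particles and passes position data around the ring so that, after size-1 shifts, every rank has seen every other rank's particles once per step. This is a simplified, hypothetical illustration of the communication pattern, not GROMACS code; it assumes mpi4py is installed (run with e.g. mpiexec -n 4 python ring.py).

        # ring.py -- sketch of ring communication for particle decomposition.
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
        left, right = (rank - 1) % size, (rank + 1) % size

        # Each rank owns a few "particles" (here just 1-D coordinates as floats).
        local = [float(rank * 10 + i) for i in range(3)]

        travelling = list(local)            # buffer that circulates around the ring
        partial_force = [0.0] * len(local)

        for shift in range(size - 1):
            # pass the travelling buffer to the right neighbour, receive from the left
            travelling = comm.sendrecv(travelling, dest=right, source=left)
            # accumulate (dummy) pair interactions between local and received particles
            for i, xi in enumerate(local):
                for xj in travelling:
                    partial_force[i] += 1.0 / (abs(xi - xj) + 1.0)

        print(f"rank {rank}: partial forces {partial_force}")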

  17. Automated Instrumentation and Monitoring of Data Movement for Parallel Programs

    NASA Technical Reports Server (NTRS)

    Sarukkai, Sekhar; Yan, Jerry C.; Schmidt, Melisa; Tucker, Deanne (Technical Monitor)

    1994-01-01

    Writing efficient parallel programs is complicated by the need to select the right data structure alignments and distributions, which determine the nature and volume of inter-processor communications. A large number of performance tools for parallel programs have been developed recently to expose these inter-processor communications. However, none of them support performance views or provide statistics in terms of inter-processor data structure interactions. A performance tool that tracks the interaction between individual data structures and the context of these interactions is essential for understanding the performance of both explicit message passing programs and data-parallel languages such as HPF. In this paper we discuss the use of compiler front end tools for automatically tracking data structure movements in message passing programs, and low-overhead monitoring and postprocessing of such codes. We demonstrate that robust instrumentation and low overhead monitoring of inter-processor data structure movements is possible, with the use of a number of NAS benchmark codes, run on the i860 hypercube. We also show that the data so collected can be used effectively by post processing tools that expose performance bottlenecks using graphical displays and performance statistics.
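
    One way to attribute communication to individual data structures is to wrap the send calls so that every message is charged to the named array it carries. The sketch below is a hypothetical Python illustration of that instrumentation idea (the tracker class and names are invented); the tools described in the abstract operate on compiled message-passing codes rather than on Python.

        # Sketch: attribute message traffic to named data structures.
        from collections import defaultdict

        class CommTracker:
            def __init__(self):
                self.bytes_by_array = defaultdict(int)
                self.msgs_by_array = defaultdict(int)

            def record_send(self, array_name, payload):
                self.msgs_by_array[array_name] += 1
                self.bytes_by_array[array_name] += len(payload)

            def report(self):
                for name in sorted(self.bytes_by_array):
                    print(f"{name}: {self.msgs_by_array[name]} msgs, "
                          f"{self.bytes_by_array[name]} bytes")

        tracker = CommTracker()
        tracker.record_send("pressure_halo", b"\x00" * 4096)   # e.g. a halo exchange
        tracker.record_send("velocity_halo", b"\x00" * 8192)
        tracker.report()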

  18. Parallel Rendering of Large Time-Varying Volume Data

    NASA Technical Reports Server (NTRS)

    Garbutt, Alexander E.

    2005-01-01

    Interactive visualization of large time-varying 3D volume datasets has been and still is a great challenge to the modern computational world. It stretches the limits of the memory capacity, the disk space, the network bandwidth and the CPU speed of a conventional computer. In this SURF project, we propose to develop a parallel volume rendering program on SGI's Prism, a cluster computer equipped with state-of-the-art graphic hardware. The proposed program combines both parallel computing and hardware rendering in order to achieve an interactive rendering rate. We use 3D texture mapping and a hardware shader to implement 3D volume rendering on each workstation. We use SGI's VisServer to enable remote rendering using Prism's graphic hardware. Lastly, we will integrate this new program with ParVox, a parallel distributed visualization system developed at JPL. At the end of the project, we will demonstrate remote interactive visualization using this new hardware volume renderer on JPL's Prism System using a time-varying dataset from selected JPL applications.

  19. Information hiding in parallel programs

    SciTech Connect

    Foster, I.

    1992-01-30

    A fundamental principle in program design is to isolate difficult or changeable design decisions. Application of this principle to parallel programs requires identification of decisions that are difficult or subject to change, and the development of techniques for hiding these decisions. We experiment with three complex applications, and identify mapping, communication, and scheduling as areas in which decisions are particularly problematic. We develop computational abstractions that hide such decisions, and show that these abstractions can be used to develop elegant solutions to programming problems. In particular, they allow us to encode common structures, such as transforms, reductions, and meshes, as software cells and templates that can be reused in different applications. An important characteristic of these structures is that they do not incorporate mapping, communication, or scheduling decisions: these aspects of the design are specified separately, when composing existing structures to form applications. This separation of concerns allows the same cells and templates to be reused in different contexts.

  20. Device for balancing parallel strings

    DOEpatents

    Mashikian, Matthew S.

    1985-01-01

    A battery plant is described which features magnetic circuit means in association with each of the battery strings in the battery plant for balancing the electrical current flow through the battery strings by equalizing the voltage across each of the battery strings. Each of the magnetic circuit means generally comprises means for sensing the electrical current flow through one of the battery strings, and a saturable reactor having a main winding connected electrically in series with the battery string, a bias winding connected to a source of alternating current and a control winding connected to a variable source of direct current controlled by the sensing means. Each of the battery strings is formed by a plurality of batteries connected electrically in series, and these battery strings are connected electrically in parallel across common bus conductors.

  1. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1991-01-01

    The main contribution of the effort in the last two years is the introduction of the MOPPS system. After an extensive literature search, we introduced the system, which is described next. MOPPS employs a new solution to the problem of managing programs which solve scientific and engineering applications on a distributed processing environment. Autonomous computers cooperate efficiently in solving large scientific problems with this solution. MOPPS has the advantage of not assuming the presence of any particular network topology or configuration, computer architecture, or operating system. It imposes little overhead on network and processor resources while efficiently managing programs concurrently. The core of MOPPS is an intelligent program manager that builds a knowledge base of the execution performance of the parallel programs it is managing under various conditions. The manager applies this knowledge to improve the performance of future runs. The program manager learns from experience.

  2. Parallel discovery of Alzheimer's therapeutics.

    PubMed

    Lo, Andrew W; Ho, Carole; Cummings, Jayna; Kosik, Kenneth S

    2014-06-18

    As the prevalence of Alzheimer's disease (AD) grows, so do the costs it imposes on society. Scientific, clinical, and financial interests have focused current drug discovery efforts largely on the single biological pathway that leads to amyloid deposition. This effort has resulted in slow progress and disappointing outcomes. Here, we describe a "portfolio approach" in which multiple distinct drug development projects are undertaken simultaneously. Although a greater upfront investment is required, the probability of at least one success should be higher with "multiple shots on goal," increasing the efficiency of this undertaking. However, our portfolio simulations show that the risk-adjusted return on investment of parallel discovery is insufficient to attract private-sector funding. Nevertheless, the future cost savings of an effective AD therapy to Medicare and Medicaid far exceed this investment, suggesting that government funding is both essential and financially beneficial. PMID:24944190
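
    The "multiple shots on goal" argument is simply the complement rule: if each of n independent programs succeeds with probability p, the chance that at least one succeeds is 1 - (1 - p)^n. The numbers in the small Python example below are purely illustrative and are not figures from the paper.

        # Probability that at least one of n independent projects succeeds.
        def p_at_least_one(p_single, n):
            return 1.0 - (1.0 - p_single) ** n

        # Illustrative only: a 5% per-project success rate across a 10-project portfolio.
        print(p_at_least_one(0.05, 1))              # 0.05
        print(round(p_at_least_one(0.05, 10), 3))   # ~0.401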

  3. Hybrid Optimization Parallel Search PACKage

    SciTech Connect

    2009-11-10

    HOPSPACK is open source software for solving optimization problems without derivatives. Application problems may have a fully nonlinear objective function, bound constraints, and linear and nonlinear constraints. Problem variables may be continuous, integer-valued, or a mixture of both. The software provides a framework that supports any derivative-free type of solver algorithm. Through the framework, solvers request parallel function evaluation, which may use MPI (multiple machines) or multithreading (multiple processors/cores on one machine). The framework provides a Cache and Pending Cache of saved evaluations that reduces execution time and facilitates restarts. Solvers can dynamically create other algorithms to solve subproblems, a useful technique for handling multiple start points and integer-valued variables. HOPSPACK ships with the Generating Set Search (GSS) algorithm, developed at Sandia as part of the APPSPACK open source software project.
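
    The evaluation pattern described here, a derivative-free solver requesting batches of function evaluations in parallel while a cache of previously seen points prevents repeated work, can be sketched as follows. This is a generic Python illustration of the pattern, not the HOPSPACK API; the function and parameter names are invented.

        # Sketch: batched parallel function evaluation with a cache of seen points.
        from multiprocessing import Pool

        def objective(x):
            # stand-in for an expensive black-box function
            return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

        def evaluate_batch(points, cache, pool):
            new = [p for p in points if p not in cache]
            if new:
                for p, f in zip(new, pool.map(objective, new)):
                    cache[p] = f            # evaluations done in parallel, then cached
            return [cache[p] for p in points]

        if __name__ == "__main__":
            cache = {}
            with Pool(4) as pool:
                # a generating-set-search style stencil around the current iterate
                x = (0.0, 0.0); step = 0.5
                trial = [(x[0] + step, x[1]), (x[0] - step, x[1]),
                         (x[0], x[1] + step), (x[0], x[1] - step)]
                print(evaluate_batch(trial, cache, pool))
                print(evaluate_batch(trial, cache, pool))   # second call hits the cache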

  4. Hybrid Optimization Parallel Search PACKage

    Energy Science and Technology Software Center (ESTSC)

    2009-11-10

    HOPSPACK is open source software for solving optimization problems without derivatives. Application problems may have a fully nonlinear objective function, bound constraints, and linear and nonlinear constraints. Problem variables may be continuous, integer-valued, or a mixture of both. The software provides a framework that supports any derivative-free type of solver algorithm. Through the framework, solvers request parallel function evaluation, which may use MPI (multiple machines) or multithreading (multiple processors/cores on one machine). The framework provides a Cache and Pending Cache of saved evaluations that reduces execution time and facilitates restarts. Solvers can dynamically create other algorithms to solve subproblems, a useful technique for handling multiple start points and integer-valued variables. HOPSPACK ships with the Generating Set Search (GSS) algorithm, developed at Sandia as part of the APPSPACK open source software project.

  5. Parallel computing in enterprise modeling.

    SciTech Connect

    Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.

    2008-08-01

    This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.

  6. Integrated Task and Data Parallel Programming

    NASA Technical Reports Server (NTRS)

    Grimshaw, A. S.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers 1995 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated with Andrew Grimshaw and Adam Ferrari to write a book chapter which will be included in Parallel Processing in C++ edited by Gregory Wilson. I also finished two courses, Compilers and Advanced Compilers, in 1995. These courses complete my class requirements at the University of Virginia. I have only my dissertation research and defense to complete.

  7. STALK : an interactive virtual molecular docking system.

    SciTech Connect

    Levine, D.; Facello, M.; Hallstrom, P.; Reeder, G.; Walenz, B.; Stevens, F.; Univ. of Illinois

    1997-04-01

    Several recent technologies (genetic algorithms, parallel and distributed computing, virtual reality, and high-speed networking) underlie a new approach to the computational study of how biomolecules interact or 'dock' together. With the Stalk system, a user in a virtual reality environment can interact with a genetic algorithm running on a parallel computer to help in the search for likely geometric configurations.

  8. Fully Parallel MHD Stability Analysis Tool

    NASA Astrophysics Data System (ADS)

    Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang

    2014-10-01

    Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and it is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both the fluid and kinetic plasma models already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iteration algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating steps of the present MARS algorithm using parallel libraries and procedures. Initial results of the code parallelization will be reported. Work is supported by the U.S. DOE SBIR program.

  9. Parallel computation of invariant measures

    SciTech Connect

    Ding, J.; Liu, Y.

    1995-12-01

    A parallel numerical algorithm for computing invariant measures is presented. Let I^N ≡ [0,1]^N be the unit N-cube in R^N and let S : I^N → I^N be a nonsingular transformation, that is, S is Borel-measurable and m(A) = 0 implies m(S^{-1}(A)) = 0, where m is the Lebesgue measure. The motivation of this study is the parallel computation of an absolutely continuous invariant measure μ under S, that is, μ ≪ m and μ(A) = μ(S^{-1}(A)) for all Borel sets A ⊂ I^N. It is well known that an absolutely continuous finite invariant measure μ can be obtained by computing a fixed density of the Frobenius-Perron operator P_S : L^1(I^N) → L^1(I^N) associated with S, which is defined by (1) ∫_A P_S f dm = ∫_{S^{-1}(A)} f dm, for all f ∈ L^1(I^N). Using any suitable discretization scheme, the infinite-dimensional eigenvector problem P_S f = f in L^1(I^N) can be approximated by an algebraic eigenvector problem P_l f_l = f_l in ∇_l, where P_l is a finite approximation of P_S associated with a finite element subspace ∇_l of L^1(I^N) ∩ L^∞(I^N). It has been shown that for P_l arising from Galerkin's projection principle or the Markov finite approximation principle, there always exists an eigenvector f_l of P_l, and that a sequence of normalized eigenvectors (f_l) converges to the density of an absolutely continuous invariant probability measure μ for a class of piecewise C^2 expanding maps of I^N, under which the existence of μ is guaranteed by Góra-Boyarsky's theorem, which reduces to Lasota-Yorke's theorem when N = 1.
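
    As a concrete (serial) illustration of the Markov/Ulam-type finite approximation mentioned above, the sketch below estimates the invariant density of a 1-D expanding map by building a finite transition matrix and computing its fixed density by power iteration; distributing the rows of such a matrix over processors is the kind of step a parallel version would take. This is an illustrative sketch under those assumptions, not the algorithm of the paper.

        # Ulam-type finite approximation of the Frobenius-Perron operator (1-D sketch).
        import numpy as np

        def ulam_matrix(S, n_cells=200, samples_per_cell=500):
            P = np.zeros((n_cells, n_cells))
            edges = np.linspace(0.0, 1.0, n_cells + 1)
            for i in range(n_cells):
                # sample cell i and record which cells its image under S lands in
                x = np.random.uniform(edges[i], edges[i + 1], samples_per_cell)
                j = np.minimum((S(x) * n_cells).astype(int), n_cells - 1)
                np.add.at(P[i], j, 1.0 / samples_per_cell)
            return P

        S = lambda x: (2.0 * x) % 1.0          # doubling map; its invariant density is 1
        P = ulam_matrix(S)

        f = np.ones(P.shape[0])                # power iteration for the fixed density of P
        for _ in range(200):
            f = f @ P
            f *= P.shape[0] / f.sum()          # normalize so the density integrates to 1
        print(f.min(), f.max())                # both should be close to 1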

  10. Parallel automated adaptive procedures for unstructured meshes

    NASA Technical Reports Server (NTRS)

    Shephard, M. S.; Flaherty, J. E.; Decougny, H. L.; Ozturan, C.; Bottasso, C. L.; Beall, M. W.

    1995-01-01

    Consideration is given to the techniques required to support adaptive analysis of automatically generated unstructured meshes on distributed memory MIMD parallel computers. The key areas of new development are focused on the support of effective parallel computations when the structure of the numerical discretization, the mesh, is evolving, and in fact constructed, during the computation. All the procedures presented operate in parallel on already distributed mesh information. Starting from a mesh definition in terms of a topological hierarchy, techniques to support the distribution, redistribution and communication among the mesh entities over the processors are given, and algorithms to dynamically balance processor workload based on the migration of mesh entities are given. A procedure to automatically generate meshes in parallel, starting from CAD geometric models, is given. Parallel procedures to enrich the mesh through local mesh modifications are also given. Finally, the combination of these techniques to produce a parallel automated finite element analysis procedure for rotorcraft aerodynamics calculations is discussed and demonstrated.

  11. Parallel computing for probabilistic fatigue analysis

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Lua, Yuan J.; Smith, Mark D.

    1993-01-01

    This paper presents the results of Phase I research to investigate the most effective parallel processing software strategies and hardware configurations for probabilistic structural analysis. We investigate the efficiency of both shared and distributed-memory architectures via a probabilistic fatigue life analysis problem. We also present a parallel programming approach, the virtual shared-memory paradigm, that is applicable across both types of hardware. Using this approach, problems can be solved on a variety of parallel configurations, including networks of single or multiprocessor workstations. We conclude that it is possible to effectively parallelize probabilistic fatigue analysis codes; however, special strategies will be needed to achieve large-scale parallelism, to keep a large number of processors busy, and to treat problems with the large memory requirements encountered in practice. We also conclude that a distributed-memory architecture is preferable to shared memory for achieving large scale parallelism; however, in the future, the currently emerging hybrid-memory architectures will likely be optimal.

  12. Design considerations for parallel graphics libraries

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1994-01-01

    Applications which run on parallel supercomputers are often characterized by massive datasets. Converting these vast collections of numbers to visual form has proven to be a powerful aid to comprehension. For a variety of reasons, it may be desirable to provide this visual feedback at runtime. One way to accomplish this is to exploit the available parallelism to perform graphics operations in place. In order to do this, we need appropriate parallel rendering algorithms and library interfaces. This paper provides a tutorial introduction to some of the issues which arise in designing parallel graphics libraries and their underlying rendering algorithms. The focus is on polygon rendering for distributed memory message-passing systems. We illustrate our discussion with examples from PGL, a parallel graphics library which has been developed on the Intel family of parallel systems.

  13. A generic fine-grained parallel C

    NASA Technical Reports Server (NTRS)

    Hamet, L.; Dorband, John E.

    1988-01-01

    With the present availability of parallel processors of vastly different architectures, there is a need for a common language interface to multiple types of machines. The parallel C compiler, currently under development, is intended to be such a language. This language is based on the belief that an algorithm designed around fine-grained parallelism can be mapped relatively easily to different parallel architectures, since a large percentage of the parallelism has been identified. The compiler generates a FORTH-like machine-independent intermediate code. A machine-dependent translator will reside on each machine to generate the appropriate executable code, taking advantage of the particular architectures. The goal of this project is to allow a user to run the same program on such machines as the Massively Parallel Processor, the CRAY, the Connection Machine, and the CYBER 205 as well as serial machines such as VAXes, Macintoshes and Sun workstations.

  14. Towards Distributed Memory Parallel Program Analysis

    SciTech Connect

    Quinlan, D; Barany, G; Panas, T

    2008-06-17

    This paper presents a parallel attribute evaluation for distributed memory parallel computer architectures, where previously only shared memory parallel support for this technique had been developed. Attribute evaluation is a part of how attribute grammars are used for program analysis within modern compilers. Within this work, we have extended ROSE, an open compiler infrastructure, with a distributed memory parallel attribute evaluation mechanism to support user-defined global program analysis required for some forms of security analysis, which cannot be addressed by a file-by-file view of large scale applications. As a result, user-defined security analyses may now run in parallel without the user having to specify the way data is communicated between processors. The automation of communication enables an extensible open-source parallel program analysis infrastructure.

  15. Linearly exact parallel closures for slab geometry

    SciTech Connect

    Ji, Jeong-Young; Held, Eric D.; Jhang, Hogun

    2013-08-15

    Parallel closures are obtained by solving a linearized kinetic equation with a model collision operator using the Fourier transform method. The closures expressed in wave number space are exact for time-dependent linear problems to within the limits of the model collision operator. In the adiabatic, collisionless limit, an inverse Fourier transform is performed to obtain integral (nonlocal) parallel closures in real space; parallel heat flow and viscosity closures for density, temperature, and flow velocity equations replace Braginskii's parallel closure relations, and parallel flow velocity and heat flow closures for density and temperature equations replace Spitzer's parallel transport relations. It is verified that the closures reproduce the exact linear response function of Hammett and Perkins [Phys. Rev. Lett. 64, 3019 (1990)] for Landau damping given a temperature gradient. In contrast to their approximate closures where the vanishing viscosity coefficient numerically gives an exact response, our closures relate the heat flow and nonvanishing viscosity to temperature and flow velocity gradients.
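
    Schematically, a closure of this "integral (nonlocal)" type for the parallel heat flow can be written as shown below; this is only a generic form with an unspecified kernel K, intended to indicate what a nonlocal closure looks like relative to a local gradient-diffusion closure, and it does not reproduce the specific coefficients derived in the paper.

        % Generic form of a nonlocal parallel heat-flow closure: the local
        % (Braginskii-like) relation q = -kappa * grad_par T is replaced by a
        % convolution along the field line; the two forms are Fourier pairs.
        \[
          q_\parallel(\ell) = -\, n_0 \int K(\ell - \ell')\,
              \nabla_\parallel T(\ell')\, \mathrm{d}\ell'
          \qquad\Longleftrightarrow\qquad
          \hat q_\parallel(k_\parallel) = -\, n_0\, \hat K(k_\parallel)\,
              i k_\parallel\, \hat T(k_\parallel).
        \]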

  16. Parallel Activities in the Classroom

    ERIC Educational Resources Information Center

    Koole, Tom

    2007-01-01

    This paper reports on a study of classroom interaction as a multi-party and multi-activity phenomenon. On the basis of video-recorded lessons in secondary education schools in the Netherlands, observational records were made of the behaviour of individual students throughout lessons. The main argument in this paper is that when students engage in…

  17. Multipactor saturation in parallel-plate waveguides

    SciTech Connect

    Sorolla, E.; Mattes, M.

    2012-07-15

    The saturation stage of a multipactor discharge is considered of interest, since it can guide towards a criterion to assess the multipactor onset. The electron cloud under multipactor regime within a parallel-plate waveguide is modeled by a thin continuous distribution of charge and the equations of motion are calculated taking into account the space charge effects. The saturation is identified by the interaction of the electron cloud with its image charge. The stability of the electron population growth is analyzed and two mechanisms of saturation to explain the steady-state multipactor for voltages near above the threshold onset are identified. The impact energy in the collision against the metal plates decreases during the electron population growth due to the attraction of the electron sheet on the image through the initial plate. When this growth remains stable till the impact energy reaches the first cross-over point, the electron surface density tends to a constant value. When the stability is broken before reaching the first cross-over point the surface charge density oscillates chaotically bounded within a certain range. In this case, an expression to calculate the maximum electron surface charge density is found whose predictions agree with the simulations when the voltage is not too high.

  18. Toward an automated parallel computing environment for geosciences

    NASA Astrophysics Data System (ADS)

    Zhang, Huai; Liu, Mian; Shi, Yaolin; Yuen, David A.; Yan, Zhenzhen; Liang, Guoping

    2007-08-01

    Software for geodynamic modeling has not kept up with the fast growing computing hardware and network resources. In the past decade supercomputing power has become available to most researchers in the form of affordable Beowulf clusters and other parallel computer platforms. However, to take full advantage of such computing power requires developing parallel algorithms and associated software, a task that is often too daunting for geoscience modelers whose main expertise is in geosciences. We introduce here an automated parallel computing environment built on open-source algorithms and libraries. Users interact with this computing environment by specifying the partial differential equations, solvers, and model-specific properties using an English-like modeling language in the input files. The system then automatically generates the finite element codes that can be run on distributed or shared memory parallel machines. This system is dynamic and flexible, allowing users to address different problems in geosciences. It is capable of providing web-based services, enabling users to generate source codes online. This unique feature will facilitate high-performance computing to be integrated with distributed data grids in the emerging cyber-infrastructures for geosciences. In this paper we discuss the principles of this automated modeling environment and provide examples to demonstrate its versatility.

  19. Racing in parallel: Quantum versus Classical

    NASA Astrophysics Data System (ADS)

    Steiger, Damian S.; Troyer, Matthias

    In a fair comparison of the performance of a quantum algorithm to a classical one it is important to treat them on equal footing, both regarding resource usage and parallelism. We show how one may otherwise mistakenly attribute speedup due to parallelism as quantum speedup. We apply such an analysis both to analog quantum devices (quantum annealers) and gate model algorithms and give several examples where a careful analysis of parallelism makes a significant difference in the comparison between classical and quantum algorithms.

  20. Six-Degree-Of-Freedom Parallel Minimanipulator

    NASA Technical Reports Server (NTRS)

    Tahmasebi, Farhad; Tsai, Lung-Wen

    1994-01-01

    Six-degree-of-freedom parallel minimanipulator stiffer and simpler than earlier six-degree-of-freedom manipulators. Includes only three inextensible limbs with universal joints at ends. Limbs have equal lengths and act in parallel as they share load on manipulated platform. Designed to provide high resolution and high stiffness for fine control of position and force in hybrid serial/parallel-manipulator system.

  1. Automatic Multilevel Parallelization Using OpenMP

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

    2002-01-01

    In this paper we describe the extension of the CAPO parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report first results for several benchmark codes and one full application that have been parallelized using our system.

  2. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1995-01-01

    The scope of this project dealt with the investigation of the requirements to support distributed computing of scientific computations over a cluster of cooperative workstations. Various experiments on computations for the solution of simultaneous linear equations were performed in the early phase of the project to gain experience in the general nature and requirements of scientific applications. A specification of a distributed integrated computing environment, DICE, based on a distributed shared memory communication paradigm has been developed and evaluated. The distributed shared memory model facilitates porting existing parallel algorithms that have been designed for shared memory multiprocessor systems to the new environment. The potential of this new environment is to provide supercomputing capability through the utilization of the aggregate power of workstations cooperating in a cluster interconnected via a local area network. Workstations, generally, do not have the computing power to tackle complex scientific applications, making them primarily useful for visualization, data reduction, and filtering as far as complex scientific applications are concerned. There is a tremendous amount of computing power that is left unused in a network of workstations. Very often a workstation is simply sitting idle on a desk. A set of tools can be developed to take advantage of this potential computing power to create a platform suitable for large scientific computations. The integration of several workstations into a logical cluster of distributed, cooperative, computing stations presents an alternative to shared memory multiprocessor systems. In this project we designed and evaluated such a system.

  3. Vectoring of parallel synthetic jets

    NASA Astrophysics Data System (ADS)

    Berk, Tim; Ganapathisubramani, Bharathram; Gomit, Guillaume

    2015-11-01

    A pair of parallel synthetic jets can be vectored by applying a phase difference between the two driving signals. The resulting jet can be merged or bifurcated and either vectored towards the actuator leading in phase or the actuator lagging in phase. In the present study, the influence of phase difference and Strouhal number on the vectoring behaviour is examined experimentally. Phase-locked vorticity fields, measured using Particle Image Velocimetry (PIV), are used to track vortex pairs. The physical mechanisms that explain the diversity in vectoring behaviour are observed based on the vortex trajectories. For a fixed phase difference, the vectoring behaviour is shown to be primarily influenced by pinch-off time of vortex rings generated by the synthetic jets. Beyond a certain formation number, the pinch-off timescale becomes invariant. In this region, the vectoring behaviour is determined by the distance between subsequent vortex rings. We acknowledge the financial support from the European Research Council (ERC grant agreement no. 277472).

  4. Parallel processing network and method

    SciTech Connect

    DeBenedictis, E.P.

    1988-08-23

    This patent describes a parallel processing system including a plurality of processing nodes interconnected in a prescribed manner for passing messages directly between the processing nodes and message passing protocol apparatus at each node for sending and receiving the messages, the messages containing information identifying a protocol type to which a message belongs, the protocol type identifying rules for processing the message, an identification of a task being processed using the protocol type and data, the protocol apparatus comprising: input means for receiving a message from a first node connected to the input means and including means for signaling to the first node that the input means is empty, output means for sending a message to the input means for a second connected node only when the input means at the second node signals that it is empty, a protocol processor for independently performing protocol input functions and output functions on messages received and to be transmitted, respectively, according to the protocol type identified in the message; and a memory accessible by the protocol processor for storing state information individually pertaining to each task being processed and containing protocol status information for controlling the operations of the input and output functions pertaining to each task, the input and output functions operating to change the state information.

  5. Applications of Parallel Processing to Astrodynamics

    NASA Astrophysics Data System (ADS)

    Coffey, S.; Healy, L.; Neal, H.

    1996-03-01

    Parallel processing is being used to improve the catalog of earth orbiting satellites and for problems associated with the catalog. Initial efforts centered around using SIMD parallel processors to perform debris conjunction analysis and satellite dynamics studies. More recently, the availability of cheap supercomputing processors and parallel processing software such as PVM has enabled the reutilization of existing astrodynamics software in distributed parallel processing environments. Computations that once took many days on traditional mainframes are now performed in only a few hours. Efforts underway for the US Naval Space Command include conjunction prediction, uncorrelated target processing and a new space object catalog based on orbit determination and prediction with special perturbations methods.

  6. Parallelization of Apriori algorithm using Charm++ library

    NASA Astrophysics Data System (ADS)

    Puścian, Marek; Grabski, Waldemar

    2015-09-01

    This paper deals with the problem of adapting a sequential frequent itemset mining algorithm to parallel processing. The original Bodon's Apriori algorithm has been partitioned into loosely coupled tasks and prepared to be executed on several computation nodes using the Charm++ library. A variety of optimization methods has been proposed and successfully implemented in the parallel environment. The work provides enhancements that achieve good efficiency when parallelizing existing solutions, e.g. how to organize communication between tasks. The presented approach has been illustrated with many experiments and measurements performed on the parallelized algorithm.
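
    A common way to decouple Apriori's counting phase into loosely coupled tasks is count distribution: each worker counts the candidate itemsets over its own partition of the transactions, and the partial counts are then merged. The sketch below illustrates that idea in plain Python with invented toy data; it is not the Charm++ decomposition used in the paper.

        # Count-distribution sketch for one Apriori pass.
        from itertools import combinations
        from collections import Counter
        from multiprocessing import Pool

        TRANSACTIONS = [
            {"bread", "milk"}, {"bread", "beer", "eggs"},
            {"milk", "beer", "cola"}, {"bread", "milk", "beer"},
            {"bread", "milk", "cola"},
        ]
        CANDIDATES = [frozenset(c) for c in combinations({"bread", "milk", "beer"}, 2)]

        def count_partition(transactions):
            # each task counts candidate occurrences in its own slice of the data
            return Counter(c for t in transactions for c in CANDIDATES if c <= t)

        if __name__ == "__main__":
            partitions = [TRANSACTIONS[:3], TRANSACTIONS[3:]]
            with Pool(2) as pool:
                partial = pool.map(count_partition, partitions)
            total = sum(partial, Counter())          # merge the partial counts
            min_support = 2
            frequent = {c: n for c, n in total.items() if n >= min_support}
            print(frequent)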

  7. Parallel auto-correlative statistics with VTK.

    SciTech Connect

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10], which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by means of C++ code snippets, and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the auto-correlative statistics engine.
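
    Auto-correlation lends itself to this kind of parallel statistics engine because each process can accumulate partial sums over its own block of sample indices and the partials combine exactly. The Python sketch below shows that idea with multiprocessing; it is a generic illustration of the partial-sum pattern, not the VTK engine itself.

        # Sketch: lag-k autocorrelation from partial sums computed over index blocks.
        import numpy as np
        from multiprocessing import Pool

        rng = np.random.default_rng(0)
        X = rng.standard_normal(100_000)
        LAG = 5

        def partial_sums(idx_range):
            lo, hi = idx_range
            x = X[lo:hi]
            return x.sum(), (x * x).sum(), (x * X[lo + LAG:hi + LAG]).sum()

        if __name__ == "__main__":
            n = len(X) - LAG
            blocks = [(i, min(i + 25_000, n)) for i in range(0, n, 25_000)]
            with Pool(4) as pool:
                parts = pool.map(partial_sums, blocks)
            s, ss, sp = map(sum, zip(*parts))        # combine the partial sums
            mean = s / n
            r_k = (sp / n - mean * mean) / (ss / n - mean * mean)
            print(round(r_k, 4))                     # near 0 for white noise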

  8. Parallel Computation for Natural Convection in Cavities

    NASA Technical Reports Server (NTRS)

    Wang, P.; Ferraro, R. D.

    1995-01-01

    Parallel computation for thermal convective flows in cavities with adiabatic horizontal boundaries, driven by differential heating of the two vertical end walls, is investigated using supercomputers. A parallel computation code has been implemented using a finite-difference method with a multigrid elliptic solver and a Dufort-Frankel scheme. The domain decomposition techniques are discussed in detail. The parallel code is numerically stable, computationally efficient, and portable to various parallel architectures which support either PVM or NX libraries for communications. Finally, numerical results for various Rayleigh numbers and Prandtl numbers are presented.
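
    Domain decomposition for a finite-difference solver of this kind typically splits the cavity into strips and exchanges one layer of halo cells per step. The sketch below shows that communication structure in Python with mpi4py for a 1-D decomposition of a scalar field; it is an illustration of the pattern only (the update and boundary handling are dummies), not the PVM/NX code described above.

        # halo.py -- 1-D domain decomposition with halo exchange (run with mpiexec).
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
        left = rank - 1 if rank > 0 else MPI.PROC_NULL
        right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

        n_local = 50
        # one ghost cell on each side of the local strip of the temperature field
        T = np.zeros(n_local + 2)
        T[1:-1] = rank                      # dummy initial data

        for step in range(10):
            # exchange boundary layers with both neighbours
            comm.Sendrecv(T[1:2], dest=left, recvbuf=T[-1:], source=right)
            comm.Sendrecv(T[-2:-1], dest=right, recvbuf=T[0:1], source=left)
            # explicit diffusion-like update using the freshly received halos
            T[1:-1] = T[1:-1] + 0.1 * (T[2:] - 2.0 * T[1:-1] + T[:-2])

        print(f"rank {rank}: interior mean {T[1:-1].mean():.3f}")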

  9. Conformal pure radiation with parallel rays

    NASA Astrophysics Data System (ADS)

    Leistner, Thomas; Nurowski, Paweł

    2012-03-01

    We define pure radiation metrics with parallel rays to be n-dimensional pseudo-Riemannian metrics that admit a parallel null line bundle K and whose Ricci tensor vanishes on vectors that are orthogonal to K. We give necessary conditions in terms of the Weyl, Cotton and Bach tensors for a pseudo-Riemannian metric to be conformal to a pure radiation metric with parallel rays. Then, we derive conditions in terms of the tractor calculus that are equivalent to the existence of a pure radiation metric with parallel rays in a conformal class. We also give analogous results for n-dimensional pseudo-Riemannian pp-waves.

  10. Parallel debugging using graphical views. Technical report

    SciTech Connect

    Bailey, M.; Socha, D.; Notkin, D.

    1988-03-01

    Graphical views are essential for debugging parallel programs because of the large quantity of state information contained in parallel programs. Voyeur, a prototype system for creating graphical views of parallel programs, provides a cost-effective way to construct such views for any parallel-programming system. We illustrate Voyeur by discussing four views created for debugging Poker programs. One is a general trace facility for any Poker program. The other three are tailored to display a specific type of algorithmic information. Each of these views has been instrumental in detecting bugs that would have been difficult to detect otherwise, yet were obvious with the views.

  11. A parallel algorithm for global routing

    NASA Technical Reports Server (NTRS)

    Brouwer, Randall J.; Banerjee, Prithviraj

    1990-01-01

    A Parallel Hierarchical algorithm for Global Routing (PHIGURE) is presented. The router is based on the work of Burstein and Pelavin, but has many extensions for general global routing and parallel execution. Main features of the algorithm include structured hierarchical decomposition into separate independent tasks which are suitable for parallel execution and adaptive simplex solution for adding feedthroughs and adjusting channel heights for row-based layout. Alternative decomposition methods and the various levels of parallelism available in the algorithm are examined closely. The algorithm is described and results are presented for a shared-memory multiprocessor implementation.

  12. Use Computer-Aided Tools to Parallelize Large CFD Applications

    NASA Technical Reports Server (NTRS)

    Jin, H.; Frumkin, M.; Yan, J.

    2000-01-01

    Porting applications to high-performance parallel computers is always a challenging task. It is time consuming and costly. With rapid progress in hardware architectures and the increasing complexity of real applications in recent years, the problem has become even more severe. Today, scalability and high performance mostly rely on handwritten parallel programs using message-passing libraries (e.g. MPI). However, this process is very difficult and often error-prone. The recent reemergence of shared memory parallel (SMP) architectures, such as the cache coherent Non-Uniform Memory Access (ccNUMA) architecture used in the SGI Origin 2000, shows good prospects for scaling beyond hundreds of processors. Programming on an SMP is simplified by working in a globally accessible address space. The user can supply compiler directives, such as OpenMP, to parallelize the code. As an industry standard for portable implementation of parallel programs for SMPs, OpenMP is a set of compiler directives and callable runtime library routines that extend Fortran, C and C++ to express shared memory parallelism. It promises an incremental path for parallel conversion of existing software, as well as scalability and performance for a complete rewrite or an entirely new development. Perhaps the main disadvantage of programming with directives is that inserted directives may not necessarily enhance performance; in the worst cases they can produce erroneous results. While vendors have provided tools to perform error checking and profiling, automation in directive insertion is very limited and often fails on large programs, primarily because the data dependence analysis is not thorough enough. To overcome this deficiency, we have developed a toolkit, CAPO, to automatically insert OpenMP directives in Fortran programs and apply certain degrees of optimization. CAPO takes advantage of the detailed interprocedural dependence analysis provided by CAPTools, developed by the University of Greenwich, to reduce potential errors made by users. Earlier tests on the NAS Benchmarks and ARC3D demonstrated good success of this tool. In this study, we have applied CAPO to parallelize three large applications in the area of computational fluid dynamics (CFD): OVERFLOW, TLNS3D and INS3D. These codes are widely used for solving the Navier-Stokes equations with complicated boundary conditions and turbulence models in multiple zones. Each comprises from 50K to 100K lines of FORTRAN77. As an example, CAPO took 77 hours to complete the data dependence analysis of OVERFLOW on a workstation (SGI, 175 MHz R10K processor). A fair amount of effort was spent on correcting false dependences caused by a lack of necessary knowledge during the analysis. Even so, CAPO provides an easy way for the user to interact with the parallelization process, and the OpenMP version was generated within a day after the analysis was completed. Because of the sequential algorithms involved, code sections in TLNS3D and INS3D had to be restructured by hand to produce more efficient parallel code. An included figure shows preliminary test results of the generated OVERFLOW for several single-zone test cases. The MPI data points for the small test case were taken from a hand-coded MPI version. CAPO's version achieved an 18-fold speedup on 32 nodes of the SGI Origin 2000 and, for the small test case, outperformed the MPI version. These results are very encouraging, but further work is needed. For example, although CAPO attempts to place directives on the outermost parallel loops in an interprocedural framework, it does not insert directives based on the best manual strategy; in particular, it lacks support for parallelization at the multi-zone level. Future work will emphasize the development of a methodology to work at the multi-zone level and with a hybrid approach. Development of tools to perform more complicated code transformations is also needed.
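
    As a minimal illustration of the directive-based style discussed above (a hedged sketch in C rather than CAPO's Fortran output, with loop and variable names invented for the example), a single OpenMP directive is enough to parallelize a data-independent loop:

        #include <stdio.h>
        #include <omp.h>

        #define N 1000000

        int main(void)
        {
            static double a[N], b[N];
            double sum = 0.0;

            /* Each thread works on a disjoint chunk of the iteration space;
               the reduction clause combines the per-thread partial sums. */
            #pragma omp parallel for reduction(+:sum)
            for (int i = 0; i < N; i++) {
                a[i] = 0.5 * i;
                b[i] = 2.0 * i;
                sum += a[i] * b[i];
            }

            printf("dot product = %g, threads available = %d\n",
                   sum, omp_get_max_threads());
            return 0;
        }

    Compiling with an OpenMP-capable compiler (e.g. cc -fopenmp) and setting OMP_NUM_THREADS is all that is needed to run the loop in parallel; this incremental style is what CAPO automates for Fortran codes.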

  13. Dynamic force spectroscopy of parallel individual mucin1-antibody bonds

    SciTech Connect

    Sulchek, T A; Friddle, R W; Langry, K; Lau, E; Albrecht, H; Ratto, T; DeNardo, S; Colvin, M E; Noy, A

    2005-05-02

    We used atomic force microscopy (AFM) to measure the binding forces between Mucin1 (MUC1) peptide and a single-chain antibody fragment (scFv) selected from a scFv library screened against MUC1. This binding interaction is central to the design of molecules for targeted delivery of radioimmunotherapeutic agents for prostate and breast cancer treatment. Our experiments separated the specific binding interaction from non-specific interactions by tethering the antibody and MUC1 molecules to the AFM tip and sample surface with flexible polymer spacers. The rupture force magnitude and the elastic characteristics of the spacers allowed identification of the bond rupture events corresponding to different numbers of interacting proteins. We used dynamic force spectroscopy to estimate the intermolecular potential widths and equivalent thermodynamic off-rates for mono-, bi-, and tri-valent interactions. The measured interaction potential parameters agree with the results of molecular docking simulations. Our results demonstrate that an increase of the interaction valency leads to a precipitous decline in the dissociation rate. Binding forces measured for mono- and multivalent interactions match the predictions of a Markovian model for the strength of multiple uncorrelated bonds in a parallel configuration. Our approach is promising for comparison of the specific effects of molecular modifications as well as for determination of the best configuration of antibody-based multivalent targeting agents.
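
    For context, the single-barrier (Bell-Evans) model most commonly used to interpret such dynamic force spectroscopy data relates the most probable rupture force F* to the loading rate r; it is quoted here only in its standard form, since the abstract does not state the authors' exact expressions:

        F^{*}(r) = \frac{k_{B}T}{x_{\beta}} \ln\!\left(\frac{r\, x_{\beta}}{k_{\mathrm{off}}\, k_{B}T}\right)

    Here x_beta is the width of the interaction potential and k_off the thermodynamic off-rate, i.e. the two quantities the measurement extracts.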

  14. Parallel genotypic adaptation: when evolution repeats itself

    PubMed Central

    Wood, Troy E.; Burke, John M.; Rieseberg, Loren H.

    2008-01-01

    Until recently, parallel genotypic adaptation was considered unlikely because phenotypic differences were thought to be controlled by many genes. There is increasing evidence, however, that phenotypic variation sometimes has a simple genetic basis and that parallel adaptation at the genotypic level may be more frequent than previously believed. Here, we review evidence for parallel genotypic adaptation derived from a survey of the experimental evolution, phylogenetic, and quantitative genetic literature. The most convincing evidence of parallel genotypic adaptation comes from artificial selection experiments involving microbial populations. In some experiments, up to half of the nucleotide substitutions found in independent lineages under uniform selection are the same. Phylogenetic studies provide a means for studying parallel genotypic adaptation in non-experimental systems, but conclusive evidence may be difficult to obtain because homoplasy can arise for other reasons. Nonetheless, phylogenetic approaches have provided evidence of parallel genotypic adaptation across all taxonomic levels, not just microbes. Quantitative genetic approaches also suggest parallel genotypic evolution across both closely and distantly related taxa, but it is important to note that this approach cannot distinguish between parallel changes at homologous loci versus convergent changes at closely linked non-homologous loci. The finding that parallel genotypic adaptation appears to be frequent and occurs at all taxonomic levels has important implications for phylogenetic and evolutionary studies. With respect to phylogenetic analyses, parallel genotypic changes, if common, may result in faulty estimates of phylogenetic relationships. From an evolutionary perspective, the occurrence of parallel genotypic adaptation provides increasing support for determinism in evolution and may provide a partial explanation for how species with low levels of gene flow are held together. PMID:15881688

  15. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  16. Parallelization of MRCI based on hole-particle symmetry.

    PubMed

    Suo, Bing; Zhai, Gaohong; Wang, Yubin; Wen, Zhenyi; Hu, Xiangqian; Li, Lemin

    2005-01-15

    The parallel implementation of a multireference configuration interaction program based on hole-particle symmetry is described. The platform used for the parallelization is an Intel-architecture cluster consisting of 12 nodes, each of which is equipped with two 2.4-GHz Xeon processors, 3 GB of memory, and a 36-GB disk, connected by a Gigabit Ethernet switch. The dependence of speedup on molecular symmetry and task granularity is discussed. Test calculations show that the speedup gained each time the number of nodes is doubled is about 1.9 (for C1 and Cs), 1.65 (for C2v), and 1.55 (for D2h). The largest calculation performed on this cluster involves 5.6 x 10^8 CSFs. PMID:15538769
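
    To put the quoted per-doubling factors in perspective (a simple extrapolation added here, not a figure reported in the paper): if each doubling of the node count multiplies the speedup by a factor s, then on 2^k nodes

        S(2^{k}) \approx s^{k}, \qquad \text{e.g.}\ S(8) \approx 1.65^{3} \approx 4.5\ \text{for}\ C_{2v}.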

  17. Reducing neural network training time with parallel processing

    NASA Technical Reports Server (NTRS)

    Rogers, James L., Jr.; Lamarsh, William J., II

    1995-01-01

    Obtaining optimal solutions for engineering design problems is often expensive because the process typically requires numerous iterations involving analysis and optimization programs. Previous research has shown that a near-optimum solution can be obtained in less time by simulating a slow, expensive analysis with a fast, inexpensive neural network. A new approach has been developed to further reduce this time. This approach decomposes a large neural network into many smaller neural networks that can be trained in parallel. Guidelines are developed to avoid some of the pitfalls of training smaller neural networks in parallel. These guidelines allow the engineer to determine the number of nodes on the hidden layer of the smaller neural networks, to choose the initial training weights, and to select a network configuration that will capture the interactions among the smaller neural networks. This paper presents results describing how these guidelines are developed.

  18. Single-cell mechanics: the parallel plates technique.

    PubMed

    Bufi, Nathalie; Durand-Smet, Pauline; Asnacios, Atef

    2015-01-01

    We describe here the parallel plates technique which enables quantifying single-cell mechanics, either passive (cell deformability) or active (whole-cell traction forces). Based on the bending of glass microplates of calibrated stiffness, it is easy to implement on any microscope, and benefits from protocols and equipment already used in biology labs (coating of glass slides, pipette pullers, micromanipulators, etc.). We first present the principle of the technique, the design and calibration of the microplates, and various surface coatings corresponding to different cell-substrate interactions. Then we detail the specific cell preparation for the assays, and the different mechanical assays that can be carried out. Finally, we discuss the possible technical simplifications and the specificities of each mechanical protocol, as well as the possibility of extending the use of the parallel plates to investigate the mechanics of cell aggregates or tissues. PMID:25640430
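
    In essence the flexible microplate is used as a calibrated spring, so the whole-cell force follows directly from the measured tip deflection (the elementary relation behind the method, stated here for orientation rather than quoted from the chapter):

        F = k\,\delta

    with k the calibrated plate stiffness and delta the deflection of the flexible plate.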

  19. Parallel discrete event simulation: A shared memory approach

    NASA Technical Reports Server (NTRS)

    Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.

    1987-01-01

    With traditional event-list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.
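
    The conservative rule underlying the Chandy-Misra algorithm can be stated compactly: a logical process may only execute events whose timestamps do not exceed the minimum clock over its input channels. A hedged C sketch of just that rule (data structures and names are illustrative, not taken from the paper):

        #include <float.h>

        /* Illustrative logical-process state for a conservative
           (Chandy-Misra style) simulation. */
        struct channel { double clock; };   /* timestamp of last message seen */
        struct lp {
            struct channel *in;             /* input channels */
            int n_in;
        };

        /* An LP may safely process any pending event with timestamp <= safe_time();
           null messages are the usual device that keeps these channel clocks
           advancing and avoids deadlock. */
        double safe_time(const struct lp *p)
        {
            double t = DBL_MAX;
            for (int i = 0; i < p->n_in; i++)
                if (p->in[i].clock < t)
                    t = p->in[i].clock;
            return t;
        }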

  20. On the dimensionally correct kinetic theory of turbulence for parallel propagation

    SciTech Connect

    Gaelzer, R.; Ziebell, L. F.; Yoon, P. H.; Kim, Sunjung

    2015-03-15

    Yoon and Fang [Phys. Plasmas 15, 122312 (2008)] formulated a second-order nonlinear kinetic theory that describes turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field. Their theory also includes discrete-particle effects, or the effects due to spontaneously emitted thermal fluctuations. However, the terms associated with the spontaneous fluctuations in the particle and wave kinetic equations of their theory have the proper dimensionality only for an artificial one-dimensional situation. The present paper extends the analysis and re-derives the dimensionally correct kinetic equations for the three-dimensional case. The new formalism properly describes the effects of spontaneous fluctuations emitted in three-dimensional space, while the collectively emitted turbulence propagates predominantly in directions parallel/anti-parallel to the ambient magnetic field. As a first step, the present investigation focuses on linear wave-particle interaction terms only. A subsequent paper will include the dimensionally correct nonlinear wave-particle interaction terms.

  1. Computing Flow Transition On Parallel Processors

    NASA Technical Reports Server (NTRS)

    Bokhari, S.; Erlebacher, G.; Hussaini, M. Y.

    1993-01-01

    Parallel algorithm developed on a multiple-microprocessor computer. Program initiated to develop computer codes capable of directly simulating and mathematically modeling the transition process at Mach numbers ranging from subsonic to hypersonic. Parallel computers potentially offer a reduction of processing time, with processing time inversely proportional to the number of available processors.
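
    The stated inverse proportionality is the ideal limit; as a hedged reminder (the standard Amdahl argument, not part of the abstract), any serial fraction f of the work bounds the attainable reduction:

        T(P) = T_{1}\left(f + \frac{1-f}{P}\right), \qquad S(P) = \frac{T_{1}}{T(P)} \le \frac{1}{f}.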

  2. MULTIOBJECTIVE PARALLEL GENETIC ALGORITHM FOR WASTE MINIMIZATION

    EPA Science Inventory

    In this research we have developed an efficient multiobjective parallel genetic algorithm (MOPGA) for waste minimization problems. This MOPGA integrates PGAPack (Levine, 1996) and NSGA-II (Deb, 2000) with novel modifications. PGAPack is a master-slave parallel implementation of a...

  3. On parallel search of DNA sequence databases

    SciTech Connect

    Guan, Xiaogun; Mann, R.; Mural, R.; Uberbacher, E.

    1991-01-01

    This paper describes the development of large-scale parallel search methods for DNA databases using a dynamic programming algorithm on an Intel iPSC/860 parallel computer. The performance of these methods has been measured, and several strategies for improving performance are discussed. 6 refs., 2 figs., 2 tabs.
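
    The dynamic programming at the core of such searches is the familiar alignment recurrence; a hedged serial sketch in C is given below (a generic Needleman-Wunsch style global-alignment score with an illustrative scoring scheme; the paper's actual algorithm and its iPSC/860 decomposition are not reproduced):

        #include <string.h>
        #include <stdlib.h>

        #define MATCH     2
        #define MISMATCH -1
        #define GAP      -2

        static int max3(int a, int b, int c)
        {
            int m = a > b ? a : b;
            return m > c ? m : c;
        }

        /* Global-alignment score of sequences a and b, filling the DP table
           two rows at a time.  Parallel versions typically compute the
           anti-diagonals of this table concurrently, since cell (i,j) depends
           only on (i-1,j), (i,j-1) and (i-1,j-1). */
        int align_score(const char *a, const char *b)
        {
            int n = (int)strlen(a), m = (int)strlen(b);
            int *prev = malloc((m + 1) * sizeof *prev);
            int *curr = malloc((m + 1) * sizeof *curr);

            for (int j = 0; j <= m; j++) prev[j] = j * GAP;
            for (int i = 1; i <= n; i++) {
                curr[0] = i * GAP;
                for (int j = 1; j <= m; j++) {
                    int s = (a[i-1] == b[j-1]) ? MATCH : MISMATCH;
                    curr[j] = max3(prev[j-1] + s, prev[j] + GAP, curr[j-1] + GAP);
                }
                int *tmp = prev; prev = curr; curr = tmp;
            }
            int score = prev[m];
            free(prev); free(curr);
            return score;
        }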

  4. Parallel Computing Strategies for Irregular Algorithms

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.

  5. Parallel unstructured grid generation for computational aerosciences

    NASA Technical Reports Server (NTRS)

    Shephard, Mark S.

    1993-01-01

    The objective of this research project is to develop efficient parallel automatic grid generation procedures for use in computational aerosciences. This effort is focused on a parallel version of the Finite Octree grid generator. Progress made during the first six months is reported.

  6. RAM-Based parallel-output controller

    NASA Technical Reports Server (NTRS)

    Niswander, J. K.; Stattel, R. J.

    1980-01-01

    Selected bit strings in serial-data link are extracted for processing. Controller is programmable interface between serial-data link and peripherals that accept parallel data. It can be used to drive displays, printers, plotters, digital-to-analog converters, and parallel-output ports.

  7. Parallel computation of recoupling coefficients using transputers

    NASA Astrophysics Data System (ADS)

    Fack, V.; Van der Jeugt, J.; Rao, K. Srinivasa

    1992-09-01

    Parallel algorithms for the computation of angular momentum recoupling coefficients are discussed. The first situation where parallelisation has a remarkable impact is for the computation of the 9-j coefficient. A parallel program in C for the numerical calculation of the 9-j coefficient is presented and compared with sequential programs in C.

  8. Parallel Activation in Bilingual Phonological Processing

    ERIC Educational Resources Information Center

    Lee, Su-Yeon

    2011-01-01

    In bilingual language processing, the parallel activation hypothesis suggests that bilinguals activate their two languages simultaneously during language processing. Support for the parallel activation mainly comes from studies of lexical (word-form) processing, with relatively less attention to phonological (sound) processing. According to…

  9. Calculating real Delbrück amplitudes on parallel processors

    NASA Astrophysics Data System (ADS)

    Kahane, Sylvian

    1991-12-01

    Calculation of the real Delbrück scattering amplitudes is parallelized by concurrent evaluation of 20 four-dimensional integrals. Two approaches were used: (a) a farm of master and worker tasks, and (b) the Cubix concept of parallelization. We discuss load balancing, timing and the efficiency of the implementation.
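
    The "farm of master and worker tasks" is a generic pattern; the following is a hedged MPI sketch of it in C (the task content, tags and counts are illustrative, and the original transputer/Cubix environment is not reproduced):

        #include <mpi.h>
        #include <stdio.h>

        #define NTASKS   20   /* e.g. one task per four-dimensional integral */
        #define TAG_WORK  1
        #define TAG_STOP  2

        /* Placeholder for the real computation (here: a trivial function of the id). */
        static double do_task(int id) { return (double)id * id; }

        int main(int argc, char **argv)
        {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            if (rank == 0) {                 /* master: deal out tasks */
                double total = 0.0, part;
                int next = 0, active = 0;
                MPI_Status st;
                /* prime every worker with an initial task, or tell it to stop */
                for (int w = 1; w < size; w++) {
                    if (next < NTASKS) {
                        MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                        next++; active++;
                    } else {
                        MPI_Send(&next, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
                    }
                }
                /* collect results and hand out the remaining tasks */
                while (active > 0) {
                    MPI_Recv(&part, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                             MPI_COMM_WORLD, &st);
                    total += part;
                    if (next < NTASKS) {
                        MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                                 MPI_COMM_WORLD);
                        next++;
                    } else {
                        MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP,
                                 MPI_COMM_WORLD);
                        active--;
                    }
                }
                printf("sum of all task results = %g\n", total);
            } else {                         /* worker: loop until told to stop */
                int id;
                MPI_Status st;
                for (;;) {
                    MPI_Recv(&id, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
                    if (st.MPI_TAG == TAG_STOP) break;
                    double r = do_task(id);
                    MPI_Send(&r, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
                }
            }
            MPI_Finalize();
            return 0;
        }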

  10. Multilevel Parallelization of AutoDock 4.2

    PubMed Central

    2011-01-01

    Background Virtual (computational) screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions. AutoDock is a serial application, though several previous efforts have parallelized various aspects of the program. In this paper, we report on a multi-level parallelization of AutoDock 4.2 (mpAD4). Results Using MPI and OpenMP, AutoDock 4.2 was parallelized for use on MPI-enabled systems and to multithread the execution of individual docking jobs. In addition, code was implemented to reduce input/output (I/O) traffic by reusing grid maps at each node from docking to docking. Performance of mpAD4 was examined on two multiprocessor computers. Conclusions Using MPI with OpenMP multithreading, mpAD4 scales with near linearity on the multiprocessor systems tested. In situations where I/O is limiting, reuse of grid maps reduces both system I/O and overall screening time. Multithreading of AutoDock's Lamarckian Genetic Algorithm with OpenMP increases the speed of execution of individual docking jobs, and when combined with MPI parallelization can significantly reduce the execution time of virtual screens. This work is significant in that mpAD4 speeds the execution of certain molecular docking workloads and allows the user to optimize the degree of system-level (MPI) and node-level (OpenMP) parallelization to best fit both workloads and computational resources. PMID:21527034
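
    A hedged skeleton of the two-level decomposition described here, with MPI providing the system-level parallelism across dockings and OpenMP threading the work within one docking; all names are illustrative and nothing is taken from the AutoDock source:

        #include <mpi.h>
        #include <omp.h>
        #include <stdio.h>

        #define NJOBS 64                      /* e.g. ligands to dock */

        static void run_job(int job)          /* placeholder for one docking */
        {
            double acc = 0.0;
            /* node-level parallelism: threads share the work of a single job */
            #pragma omp parallel for reduction(+:acc)
            for (int i = 0; i < 100000; i++)
                acc += (double)(job + i) * 1e-6;
            printf("job %d done (acc=%g, %d threads)\n",
                   job, acc, omp_get_max_threads());
        }

        int main(int argc, char **argv)
        {
            int provided, rank, size;
            /* MPI provides the system-level parallelism across nodes */
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            /* static round-robin assignment of jobs to ranks */
            for (int job = rank; job < NJOBS; job += size)
                run_job(job);

            MPI_Finalize();
            return 0;
        }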

  11. National Combustion Code: Parallel Implementation and Performance

    NASA Technical Reports Server (NTRS)

    Quealy, A.; Ryder, R.; Norris, A.; Liu, N.-S.

    2000-01-01

    The National Combustion Code (NCC) is being developed by an industry-government team for the design and analysis of combustion systems. CORSAIR-CCD is the current baseline reacting flow solver for NCC. This is a parallel, unstructured grid code which uses a distributed memory, message passing model for its parallel implementation. The focus of the present effort has been to improve the performance of the NCC flow solver to meet combustor designer requirements for model accuracy and analysis turnaround time. Improving the performance of this code contributes significantly to the overall reduction in time and cost of the combustor design cycle. This paper describes the parallel implementation of the NCC flow solver and summarizes its current parallel performance on an SGI Origin 2000. Earlier parallel performance results on an IBM SP-2 are also included. The performance improvements which have enabled a turnaround of less than 15 hours for a 1.3 million element fully reacting combustion simulation are described.

  12. Broadcasting a message in a parallel computer

    DOEpatents

    Berg, Jeremy E.; Faraj, Ahmad A.

    2011-08-02

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point-to-point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
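
    One simple way to realize a Hamiltonian path through every node of a plane of a 2D mesh is a serpentine (boustrophedon) ordering; the sketch below is offered only as an illustration of the idea, not as the construction claimed in the patent:

        #include <stdio.h>

        #define NX 4
        #define NY 3

        /* Emit a serpentine visiting order that touches every (x,y) node of an
           NX-by-NY plane exactly once; forwarding a broadcast message hop by hop
           along such an order reaches all nodes without duplication. */
        int main(void)
        {
            int step = 0;
            for (int y = 0; y < NY; y++) {
                if (y % 2 == 0)
                    for (int x = 0; x < NX; x++)
                        printf("hop %2d -> node (%d,%d)\n", step++, x, y);
                else
                    for (int x = NX - 1; x >= 0; x--)
                        printf("hop %2d -> node (%d,%d)\n", step++, x, y);
            }
            return 0;
        }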

  13. Implementation and performance of parallel Prolog interpreter

    SciTech Connect

    Wei, S.; Kale, L.V.; Balkrishna, R.

    1988-01-01

    In this paper, the authors discuss the implementation of a parallel Prolog interpreter on different parallel machines. The implementation is based on the REDUCE-OR process model, which exploits both AND and OR parallelism in logic programs. It is machine independent as it runs on top of the chare kernel, a machine-independent parallel programming system. The authors also give the performance of the interpreter running a diverse set of benchmark programs on parallel machines, including shared-memory systems (an Alliant FX/8, a Sequent, and a MultiMax) and a non-shared-memory system (an Intel iPSC/32 hypercube), in addition to its performance on a multiprocessor simulation system.

  14. Differences Between Distributed and Parallel Systems

    SciTech Connect

    Brightwell, R.; Maccabe, A.B.; Rissen, R.

    1998-10-01

    Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.

  15. Parallel Tempering for the Traveling Salesman Problem

    NASA Astrophysics Data System (ADS)

    Wang, Chiaming; Hyman, Jeffrey D.; Percus, Allon; Caflisch, Russel

    We explore the potential of parallel tempering as a combinatorial optimization method, applying it to the traveling salesman problem. We compare simulation results of parallel tempering with a benchmark implementation of simulated annealing, and study how different choices of parameters affect the relative performance of the two methods. We find that a straightforward implementation of parallel tempering can outperform simulated annealing in several crucial respects. When parameters are chosen appropriately, both methods yield close approximations to the actual minimum distance for an instance with 200 nodes. However, parallel tempering yields more consistently accurate results when a series of independent simulations is performed. Our results suggest that parallel tempering might offer a simple but powerful alternative to simulated annealing for combinatorial optimization problems.
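
    For reference, the replica-exchange move at the heart of parallel tempering swaps configurations between replicas i and j (inverse temperatures beta_i, beta_j; tour lengths E_i, E_j) with the standard acceptance probability (the generic rule; the authors' specific temperature schedule is not given in the abstract):

        P_{\mathrm{swap}} = \min\bigl\{1,\ \exp\bigl[(\beta_i-\beta_j)(E_i-E_j)\bigr]\bigr\}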

  16. Parallel hypergraph partitioning for scientific computing.

    SciTech Connect

    Heaphy, Robert; Devine, Karen Dragon; Catalyurek, Umit; Bisseling, Robert; Hendrickson, Bruce Alan; Boman, Erik Gunnar

    2005-07-01

    Graph partitioning is often used for load balancing in parallel computing, but it is known that hypergraph partitioning has several advantages. First, hypergraphs more accurately model communication volume, and second, they are more expressive and can better represent nonsymmetric problems. Hypergraph partitioning is particularly suited to parallel sparse matrix-vector multiplication, a common kernel in scientific computing. We present a parallel software package for hypergraph (and sparse matrix) partitioning developed at Sandia National Labs. The algorithm is a variation on multilevel partitioning. Our parallel implementation is novel in that it uses a two-dimensional data distribution among processors. We present empirical results that show our parallel implementation achieves good speedup on several large problems (up to 33 million nonzeros) with up to 64 processors on a Linux cluster.

  17. Configuration space representation in parallel coordinates

    NASA Technical Reports Server (NTRS)

    Fiorini, Paolo; Inselberg, Alfred

    1989-01-01

    By means of a system of parallel coordinates, a nonprojective mapping from R^N to R^2 is obtained for any positive integer N. In this way multivariate data and relations can be represented in the Euclidean plane (embedded in the projective plane). Basically, R^2 with Cartesian coordinates is augmented by N parallel axes, one for each variable. The N joint variables of a robotic device can be represented graphically by using parallel coordinates. It is pointed out that some properties of the relation are better perceived visually from the parallel coordinate representation, and that new algorithms and data structures can be obtained from this representation. The main features of parallel coordinates are described, and an example is presented of their use for configuration space representation of a mechanical arm (where Cartesian coordinates cannot be used).
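
    Read concretely (a hedged sketch; the normalization and names are illustrative), the mapping turns each N-dimensional point into a polyline with one vertex per axis, the axis index giving the horizontal position and the scaled coordinate value the vertical one:

        #include <stdio.h>

        #define N 5   /* e.g. five joint variables of a robotic arm */

        /* Map one N-dimensional point onto its parallel-coordinates polyline:
           vertex i sits on axis i (x = i) at the value normalized to [0,1]. */
        void to_parallel_coords(const double p[N], const double lo[N],
                                const double hi[N], double poly[N][2])
        {
            for (int i = 0; i < N; i++) {
                poly[i][0] = (double)i;                         /* which axis  */
                poly[i][1] = (p[i] - lo[i]) / (hi[i] - lo[i]);  /* where on it */
            }
        }

        int main(void)
        {
            double p[N]  = {0.3, -1.2, 2.5, 0.0, 1.1};
            double lo[N] = {-2, -2, -2, -2, -2}, hi[N] = {3, 3, 3, 3, 3};
            double poly[N][2];
            to_parallel_coords(p, lo, hi, poly);
            for (int i = 0; i < N; i++)
                printf("axis %d: y = %.3f\n", i, poly[i][1]);
            return 0;
        }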

  18. Integrated Optoelectronics for Parallel Microbioanalysis

    NASA Technical Reports Server (NTRS)

    Stirbl, Robert; Moynihan, Philip; Bearman, Gregory; Lane, Arthur

    2003-01-01

    Miniature, relatively inexpensive microbioanalytical systems ("laboratory-on-a-chip" devices) have been proposed for the detection of hazardous microbes and toxic chemicals. Each system of this type would include optoelectronic sensors and sensor-output-processing circuitry that would simultaneously look for the optical change, fluorescence, delayed fluorescence, or phosphorescence signatures from multiple redundant sites that have interacted with the test biomolecules, in order to detect which one(s) were present in a given situation. These systems could be used in a variety of settings, including doctors' offices, hospitals, hazardous-material laboratories, biological-research laboratories, military operations, and chemical-processing plants.

  19. The parallel diffusion of cosmic rays in a random magnetic field.

    NASA Technical Reports Server (NTRS)

    Klimas, A.; Sandri, G.

    1973-01-01

    Within the quasi-linear approximation, the existence of the parallel diffusion coefficient for cosmic rays in a random magnetic field (homogeneous, isotropic), despite the slow decay of the interaction between particles and random field, is demonstrated. As an example, the results of a numerical calculation of the parallel diffusion coefficient for a Gaussian random-field correlation function are presented. The numerical results are corroborated by asymptotic analysis and are compared to those of other theories.

  20. Conservation of writhe helicity under anti-parallel reconnection

    NASA Astrophysics Data System (ADS)

    Laing, Christian E.; Ricca, Renzo L.; Sumners, De Witt L.

    2015-03-01

    Reconnection is a fundamental event in many areas of science, from the interaction of vortices in classical and quantum fluids, and magnetic flux tubes in magnetohydrodynamics and plasma physics, to the recombination in polymer physics and DNA biology. By using fundamental results in topological fluid mechanics, the helicity of a flux tube can be calculated in terms of writhe and twist contributions. Here we show that the writhe is conserved under anti-parallel reconnection. Hence, for a pair of interacting flux tubes of equal flux, if the twist of the reconnected tube is the sum of the original twists of the interacting tubes, then helicity is conserved during reconnection. Thus, any deviation from helicity conservation is entirely due to the intrinsic twist inserted or deleted locally at the reconnection site. This result has important implications for helicity and energy considerations in various physical contexts.
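
    For reference, the writhe/twist decomposition invoked here is, for a single flux tube of flux Phi, usually written in the standard form from topological fluid mechanics (the paper's own notation may differ):

        H = (Wr + Tw)\,\Phi^{2}

    so conservation of Wr under anti-parallel reconnection means that any change in H is carried entirely by the twist Tw inserted or deleted locally at the reconnection site, as the abstract states.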

  1. Conservation of writhe helicity under anti-parallel reconnection

    PubMed Central

    Laing, Christian E.; Ricca, Renzo L.; Sumners, De Witt L.

    2015-01-01

    Reconnection is a fundamental event in many areas of science, from the interaction of vortices in classical and quantum fluids, and magnetic flux tubes in magnetohydrodynamics and plasma physics, to the recombination in polymer physics and DNA biology. By using fundamental results in topological fluid mechanics, the helicity of a flux tube can be calculated in terms of writhe and twist contributions. Here we show that the writhe is conserved under anti-parallel reconnection. Hence, for a pair of interacting flux tubes of equal flux, if the twist of the reconnected tube is the sum of the original twists of the interacting tubes, then helicity is conserved during reconnection. Thus, any deviation from helicity conservation is entirely due to the intrinsic twist inserted or deleted locally at the reconnection site. This result has important implications for helicity and energy considerations in various physical contexts. PMID:25820408

  2. Virtual reality visualization of parallel molecular dynamics simulation

    SciTech Connect

    Disz, T.; Papka, M.; Stevens, R.; Pellegrino, M.; Taylor, V.

    1995-12-31

    When performing communications mapping experiments for massively parallel processors, it is important to be able to visualize the mappings and resulting communications. In a molecular dynamics model, visualization of the atom-to-atom interactions and the processor mappings provides insight into the effectiveness of the communications algorithms. The basic quantities available for visualization in a model of this type are the number of molecules per unit volume and the mass and velocity of each molecule. The computational information available for visualization is the atom-to-atom interaction within each time step, the atom-to-processor mapping, and the energy rescaling events. We use the CAVE (CAVE Automatic Virtual Environment) to provide interactive, immersive visualization experiences.

  3. Parallelism extraction and program restructuring for parallel simulation of digital systems

    SciTech Connect

    Vellandi, B.L.

    1990-01-01

    Two topics currently of interest to the computer-aided design (CAD) community for very-large-scale integrated (VLSI) circuits are using the VHSIC Hardware Description Language (VHDL) effectively and decreasing simulation times of VLSI designs through parallel execution of the simulator. The goal of this research is to increase the degree of parallelism obtainable in VHDL simulation and consequently to decrease simulation times. The research targets simulation on massively parallel architectures. Experimentation and instrumentation were done on the SIMD Connection Machine. The author discusses her method used to extract parallelism and restructure a VHDL program, experimental results using this method, and requirements for a parallel architecture for fast simulation.

  4. On the Scalability of Parallel UCT

    NASA Astrophysics Data System (ADS)

    Segal, Richard B.

    The parallelization of MCTS across multiple machines has proven surprisingly difficult. The limitations of existing algorithms were evident in the 2009 Computer Olympiad, where Zen using a single four-core machine defeated both Fuego with ten eight-core machines and Mogo with twenty thirty-two-core machines. This paper investigates the limits of parallel MCTS in order to understand why distributed parallelism has proven so difficult and to pave the way towards future distributed algorithms with better scaling. We first analyze the single-threaded scaling of Fuego and find that there is an upper bound on the play-quality improvements which can come from additional search. We then analyze the scaling of an idealized N-core shared memory machine to determine the maximum amount of parallelism supported by MCTS. We show that parallel speedup depends critically on how much time is given to each player. We use this relationship to predict parallel scaling for time scales beyond what can be empirically evaluated due to the immense computation required. Our results show that MCTS can scale nearly perfectly to at least 64 threads when combined with virtual loss, but without virtual loss scaling is limited to just eight threads. We also find that for competition time controls, scaling to thousands of threads is impossible, not necessarily because MCTS itself fails to scale, but because high levels of parallelism start to bump up against the upper performance bound of Fuego itself.
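
    For orientation, a hedged C sketch of the selection rule whose parallel scaling is at issue: UCB1 with a virtual-loss adjustment (constants and field names are illustrative; Fuego's actual implementation differs in detail):

        #include <math.h>

        struct child {
            double wins;      /* accumulated reward */
            double visits;    /* real visits */
            int    vloss;     /* virtual losses currently applied */
        };

        /* Pick the child maximizing the UCB1 score.  Adding a virtual loss before
           a thread descends (vloss++) temporarily lowers that child's score,
           steering other threads toward different branches; the loss is removed
           again when the result is backed up. */
        int select_child(const struct child *c, int n,
                         double parent_visits, double cexp)
        {
            int best = 0;
            double best_score = -1e300;
            for (int i = 0; i < n; i++) {
                double v = c[i].visits + c[i].vloss;
                double score = (v == 0.0) ? 1e300
                    : c[i].wins / v + cexp * sqrt(log(parent_visits) / v);
                if (score > best_score) { best_score = score; best = i; }
            }
            return best;
        }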

  5. Applications of Parallel Processing in Configuration Analyses

    NASA Technical Reports Server (NTRS)

    Sundaram, Ppchuraman; Hager, James O.; Biedron, Robert T.

    1999-01-01

    The paper presents the recent progress made towards developing an efficient and user-friendly parallel environment for routine analysis of large CFD problems. The coarse-grain parallel version of the CFL3D Euler/Navier-Stokes analysis code, CFL3Dhp, has been ported onto most available parallel platforms. The CFL3Dhp solution accuracy on these parallel platforms has been verified against CFL3D sequential analyses. User-friendly pre- and post-processing tools that enable a seamless transfer from sequential to parallel processing have been written. A static load-balancing tool for CFL3Dhp analysis has also been implemented to achieve good parallel efficiency. For large problems, load-balancing efficiency as high as 95% can be achieved even when a large number of processors is used. Linear scalability of the CFL3Dhp code with increasing number of processors has also been shown using a large installed transonic nozzle boattail analysis. To highlight the fast turn-around time of parallel processing, the Navier-Stokes drag polar for the TCA full configuration in sideslip at supersonic cruise has been obtained in a day. CFL3Dhp is currently being used as a production analysis tool.

  6. Parallel computation of manipulator inverse dynamics

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Bejczy, Antal K.

    1991-01-01

    In this article, parallel computation of manipulator inverse dynamics is investigated. A hierarchical graph-based mapping approach is devised to analyze the inherent parallelism in the Newton-Euler formulation at several computational levels, and to derive the features of an abstract architecture for exploitation of parallelism. At each level, a parallel algorithm represents the application of a parallel model of computation that transforms the computation into a graph whose structure defines the features of an abstract architecture, i.e., number of processors, communication structure, etc. Data-flow analysis is employed to derive the time lower bound in the computation as well as the sequencing of the abstract architecture. The features of the target architecture are defined by optimization of the abstract architecture to exploit maximum parallelism while minimizing architectural complexity. An architecture is designed and implemented that is capable of efficient exploitation of parallelism at several computational levels. The computation time of the Newton-Euler formulation for a 6-degree-of-freedom (dof) general manipulator is measured as 187 microsec. The increase in computation time for each additional dof is 23 microsec, which leads to a computation time of less than 500 microsec, even for a 12-dof redundant arm.

  7. High-Throughput parallel blind Virtual Screening using BINDSURF

    PubMed Central

    2012-01-01

    Background Virtual Screening (VS) methods can considerably aid clinical research by predicting how ligands interact with drug targets. Most VS methods assume a unique binding site for the target, usually derived from the interpretation of the protein crystal structure. However, it has been demonstrated that in many cases diverse ligands interact with unrelated parts of the target, and many VS methods do not take this relevant fact into account. Results We present BINDSURF, a novel VS methodology that scans the whole protein surface in order to find new hotspots where ligands might potentially interact, and which is implemented on last-generation massively parallel GPU hardware, allowing fast processing of large ligand databases. Conclusions BINDSURF is an efficient and fast blind methodology for the ligand-dependent determination of protein binding sites that uses the massively parallel architecture of GPUs for fast pre-screening of large ligand databases. Its results can also guide the subsequent application of more detailed VS methods to specific binding sites of proteins, and its use can aid in drug discovery, design and repurposing, and therefore help considerably in clinical research. PMID:23095663

  8. Xyce parallel electronic simulator : users' guide.

    SciTech Connect

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

  9. Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

    2000-01-01

    The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress has been made in hardware and software technologies, the performance of parallel programs with compiler directives has improved considerably. The introduction of OpenMP directives, the industry standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.

  10. Parallel path aspects of transmission modeling

    SciTech Connect

    Kavicky, J.A.; Shahidehpour, S.M.

    1996-11-01

    This paper examines the present methods and modeling techniques available to address the effects of parallel flows resulting from various firm and short-term energy transactions. A survey of significant methodologies is conducted to determine the present status of parallel flow transaction modeling. The strengths and weaknesses of these approaches are identified to suggest areas of further modeling improvements. The motivating force behind this research is to improve transfer capability assessment accuracy by suggesting a real-time modeling environment that adequately represents the influences of parallel flows while recognizing operational constraints and objectives.

  11. Knowledge representation into Ada parallel processing

    NASA Technical Reports Server (NTRS)

    Masotto, Tom; Babikyan, Carol; Harper, Richard

    1990-01-01

    The Knowledge Representation into Ada Parallel Processing project is a joint NASA and Air Force funded project to demonstrate the execution of intelligent systems in Ada on the Charles Stark Draper Laboratory fault-tolerant parallel processor (FTPP). Two applications were demonstrated - a portion of the adaptive tactical navigator and a real time controller. Both systems are implemented as Activation Framework Objects on the Activation Framework intelligent scheduling mechanism developed by Worcester Polytechnic Institute. The implementations, results of performance analyses showing speedup due to parallelism and initial efficiency improvements are detailed and further areas for performance improvements are suggested.

  12. Parallel Climate Analysis Toolkit (ParCAT)

    Energy Science and Technology Software Center (ESTSC)

    2013-06-30

    The parallel analysis toolkit (ParCAT) provides parallel statistical processing of large climate model simulation datasets. ParCAT provides parallel point-wise average calculations, frequency distributions, sums/differences of two datasets, and difference-of-average and average-of-difference for two datasets for arbitrary subsets of simulation time. ParCAT is a command-line utility that can be easily integrated into scripts or embedded in other applications. ParCAT supports CMIP5 post-processed datasets as well as non-CMIP5 post-processed datasets. ParCAT reads and writes standard netCDF files.

  13. Distributed parallel messaging for multiprocessor systems

    DOEpatents

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burrow, Burhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling data of a packet received from the network to be written to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  14. Parallelization of the Implicit RPLUS Algorithm

    NASA Technical Reports Server (NTRS)

    Orkwis, Paul D.

    1997-01-01

    The multiblock reacting Navier-Stokes flow solver RPLUS2D was modified for parallel implementation. Results for non-reacting flow calculations of this code indicate parallelization efficiencies greater than 84% are possible for a typical test problem. Results tend to improve as the size of the problem increases. The convergence rate of the scheme is degraded slightly when additional artificial block boundaries are included for the purpose of parallelization. However, this degradation virtually disappears if the solution is converged near to machine zero. Recommendations are made for further code improvements to increase efficiency, correct bugs in the original version, and study decomposition effectiveness.

  15. Parallelization of the Implicit RPLUS Algorithm

    NASA Technical Reports Server (NTRS)

    Orkwis, Paul D.

    1994-01-01

    The multiblock reacting Navier-Stokes flow-solver RPLUS2D was modified for parallel implementation. Results for non-reacting flow calculations of this code indicate parallelization efficiencies greater than 84% are possible for a typical test problem. Results tend to improve as the size of the problem increases. The convergence rate of the scheme is degraded slightly when additional artificial block boundaries are included for the purpose of parallelization. However, this degradation virtually disappears if the solution is converged near to machine zero. Recommendations are made for further code improvements to increase efficiency, correct bugs in the original version, and study decomposition effectiveness.

  16. Language constructs for modular parallel programs

    SciTech Connect

    Foster, I.

    1996-03-01

    We describe programming language constructs that facilitate the application of modular design techniques in parallel programming. These constructs allow us to isolate resource management and processor scheduling decisions from the specification of individual modules, which can themselves encapsulate design decisions concerned with concurrency, communication, process mapping, and data distribution. This approach permits development of libraries of reusable parallel program components and the reuse of these components in different contexts. In particular, alternative mapping strategies can be explored without modifying other aspects of program logic. We describe how these constructs are incorporated in two practical parallel programming languages, PCN and Fortran M. Compilers have been developed for both languages, allowing experimentation in substantial applications.

  17. Semi-automatic process partitioning for parallel computation

    NASA Technical Reports Server (NTRS)

    Koelbel, Charles; Mehrotra, Piyush; Vanrosendale, John

    1988-01-01

    On current multiprocessor architectures one must carefully distribute data in memory in order to achieve high performance. Process partitioning is the operation of rewriting an algorithm as a collection of tasks, each operating primarily on its own portion of the data, to carry out the computation in parallel. A semi-automatic approach to process partitioning is considered in which the compiler, guided by advice from the user, automatically transforms programs into such an interacting task system. This approach is illustrated with a picture processing example written in BLAZE, which is transformed into a task system maximizing locality of memory reference.

  18. Runtime system library for parallel finite difference models with nesting

    SciTech Connect

    Michalakes, J.

    1997-03-01

    RSL is a parallel run-time system library for implementing regular-grid models with nesting on distributed memory parallel computers. RSL provides support for automatically decomposing multiple model domains and for redistributing work between processors at run time for dynamic load balancing. A unique feature of RSL is that processor subdomains need not be rectangular patches; rather, grid points are independently allocated to processors, allowing more precisely balanced allocation of work to processors. Communication mechanisms are tailored to the application: RSL provides an efficient high-level stencil exchange operation for updating subdomain ghost areas and interdomain communication to support two-way interaction between nest levels. RSL also provides run-time support for local iteration over subdomains, global-local index translation, and distributed I/O from ordinary Fortran record-blocked data sets. The interface to RSL supports Fortran77 and Fortran90. RSL has been used to parallelize the NCAR/Penn State Mesoscale Model (MM5).
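
    The stencil exchange RSL provides can be pictured, in a much simplified and hedged form, as the usual ghost-cell swap between neighbouring subdomains; the sketch below uses a 1D decomposition with one-cell halos, whereas RSL's point-wise, irregular decomposition is more general:

        #include <mpi.h>
        #include <stdio.h>

        #define NLOC 8   /* interior cells owned by each rank */

        int main(int argc, char **argv)
        {
            int rank, size;
            double u[NLOC + 2];   /* u[0] and u[NLOC+1] are ghost cells */

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            u[0] = u[NLOC + 1] = 0.0;
            for (int i = 1; i <= NLOC; i++)
                u[i] = rank * 100.0 + i;

            int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
            int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

            /* send first interior cell left, receive right neighbour's into ghost */
            MPI_Sendrecv(&u[1],        1, MPI_DOUBLE, left,  0,
                         &u[NLOC + 1], 1, MPI_DOUBLE, right, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            /* send last interior cell right, receive left neighbour's into ghost */
            MPI_Sendrecv(&u[NLOC], 1, MPI_DOUBLE, right, 1,
                         &u[0],    1, MPI_DOUBLE, left,  1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

            printf("rank %d ghosts: left=%g right=%g\n", rank, u[0], u[NLOC + 1]);
            MPI_Finalize();
            return 0;
        }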

  19. Parallel-vector computation for CSI-design code

    NASA Technical Reports Server (NTRS)

    Nguyen, Duc T.

    1990-01-01

    Computational aspects of the Control-Structure Interaction (CSI) DESIGN code are reviewed. Numerically intensive computation portions of the CSI-DESIGN code were identified. Improvements in computational speed for the CSI-DESIGN code can be achieved by exploiting the parallel and vector capabilities offered by modern computers, such as the Alliant, Convex, Cray-2, and Cray-YMP. Four options to generate the coefficient stiffness matrix and to solve the system of linear, simultaneous equations are currently available in the CSI-DESIGN code. A preprocessor to use the RCM (Reverse Cuthill-McKee) algorithm for bandwidth minimization was also developed for the CSI-DESIGN code. Preliminary results obtained by solving a small-scale, 97-node CSI finite element model (for eigensolution) have indicated that this new CSI-DESIGN code is 5 to 6 times faster (using 1 Alliant processor) than the old version of the CSI-DESIGN code. This speed-up was achieved due to the RCM algorithm and the use of a new skyline solver. Efforts are underway to further improve the vector speed of the CSI-DESIGN code, to evaluate its performance on a larger-scale CSI model (such as the phase zero CSI model), to make the code run efficiently in a multiprocessor, parallel computing environment, and to make the code portable among the different parallel computers available at NASA LaRC, such as the Alliant, Convex, and Cray computers.

  20. Parallel universe, dark matter and invisible Higgs decays

    NASA Astrophysics Data System (ADS)

    Chakdar, Shreyashi; Ghosh, Kirtiman; Nandi, S.

    2014-05-01

    The existence of dark matter, with an abundance about five times that of ordinary matter, is now well established experimentally. There are many candidates for this dark matter. However, dark matter could be just like ordinary matter in a parallel universe. If both universes are described by non-abelian gauge symmetries, then there will be no kinetic mixing between the ordinary photon and the dark photon, and the dark proton, dark electron and the corresponding dark nuclei belonging to the parallel universe will be stable. If the strong coupling constant in the parallel universe is five times that of αs, then the dark proton will be about five times heavier, explaining why the dark matter abundance is five times that of ordinary matter. However, the two sectors will still interact via the Higgs bosons of the two sectors. This leads to the existence of a second light Higgs boson, just like the Standard Model Higgs boson, and gives rise to invisible decay modes of the Higgs boson which can be tested at the LHC and at the proposed ILC.