Science.gov

Sample records for parallel blade-vortex interaction

  1. Vortex dynamics during blade-vortex interactions

    NASA Astrophysics Data System (ADS)

    Peng, Di; Gregory, James W.

    2015-05-01

    Vortex dynamics during parallel blade-vortex interactions (BVIs) were investigated in a subsonic wind tunnel using particle image velocimetry (PIV). Vortices were generated by applying a rapid pitch-up motion to an airfoil through a pneumatic system, and the subsequent interactions with a downstream, unloaded target airfoil were studied. The blade-vortex interactions may be classified into three categories in terms of vortex behavior: close interaction, very close interaction, and collision. For each type of interaction, the vortex trajectory and strength variation were obtained from phase-averaged PIV data. The PIV results revealed the mechanisms of vortex decay and the effects of several key parameters on vortex dynamics, including separation distance (h/c), Reynolds number, and vortex sense. Generally, BVI has two main stages: interaction between vortex and leading edge (vortex-LE interaction) and interaction between vortex and boundary layer (vortex-BL interaction). Vortex-LE interaction, with its small separation distance, is dominated by inviscid decay of vortex strength due to pressure gradients near the leading edge. Therefore, the decay rate is determined by separation distance and vortex strength, but it is relatively insensitive to Reynolds number. Vortex-LE interaction will become a viscous-type interaction if there is enough separation distance. Vortex-BL interaction is inherently dominated by viscous effects, so the decay rate is dependent on Reynolds number. Vortex sense also has great impact on vortex-BL interaction because it changes the velocity field and shear stress near the surface.

  2. Rotorcraft Blade-Vortex Interaction Controller

    NASA Technical Reports Server (NTRS)

    Schmitz, Fredric H. (Inventor)

    1995-01-01

    Blade-vortex interaction noises, sometimes referred to as 'blade slap', are avoided by increasing the absolute value of inflow to the rotor system of a rotorcraft. This is accomplished by creating a drag force which causes the angle of the tip-path plane of the rotor system to become more negative or more positive.

  3. Experimental Study of Vortex Dynamics during Blade-Vortex Interactions

    NASA Astrophysics Data System (ADS)

    Peng, Di; Gregory, James

    2013-11-01

    Vortices incident upon bodies, such as cylinders, airfoils, and rotor blades, can give rise to substantial unsteady loading, sound generation, and vibration in a variety of engineering applications. A comprehensive study on vortex dynamics during blade-vortex interaction (BVI) is performed in this work. Evidence has been found in previous studies that the vortex behavior during BVI varies with Reynolds number, but the effects are not clear. In the current study, the experiments are performed in a 3' × 5' low-speed wind tunnel where the Reynolds number can be varied from 6 × 10^4 to 8 × 10^5 by adjusting freestream speed and airfoil size. The vortex is generated by the pitching motion of a wing, which is driven by an air cylinder. Another wing is placed downstream to initiate parallel interactions with the generated vortices. Smoke visualization is used initially to characterize the vortex. Then the BVI problem is studied in detail using time-resolved PIV and unsteady pressure measurements on the downstream target airfoil. The vortex behaviors at selected Reynolds numbers are investigated. The influence of other factors on vortex behavior, such as vortex strength and core size, is also discussed.

  4. Rotating hot-wire investigation of the vortex responsible for blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Fontana, Richard Remo

    1988-01-01

    The distribution of the circumferential velocity of the vortex responsible for blade-vortex interaction noise was measured using a rotating hot-wire rake synchronously meshed with a model helicopter rotor at the blade passage frequency. Simultaneous far-field acoustic data and blade differential pressure measurements were obtained. Results show that the shape of the measured far-field acoustic blade-vortex interaction signature depends on the blade-vortex interaction geometry. The experimental results are compared with the Widnall-Wolf model for blade-vortex interaction noise.

  5. Rotor blade system with reduced blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Leishman, John G. (Inventor); Han, Yong Oun (Inventor)

    2005-01-01

    A rotor blade system with reduced blade-vortex interaction noise includes a plurality of tube members embedded in proximity to a tip of each rotor blade. The inlets of the tube members are arrayed at the leading edge of the blade slightly above the chord plane, while the outlets are arrayed at the blade tip face. Such a design rapidly diffuses the vorticity contained within the concentrated tip vortex because of enhanced flow mixing in the inner core, which prevents the development of a laminar core region.

  6. Flow visualizations of perpendicular blade vortex interactions

    NASA Technical Reports Server (NTRS)

    Rife, Michael C.; Devenport, William J.

    1992-01-01

    Helium bubble flow visualizations have been performed to study perpendicular interaction of a turbulent trailing vortex and a rectangular wing in the Virginia Tech Stability Tunnel. Many combinations of vortex strength, vortex-blade separation (Z_s) and blade angle of attack were studied. Photographs of representative cases are presented. A range of phenomena were observed. For Z_s greater than a few percent chord the vortex is deflected as it passes the blade under the influence of the local streamline curvature and its image in the blade. Initially the interaction appears to have no influence on the core. Downstream, however, the vortex core begins to diffuse and grow, presumably as a consequence of its interaction with the blade wake. The magnitude of these effects increases with reduction in Z_s. For Z_s near zero the form of the interaction changes and becomes dependent on the vortex strength. For lower strengths the vortex appears to split into two filaments on the leading edge of the blade, one passing on the pressure side and one passing on the suction side. At higher strengths the vortex bursts in the vicinity of the leading edge. In either case the core or its remnants then rapidly diffuse with distance downstream. Increase in Reynolds number did not qualitatively affect the flow apart from decreasing the amplitude of the small low-frequency wandering motions of the vortex. Changes in wing tip geometry and boundary layer trip had very little effect.

  7. A Novel Method for Reducing Rotor Blade-Vortex Interaction

    NASA Technical Reports Server (NTRS)

    Glinka, A. T.

    2000-01-01

    One of the major hindrances to expansion of the rotorcraft market is the high-amplitude noise they produce, especially during low-speed descent, where blade-vortex interactions frequently occur. In an attempt to reduce the noise levels caused by blade-vortex interactions, the flip-tip rotor blade concept was devised. The flip-tip rotor increases the miss distance between the shed vortices and the rotor blades, reducing BVI noise. The distance is increased by rotating an outboard portion of the rotor tip either up or down depending on the flight condition. The proposed plan for the grant consisted of a computational simulation of the rotor aerodynamics and its wake geometry to determine the effectiveness of the concept, coupled with a series of wind tunnel experiments exploring the value of the device and validating the computer model. The computational model did in fact show that the miss distance could be increased, giving a measure of the effectiveness of the flip-tip rotor. However, the wind tunnel experiments could not be conducted. Increased outside demand for the 7' x 10' wind tunnel at NASA Ames and low priority at Ames for this project forced numerous postponements of the tests, eventually pushing the tests beyond the life of the grant. A design for the rotor blades to be tested in the wind tunnel was completed and an analysis of the strength of the model blades based on predicted loads, including dynamic forces, was done.

  8. Recent studies of rotorcraft blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Preisser, J. S.; Brooks, T. F.; Martin, R. M.

    1994-01-01

    Recent results are presented from several research efforts aimed at the understanding of rotorcraft blade-vortex interaction (BVI) in terms of the noise generation, directivity, and control. The results are based on work performed by NASA Langley Research Center researchers, both alone and in collaboration with other research organizations. Based on analysis of a simplified physical model, the critical parameters controlling BVI noise generation have been identified. The detailed mapping of the acoustic radiation field of a model rotor in a wind tunnel has revealed the extreme sensitivity of directivity to rotor advance ratio and disk attitude. The control and reduction of BVI noise through the use of higher harmonic pitch control is discussed.

  9. Noise reduction for transonic blade-vortex interactions

    NASA Technical Reports Server (NTRS)

    Xue, Y.; Lyrintzis, A. S.

    1991-01-01

    Several ideas for noise reduction of transonic blade-vortex interactions (BVI) are being introduced and tested using numerical simulation. The model used is the two-dimensional high frequency transonic small disturbance equation with regions of distributed vorticity (VTRAN2 code). The far-field noise signals are obtained by using the Kirchhoff method which extends the numerical two-dimensional near-field aerodynamic results to the linear acoustic three-dimensional far-field. The BVI noise mechanisms are explained and the effects of vortex type and strength, and angle of attack are studied. Particularly, airfoil shape modifications which lead to noise reduction are investigated here. The results presented are expected to be helpful for better understanding of the nature of the BVI noise and better blade design.

  10. Noise reduction for transonic blade-vortex interactions

    NASA Astrophysics Data System (ADS)

    Xue, Y.; Lyrintzis, A. S.

    1991-05-01

    Several ideas for noise reduction of transonic blade-vortex interactions (BVI) are being introduced and tested using numerical simulation. The model used is the two-dimensional high frequency transonic small disturbance equation with regions of distributed vorticity (VTRAN2 code). The far-field noise signals are obtained by using the Kirchhoff method which extends the numerical two-dimensional near-field aerodynamic results to the linear acoustic three-dimensional far-field. The BVI noise mechanisms are explained and the effects of vortex type and strength, and angle of attack are studied. Particularly, airfoil shape modifications which lead to noise reduction are investigated here. The results presented are expected to be helpful for better understanding of the nature of the BVI noise and better blade design.

  11. A parametric study of transonic blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Lyrintzis, A. S.

    1991-01-01

    Several parameters of transonic blade-vortex interactions (BVI) are being studied and some ideas for noise reduction are introduced and tested using numerical simulation. The model used is the two-dimensional high frequency transonic small disturbance equation with regions of distributed vorticity (VTRAN2 code). The far-field noise signals are obtained by using the Kirchhoff method, which extends the numerical 2-D near-field aerodynamic results to the linear acoustic 3-D far-field. The BVI noise mechanisms are explained and the effects of vortex type and strength, and angle of attack are studied. Particularly, airfoil shape modifications which lead to noise reduction are investigated. The results presented are expected to be helpful for better understanding of the nature of the BVI noise and better blade design.

  12. Transonic blade-vortex interactions noise: A parametric study

    NASA Technical Reports Server (NTRS)

    Lyrintzis, A. S.; Xue, Y.

    1990-01-01

    Transonic Blade-Vortex Interactions (BVI) are simulated numerically and the noise mechanisms are investigated. The 2-D high frequency transonic small disturbance equation is solved numerically (VTRAN2 code). An Alternating Direction Implicit (ADI) scheme with monotone switches is used; viscous effects are included on the boundary and the vortex is simulated by the cloud-in-cell method. The Kirchhoff method is used for the extension of the numerical 2-D near field aerodynamic results to the linear acoustic 3-D far field. The viscous effect (shock/boundary layer interaction) on BVI is investigated. The different types of shock motion are identified and compared. Two important disturbances with different directivity exist in the pressure signal and are believed to be related to the fluctuating lift and drag forces. Noise directivity for different cases is shown. The maximum radiation occurs at an angle between 60 and 90 deg below the horizontal for an airfoil fixed coordinate system and depends on the details of the airfoil shape. Different airfoil shapes are studied and classified according to the BVI noise produced.

  13. HART-II: Prediction of Blade-Vortex Interaction Loading

    NASA Technical Reports Server (NTRS)

    Lim, Joon W.; Tung, Chee; Yu, Yung H.; Burley, Casey L.; Brooks, Thomas; Boyd, Doug; van der Wall, Berend; Schneider, Oliver; Richard, Hugues; Raffel, Markus

    2003-01-01

    During the HART-I data analysis, a need was identified for comprehensive wake data, including vortex creation and aging and vortex re-development after blade-vortex interaction. In October 2001, US Army AFDD, NASA Langley, German DLR, French ONERA and Dutch DNW performed the HART-II test as an international joint effort. The main objective was to focus on rotor wake measurement using a PIV technique along with the comprehensive data of blade deflections, airloads, and acoustics. Three prediction teams made preliminary correlation efforts with HART-II data: a joint US team of US Army AFDD and NASA Langley, German DLR, and French ONERA. The predicted results showed significant improvements over the HART-I predictions, computed several years earlier, which indicated that there has been better understanding of complicated wake modeling in comprehensive rotorcraft analysis. All three teams demonstrated generally satisfactory prediction capabilities, though prediction accuracy varied slightly across disciplines.

  14. Euler solutions for self-generated rotor blade-vortex interactions

    NASA Technical Reports Server (NTRS)

    Hassan, A. A.; Tung, C.; Sankar, L. N.

    1990-01-01

    A finite-difference procedure was developed, on the basis of the conservation form of the unsteady three-dimensional Euler equations, for the prediction of rotor blade-vortex interactions (BVIs). Numerical solution procedures were obtained for the analysis of the model parallel BVIs and the more realistic helicopter self-generated-rotor BVIs. It was found that, for self-generated subcritical interactions, the accuracy of the predicted leading edge pressures relied heavily on the user-specified vortex core radius and on the CAMRAD-code-predicted geometry of the interaction vortex elements and their relative orientation with respect to the blade. It was also found that the free-wake model used in CAMRAD to predict the tip vortex trajectory for use in the Euler solution yields lower streamwise and higher axial wake convective velocities than those inferred from the experimental data.

  15. The effect of tip vortex structure on helicopter noise due to blade/vortex interaction

    NASA Technical Reports Server (NTRS)

    Wolf, T. L.; Widnall, S. E.

    1978-01-01

    A potential cause of helicopter impulsive noise, commonly called blade slap, is the unsteady lift fluctuation on a rotor blade due to interaction with the vortex trailed from another blade. The relationship between vortex structure and the intensity of the acoustic signal is investigated. The analysis is based on a theoretical model for blade/vortex interaction. Unsteady lift on the blades due to blade/vortex interaction is calculated using linear unsteady aerodynamic theory, and expressions are derived for the directivity, frequency spectrum, and transient signal of the radiated noise. An inviscid rollup model is used to calculate the velocity profile in the trailing vortex from the spanwise distribution of blade tip loading. A few cases of tip loading are investigated, and numerical results are presented for the unsteady lift and acoustic signal due to blade/vortex interaction. The intensity of the acoustic signal is shown to be quite sensitive to changes in tip vortex structure.

  16. Reduction of Helicopter Blade-Vortex Interaction Noise by Active Rotor Control Technology

    NASA Technical Reports Server (NTRS)

    Yu, Yung H.; Gmelin, Bernd; Splettstoesser, Wolf; Brooks, Thomas F.; Philippe, Jean J.; Prieur, Jean

    1997-01-01

    Helicopter blade-vortex interaction noise is one of the most severe noise sources and is very important both in community annoyance and military detection. Research over the decades has substantially improved basic physical understanding of the mechanisms generating rotor blade-vortex interaction noise and also of controlling techniques, particularly using active rotor control technology. This paper reviews active rotor control techniques currently available for rotor blade-vortex interaction noise reduction, including higher harmonic pitch control, individual blade control, and on-blade control technologies. Basic physical mechanisms of each active control technique are reviewed in terms of noise reduction mechanism and controlling aerodynamic or structural parameters of a blade. Active rotor control techniques using smart structures/materials are discussed, including distributed smart actuators to induce local torsional or flapping deformations. Published by Elsevier Science Ltd.

  17. Rotorcraft blade/vortex interaction noise - Its generation, radiation, and control

    NASA Technical Reports Server (NTRS)

    Preisser, J. S.; Brooks, T. F.; Martin, R. M.

    1990-01-01

    Recent results are presented from several research efforts aimed at the understanding of rotorcraft blade-vortex interaction noise generation, directivity, and control. The results are based on work performed by researchers at the NASA Langley Research Center, both alone and in collaboration with other research organizations. Based on analysis of a simplified physical model, the critical parameters controlling the noise generation are identified. Detailed mapping of the acoustic radiation field reveals the extreme sensitivity of directivity to rotor advance ratio and disk attitude. A means of controlling blade-vortex interaction noise by higher harmonic pitch control is discussed.

  18. Helicopter blade-vortex interaction locations: Scale-model acoustics and free-wake analysis results

    NASA Technical Reports Server (NTRS)

    Hoad, Danny R.

    1987-01-01

    The results of a model rotor acoustic test in the Langley 4- by 7-Meter Tunnel are used to evaluate a free-wake analytical technique. An acoustic triangulation technique is used to locate the position in the rotor disk where the blade-vortex interaction noise originates. These locations, along with results of the rotor free-wake analysis, are used to define the geometry of the blade-vortex interaction noise phenomena as well as to determine if the free-wake analysis is a capable diagnostic tool. Data from tests of two teetering rotor systems are used in these analyses.

  19. A comparison of model helicopter rotor Primary and Secondary blade/vortex interaction blade slap

    NASA Technical Reports Server (NTRS)

    Hubbard, J. E., Jr.; Leighton, K. P.

    1983-01-01

    A study of the relative importance of blade/vortex interactions which occur on the retreating side of a model helicopter rotor disk is described. Some of the salient characteristics of this phenomenon are presented and discussed. It is shown that the resulting Secondary blade slap may be of equal or greater intensity than the advancing side (Primary) blade slap. Instrumented model helicopter rotor data is presented which reveals the nature of the retreating blade/vortex interaction. The importance of Secondary blade slap as it applies to predictive techniques or approaches is discussed. When Secondary blade slap occurs it acts to enlarge the window of operating conditions for which blade slap exists.

  20. Helicopter Blade-Vortex Interaction Noise with Comparisons to CFD Calculations

    NASA Technical Reports Server (NTRS)

    McCluer, Megan S.

    1996-01-01

    A comparison of experimental acoustics data and computational predictions was performed for a helicopter rotor blade interacting with a parallel vortex. The experiment was designed to examine the aerodynamics and acoustics of parallel Blade-Vortex Interaction (BVI) and was performed in the Ames Research Center (ARC) 80- by 120-Foot Subsonic Wind Tunnel. An independently generated vortex interacted with a small-scale, nonlifting helicopter rotor at the 180 deg azimuth angle to create the interaction in a controlled environment. Computational Fluid Dynamics (CFD) was used to calculate near-field pressure time histories. The CFD code, called Transonic Unsteady Rotor Navier-Stokes (TURNS), was used to make comparisons with the acoustic pressure measurement at two microphone locations and several test conditions. The test conditions examined included hover tip Mach numbers of 0.6 and 0.7, advance ratio of 0.2, positive and negative vortex rotation, and the vortex passing above and below the rotor blade by 0.25 rotor chords. The results show that the CFD qualitatively predicts the acoustic characteristics very well, but quantitatively overpredicts the peak-to-peak sound pressure level by 15 percent in most cases. There also exists a discrepancy in the phasing (about 4 deg) of the BVI event in some cases. Additional calculations were performed to examine the effects of vortex strength, thickness, time accuracy, and directionality. This study validates the TURNS code for prediction of near-field acoustic pressures of controlled parallel BVI.

  1. On the Use of Vortex-Fitting in the Numerical Simulation of Blade-Vortex Interaction

    NASA Technical Reports Server (NTRS)

    Srinivasan, G. R.; VanDalsem, William (Technical Monitor)

    1997-01-01

    The usefulness of vortex-fitting in computational fluid dynamics (CFD) methods to preserve the vortex strength and structure while convecting in a uniform free stream is demonstrated through the numerical simulations of two- and three-dimensional blade-vortex interactions. The fundamental premise of the formulation is that the velocity and pressure fields of the interacting vortex are unaltered either by the presence of an airfoil or a rotor blade or by the resulting nonlinear interactional flowfield. Although the governing Euler and Navier-Stokes equations are nonlinear and independent solutions cannot be superposed, the interactional flowfield can be accurately captured by adding and subtracting the flowfield of the convecting vortex at each instant. The aerodynamics and aeroacoustics of two- and three-dimensional blade-vortex interactions have been calculated in Refs. 1-6 using this concept. Some of the results from these publications and other similar published material will be summarized in this paper.

  2. Calculation of helicopter rotor blade/vortex interaction by Navier-Stokes procedures

    NASA Technical Reports Server (NTRS)

    Kim, Y.-N.; Shamroth, S. J.; Buggeln, R. C.

    1987-01-01

    Interactions of a modern rotor blade with concentrated tip vortices from the previous blades can have a significant influence on the airloads and the aeroacoustics of a helicopter. A better understanding of the blade/vortex interaction process and a method of analyzing its flow field would provide valuable help in the design of helicopters. The work discussed herein represents an initial effort in applying a 3-D, time-dependent Navier-Stokes simulation to the blade vortex interaction problem. The numerical approach is the Linearized Block Implicit (LBI) technique. In this initial effort, consideration is given to the interaction of a wing of idealized geometry and a vortex whose axis is aligned at an arbitrary angle to the wing. The calculations are made for laminar, subsonic flow, and show the time dependent pressure distribution and flow fields resulting from the interaction.

  3. Simulation of realistic rotor blade-vortex interactions using a finite-difference technique

    NASA Technical Reports Server (NTRS)

    Hassan, Ahmed A.; Charles, Bruce D.

    1989-01-01

    A numerical finite-difference code has been used to predict helicopter blade loads during realistic self-generated three-dimensional blade-vortex interactions. The velocity field is determined via a nonlinear superposition of the rotor flowfield. Data obtained from a lifting-line helicopter/rotor trim code are used to determine the instantaneous position of the interaction vortex elements with respect to the blade. Data obtained for three rotor advance ratios show a reasonable correlation with wind tunnel data.

  4. Flow structure generated by perpendicular blade vortex interaction and implications for helicopter noise predictions

    NASA Technical Reports Server (NTRS)

    Devenport, William J.; Glegg, Stewart A. L.

    1995-01-01

    This report summarizes accomplishments and progress for the period ending April 1995. Much of the work during this period has concentrated on preparation for an analysis of data produced by an extensive wind tunnel test. Time has also been spent further developing an empirical theory to account for the effects of blade-vortex interaction upon the circulation distribution of the vortex and on preliminary measurements aimed at controlling the vortex core size.

  5. Full-Potential Modeling of Blade-Vortex Interactions. Degree awarded by George Washington Univ., Feb. 1987

    NASA Technical Reports Server (NTRS)

    Jones, Henry E.

    1997-01-01

    A study of the full-potential modeling of a blade-vortex interaction was made. A primary goal of this study was to investigate the effectiveness of the various methods of modeling the vortex. The model problem restricts the interaction to that of an infinite wing with an infinite line vortex moving parallel to its leading edge. This problem provides a convenient testing ground for the various methods of modeling the vortex while retaining the essential physics of the full three-dimensional interaction. A full-potential algorithm specifically tailored to solve the blade-vortex interaction (BVI) was developed to solve this problem. The basic algorithm was modified to include the effect of a vortex passing near the airfoil. Four different methods of modeling the vortex were used: (1) the angle-of-attack method, (2) the lifting-surface method, (3) the branch-cut method, and (4) the split-potential method. A side-by-side comparison of the four models was conducted. These comparisons included comparing generated velocity fields, a subcritical interaction, and a critical interaction. The subcritical and critical interactions are compared with experimentally generated results. The split-potential model was used to make a survey of some of the more critical parameters which affect the BVI.

  6. Prediction of blade-vortex interaction noise using airloads generated by a finite-difference technique

    NASA Technical Reports Server (NTRS)

    Tadghighi, Hormoz; Hassan, Ahmed A.; Charles, Bruce

    1990-01-01

    The present numerical finite-difference scheme for helicopter blade-load prediction during realistic, self-generated three-dimensional blade-vortex interactions (BVI) derives the velocity field through a nonlinear superposition of the rotor flow field yielded by the full potential rotor flow solver RFS2 for BVI and the rotational vortex flow field computed with the Biot-Savart law. Despite the accurate prediction of the acoustic waveforms, peak amplitudes are found to have been persistently underpredicted. The inclusion of BVI noise source in the acoustic analysis significantly improved the perceived noise level-corrected tone prediction.
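
    As an illustration of the Biot-Savart step mentioned in this abstract, the sketch below evaluates the velocity induced by an infinite straight line vortex, which is the two-dimensional limit of the law. It is a minimal example under stated assumptions, not the RFS2 procedure: the function name, the sign convention (positive circulation is counterclockwise), and the absence of a core model are all choices made here for illustration.

```python
import math


def line_vortex_velocity(gamma, x_v, y_v, x, y):
    """Velocity (u, v) induced at (x, y) by an infinite straight line vortex.

    Assumptions (illustrative, not taken from the cited report):
    - the vortex axis is normal to the x-y plane and passes through (x_v, y_v);
    - gamma is the circulation, positive counterclockwise;
    - no viscous core model, so the field is singular at the vortex itself.
    """
    dx = x - x_v
    dy = y - y_v
    r2 = dx * dx + dy * dy
    if r2 == 0.0:
        raise ValueError("evaluation point coincides with the vortex axis")
    # Tangential speed is gamma / (2*pi*r); resolve it into Cartesian components.
    k = gamma / (2.0 * math.pi * r2)
    return (-k * dy, k * dx)


if __name__ == "__main__":
    # Unit-radius point to the right of a counterclockwise vortex: flow points up.
    print(line_vortex_velocity(2.0 * math.pi, 0.0, 0.0, 1.0, 0.0))
```

    A realistic rotor wake would integrate the same law over curved filament segments and would normally add a finite core radius to remove the singularity.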

  7. An Euler code calculation of blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Hardin, J. C.; Lamkin, S. L.

    1987-01-01

    An Euler code has been developed for calculation of noise radiation due to the interaction of a distributed vortex with a Joukowski airfoil. The time-dependent incompressible flow field is first determined and then integrated to yield the resulting sound production through use of the elegant low-frequency Green's function approach. This code has several interesting numerical features involved in the vortex motion and in continuous satisfaction of the Kutta condition. In addition, it removes the limitations on Reynolds number and is much more efficient than an earlier Navier-Stokes code. Results indicate that the noise production is due to the deceleration and subsequent acceleration of the vortex as it approaches and passes the airfoil. Predicted acoustic levels and frequencies agree with measured data although a precise comparison would require the strength, size, and position of the incoming vortex to be known.

  8. Flow field and acoustics of two-dimensional transonic blade-vortex interactions

    NASA Astrophysics Data System (ADS)

    George, A. R.; Chang, S.-B.

    1984-10-01

    Blade-vortex interaction noise from full-scale helicopters is shown to involve unsteady transonic flow phenomena which can be modeled as two-dimensional. An unsteady, small-disturbance-theory numerical analysis is used to model the interaction of an airfoil with a finite-core, locally-convected vortex using the vortex-in-cell method with multiple branch cuts accounting for the distributed vortices' potential jumps. Strong disturbances propagating from the blade-vortex interaction are associated with occurrence of Tijdeman's Type C flow on the airfoil's lower surface. In this type of flow, the shock which initially terminates a supersonic zone propagates through it and forward off the airfoil. The effects of airfoil shape, angle of attack, Mach number, vortex strength, and vortex miss distance on the flow and on waves radiated forward are investigated. It is found that stronger radiated waves are associated with narrow supersonic regions and near-sonic base flow. Also, stronger vortices generate stronger radiated waves, but miss distance is not as important a factor.

  9. A study of blade-vortex interaction sound generation and directionality

    NASA Technical Reports Server (NTRS)

    Ringler, Todd D.; George, Albert R.; Steele, James B.

    1991-01-01

    The directionality and strength of blade-vortex interaction (BVI) noise are explained through the radiation cone concept. BVI acoustic radiation is primarily the result of two sound mechanisms: the tip effect and the radiation cone effect. The radiation cone effect is a highly directional mechanism which results when a lift distribution moves supersonically with respect to the fluid. After a physical explanation of the BVI mechanisms, sample cases using translating and rotating blades interacting with a straight line vortex are shown. The radiation cone concept is then applied to specific rotorcraft cases where it helps to explain zones of intense sound pressure level found in experimental results for the XV-15 tiltrotor and for a BO-105 helicopter scale model.

  10. Helicopter Model Rotor-Blade Vortex Interaction Impulsive Noise: Scalability and Parametric Variations

    NASA Technical Reports Server (NTRS)

    Boxwell, D. A.; Schmitz, F. H.; Splettstoesser, W. R.; Schultz, K. J.

    1987-01-01

    Acoustic data taken in the anechoic Deutsch-Niederlaendischer Windkanal (DNW) have documented the blade-vortex interaction (BVI) impulsive noise radiated from a 1/7-scale model main rotor of the AH-1 series helicopter. Averaged model-scale data were compared with averaged full-scale, in-flight acoustic data under similar non-dimensional test conditions using an improved data analysis technique. At low advance ratios (mu = 0.164 - 0.194), the BVI impulsive noise data scale remarkably well in level, waveform, and directivity patterns. At moderate advance ratios (mu = 0.224 - 0.270), the scaling deteriorates, suggesting that the model-scale rotor is not adequately simulating the full-scale BVI noise. Presently, no proved explanation of this discrepancy exists. Measured BVI noise radiation is highly sensitive to all of the four governing nondimensional parameters--hover tip Mach number, advance ratio, local inflow ratio, and thrust coefficient.
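
    For orientation, the four governing parameters named in the abstract are usually written as below. These are the common textbook definitions, not necessarily the exact forms used in the paper (for example, some authors include a cosine of the disk angle in the advance ratio). The symbols are assumptions introduced here: rotor angular speed \Omega, rotor radius R, speed of sound a, flight or tunnel speed V, velocity normal to the rotor disk v_n, thrust T, and air density \rho.

```latex
M_H = \frac{\Omega R}{a}, \qquad
\mu = \frac{V}{\Omega R}, \qquad
\lambda = \frac{v_n}{\Omega R}, \qquad
C_T = \frac{T}{\rho \, \pi R^2 \, (\Omega R)^2}
```

    Matching all four between model scale and full scale is what "similar non-dimensional test conditions" refers to in this context.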

  11. Mach number scaling of helicopter rotor blade/vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Leighton, Kenneth P.; Harris, Wesley L.

    1985-01-01

    A parametric study of model helicopter rotor blade slap due to blade vortex interaction (BVI) was conducted in a 5 by 7.5-foot anechoic wind tunnel using model helicopter rotors with two, three, and four blades. The results were compared with a previously developed Mach number scaling theory. Three- and four-bladed rotor configurations were found to show very good agreement with the Mach number to the sixth power law for all conditions tested. A reduction of conditions for which BVI blade slap is detected was observed for three-bladed rotors when compared to the two-bladed baseline. The advance ratio boundaries of the four-bladed rotor exhibited an angular dependence not present for the two-bladed configuration. The upper limits for the advance ratio boundaries of the four-bladed rotors increased with increasing rotational speed.
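
    A sixth-power Mach number law can be turned into a level change in decibels with one line of arithmetic. The sketch below is illustrative only: the abstract does not say whether the law applies to an intensity-like quantity (mean-square pressure) or to a pressure amplitude, so the hypothetical `amplitude_like` flag covers both readings, and the function name and sample Mach numbers are assumptions.

```python
import math


def level_change_db(mach_ref, mach_new, exponent=6.0, amplitude_like=False):
    """Level change in dB when a quantity scales as Mach**exponent.

    amplitude_like=False treats the scaled quantity as intensity-like
    (10*log10 of the ratio); amplitude_like=True treats it as a pressure
    amplitude (20*log10 of the ratio). Which one applies to the cited
    scaling law is an assumption the reader must check against the paper.
    """
    factor = 20.0 if amplitude_like else 10.0
    return factor * exponent * math.log10(mach_new / mach_ref)


if __name__ == "__main__":
    # Doubling the tip Mach number from 0.25 to 0.50 (sample values only).
    print(round(level_change_db(0.25, 0.50), 2), "dB if intensity-like")
    print(round(level_change_db(0.25, 0.50, amplitude_like=True), 2), "dB if amplitude-like")
```

    Either way, the steep exponent means that modest changes in tip Mach number produce large changes in blade-slap level.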

  12. Reduction of Blade-Vortex Interaction (BVI) noise through X-force control

    NASA Technical Reports Server (NTRS)

    Schmitz, Fredric H.

    1995-01-01

    Momentum theory and the longitudinal force balance equations of a single rotor helicopter are used to develop simple expressions to describe tip-path-plane tilt and uniform inflow to the rotor. The uniform inflow is adjusted to represent the inflow at certain azimuthal locations where strong Blade-Vortex Interaction (BVI) is likely to occur. This theoretical model is then used to describe the flight conditions where BVI is likely to occur and to explore those flight variables that can be used to minimize BVI noise radiation. A new X-force control is introduced to help minimize BVI noise. Several methods of generating the X-force are presented that can be used to alter the inflow to the rotor and thus increase the likelihood of avoiding BVI during approaches to a landing.
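
    The link between tip-path-plane tilt and uniform inflow can be sketched with the standard textbook momentum-theory relation for forward flight, lambda = mu*tan(alpha) + C_T / (2*sqrt(mu^2 + lambda^2)). This is not the paper's own set of expressions. The function name, the sign convention (alpha positive when the tip-path plane tilts forward, inflow positive down through the disk), and the sample values are assumptions for illustration.

```python
import math


def inflow_ratio(mu, alpha_tpp_deg, c_t, tol=1e-10, max_iter=200):
    """Solve lambda = mu*tan(alpha) + C_T / (2*sqrt(mu**2 + lambda**2)).

    Uses simple fixed-point iteration, which converges quickly at the
    low-to-moderate advance ratios typical of approach flight.
    """
    tan_a = math.tan(math.radians(alpha_tpp_deg))
    lam = math.sqrt(c_t / 2.0)  # hover induced inflow as the initial guess
    for _ in range(max_iter):
        lam_new = mu * tan_a + c_t / (2.0 * math.sqrt(mu * mu + lam * lam))
        if abs(lam_new - lam) < tol:
            return lam_new
        lam = lam_new
    raise RuntimeError("inflow iteration did not converge")


if __name__ == "__main__":
    # Sample values only: advance ratio 0.15, thrust coefficient 0.005.
    for alpha in (-6.0, -3.0, 0.0, 3.0):
        print(alpha, inflow_ratio(0.15, alpha, 0.005))
```

    In this simplified picture, tilting the tip-path plane rearward drives the net inflow toward zero, which keeps the wake close to the rotor disk. A force that changes the required tilt moves the inflow away from zero again.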

  13. Prediction of blade-vortex interaction noise using measured blade pressures

    NASA Astrophysics Data System (ADS)

    Joshi, Mahendra C.; Liu, Sandy R.; Boxwell, Donald A.

    1987-10-01

    In the study reported here, blade-vortex interaction noise was predicted using a simplified model of blade pressures measured on a one-seventh scale model AH-1/OLS main rotor. The methods used for the acoustic prediction are based on the acoustic analogy and have been developed by Nakamura (1981) and by Brentner, Nystrom, and Farassat (referred to as the WOPWOP method). The waveforms predicted by the two methods are in good agreement with each other and with the measurements in terms of the number of pulses, the pulse widths, and the separation times between the pulses. The peak amplitude of the dominant pulse may, however, be underpredicted by up to 40 percent, depending on flight conditions. Ways of improving the accuracy of the prediction methods are suggested.

  14. Prediction of blade-vortex interaction noise using measured blade pressures

    NASA Technical Reports Server (NTRS)

    Joshi, Mahendra C.; Liu, Sandy R.; Boxwell, Donald A.

    1987-01-01

    In the study reported here, blade-vortex interaction noise was predicted using a simplified model of blade pressures measured on a one-seventh scale model AH-1/OLS main rotor. The methods used for the acoustic prediction are based on the acoustic analogy and have been developed by Nakamura (1981) and by Brentner, Nystrom, and Farassat (referred to as the WOPWOP method). The waveforms predicted by the two methods are in good agreement with each other and with the measurements in terms of the number of pulses, the pulse widths, and the separation times between the pulses. The peak amplitude of the dominant pulse may, however, be underpredicted by up to 40 percent, depending on flight conditions. Ways of improving the accuracy of the prediction methods are suggested.

  15. Effect of leading-edge porosity on blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Lee, Soogab

    1993-01-01

    The effect of the porous leading edge of an airfoil on the blade-vortex interaction noise, which dominates the far-field acoustic spectrum of the helicopter, is investigated. The thin-layer Navier-Stokes equations are solved with a high-order upwind-biased scheme and a multizonal grid system. The Baldwin-Lomax turbulence model is modified for considering transpiration on the surface. The amplitudes of the propagating acoustic wave in the near-field are calculated directly from the computation. The porosity effect on the surface is modeled. Results show leading-edge transpiration can suppress pressure fluctuations at the leading edge during BVI, and consequently reduce the amplitude of propagating noise by 30 percent at maximum in the near-field. The effect of porosity factor on the noise level is also investigated.

  16. Blade-Vortex Interaction (BVI) Noise and Airload Prediction Using Loose Aerodynamic/Structural Coupling

    NASA Technical Reports Server (NTRS)

    Sim, B. W.; Lim, J. W.

    2007-01-01

    Predictions of blade-vortex interaction (BVI) noise, using blade airloads obtained from a coupled aerodynamic and structural methodology, are presented. This methodology uses an iterative, loosely-coupled trim strategy to cycle information between the OVERFLOW-2 (CFD) and CAMRAD-II (CSD) codes. Results are compared to the HART-II baseline, minimum noise and minimum vibration conditions. It is shown that this CFD/CSD state-of-the-art approach is able to capture blade airload and noise radiation characteristics associated with BVI. With the exception of the HART-II minimum noise condition, predicted advancing and retreating side BVI for the baseline and minimum vibration conditions agrees favorably with measured data. Although the BVI airloads and noise amplitudes are generally under-predicted, this CFD/CSD methodology provides an overall noteworthy improvement over the lifting line aerodynamics and free-wake models typically used in CSD comprehensive analysis codes.

  17. Studies of blade-vortex interaction noise reduction by rotor blade modification

    NASA Technical Reports Server (NTRS)

    Brooks, Thomas F.

    1993-01-01

    Blade-vortex interaction (BVI) noise is one of the most objectionable types of helicopter noise. This impulsive blade-slap noise can be particularly intense during low-speed landing approach and maneuvers. Over the years, a number of flight and model rotor tests have examined blade tip modification and other blade design changes to reduce this noise. Many times these tests have produced conflicting results. In the present paper, a number of these studies are reviewed in light of the current understanding of the BVI noise problem. Results from one study in particular are used to help establish the noise reduction potential and to shed light on the role of blade design. Current blade studies and some new concepts under development are also described.

  18. Tip-path-plane angle effects on rotor blade-vortex interaction noise levels and directivity

    NASA Astrophysics Data System (ADS)

    Burley, Casey L.; Martin, Ruth M.

    Acoustic data of a scale model BO-105 main rotor acquired in a large aeroacoustic wind tunnel are presented to investigate the parametric effects of rotor operating conditions on blade-vortex interaction (BVI) impulsive noise. Contours of a BVI noise metric are employed to quantify the effects of rotor advance ratio and tip-path-plane angle on BVI noise directivity and amplitude. Acoustic time history data are presented to illustrate the variations in impulsive characteristics. The directionality, noise levels and impulsive content of both advancing and retreating side BVI are shown to vary significantly with tip-path-plane angle and advance ratio over the range of low and moderate flight speeds considered.

  19. Rotor blade-vortex interaction impulsive noise source identification and correlation with rotor wake predictions

    NASA Astrophysics Data System (ADS)

    Splettstoesser, W. R.; Schultz, K. J.; Martin, Ruth M.

    1987-10-01

    An acoustic source localization scheme applicable to noncompact moving sources is developed and applied to the blade-vortex interaction (BVI) noise data of a 40-percent scale BO-105 model rotor. A generalized rotor wake code is employed to predict possible BVI locations on the rotor disk and is found quite useful in interpreting the acoustic localization results. The highly varying directivity patterns of different BVI impulses generated at the same test condition are explained by both the localization results and predicted tip vortex trajectories. The effects of rotor tip-path-plane angle and advance ratio on the BVI source positions are studied. Decreasing tip-path-plane angle (at constant advance ratio) moves the general interaction region upwind on the rotor disk, significantly changing the interaction geometry. Increasing advance ratio (at constant tip-path-plane angle) shifts the general source region downwind on the rotor disk with the increased convection of the vortices until about 60 deg azimuth, where the BVI sources appear to become acoustically less effective. The region of strongest BVI sources lies between 60 and 70 deg azimuth and 80 and 90 percent radius for the moderate range of advance ratios studied.

  20. Acoustic measurements from a rotor blade-vortex interaction noise experiment in the German-Dutch Wind Tunnel (DNW)

    NASA Technical Reports Server (NTRS)

    Martin, Ruth M.; Splettstoesser, W. R.; Elliott, J. W.; Schultz, K.-J.

    1988-01-01

    Acoustic data are presented from a 40 percent scale model of the 4-bladed BO-105 helicopter main rotor, measured in the large European aeroacoustic wind tunnel, the DNW. Rotor blade-vortex interaction (BVI) noise data in the low speed flight range were acquired using a traversing in-flow microphone array. The experimental apparatus, testing procedures, calibration results, and experimental objectives are fully described. A large representative set of averaged acoustic signals is presented.

  1. Acoustic measurements from a rotor blade-vortex interaction noise experiment in the German-Dutch Wind Tunnel (DNW)

    NASA Astrophysics Data System (ADS)

    Martin, Ruth M.; Splettstoesser, W. R.; Elliott, J. W.; Schultz, K.-J.

    1988-03-01

    Acoustic data are presented from a 40 percent scale model of the 4-bladed BO-105 helicopter main rotor, measured in the large European aeroacoustic wind tunnel, the DNW. Rotor blade-vortex interaction (BVI) noise data in the low speed flight range were acquired using a traversing in-flow microphone array. The experimental apparatus, testing procedures, calibration results, and experimental objectives are fully described. A large representative set of averaged acoustic signals is presented.

  2. A study of the noise mechanisms of transonic blade-vortex interactions

    NASA Technical Reports Server (NTRS)

    Lyrintzis, Anastasios S.; Xue, Y.

    1990-01-01

    Transonic blade-vortex interactions (BVI) are simulated numerically and the noise mechanisms are investigated. The two-dimensional high frequency transonic small disturbance equation is solved numerically (VTRAN2 code). An ADI scheme with monotone switches is used; viscous effects are included on the boundary, and the vortex is simulated by the cloud-in-cell method. The Kirchhoff method is used for the extension of the numerical two-dimensional near-field aerodynamic results to the linear acoustic three-dimensional far field. The viscous effects (shock/boundary layer interactions) on BVI are investigated. The different types of shock motion are identified and compared. Two important disturbances with different directivity exist in the pressure signal and are believed to be related to the fluctuating lift and drag forces. Noise directivity for different cases is shown. The maximum radiation occurs at an angle between 60 and 90 degrees below the horizontal for an airfoil-fixed coordinate system and depends on the details of the airfoil shape. Different airfoil shapes are studied and classified according to the BVI noise produced.

  3. Helicopter model rotor-blade vortex interaction impulsive noise: Scalability and parametric variations

    NASA Technical Reports Server (NTRS)

    Splettstoesser, W. R.; Schultz, K. J.; Boxwell, D. A.; Schmitz, F. H.

    1984-01-01

    Acoustic data taken in the anechoic Deutsch-Niederlaendischer Windkanal (DNW) have documented the blade vortex interaction (BVI) impulsive noise radiated from a 1/7-scale model main rotor of the AH-1 series helicopter. Averaged model scale data were compared with averaged full scale, inflight acoustic data under similar nondimensional test conditions. At low advance ratios (mu = 0.164 to 0.194), the data scale remarkably well in level and waveform shape, and also duplicate the directivity pattern of BVI impulsive noise. At moderate advance ratios (mu = 0.224 to 0.270), the scaling deteriorates, suggesting that the model scale rotor is not adequately simulating the full scale BVI noise; presently, no proved explanation of this discrepancy exists. Carefully performed parametric variations over a complete matrix of testing conditions have shown that BVI noise radiation is highly sensitive to all four of the governing nondimensional parameters - tip Mach number at hover, advance ratio, local inflow ratio, and thrust coefficient.

  4. Advancing-side directivity and retreating-side interactions of model rotor blade-vortex interaction noise

    NASA Astrophysics Data System (ADS)

    Martin, R. M.; Splettstoesser, W. R.; Elliott, J. W.; Schultz, K.-J.

    1988-05-01

    Acoustic data are presented from a 40 percent scale model of the four-bladed BO-105 helicopter main rotor, tested in a large aerodynamic wind tunnel. Rotor blade-vortex interaction (BVI) noise data in the low-speed flight range were acquired using a traversing in-flow microphone array. Acoustic results presented are used to assess the acoustic far field of BVI noise, to map the directivity and temporal characteristics of BVI impulsive noise, and to show the existence of retreating-side BVI signals. The characteristics of the acoustic radiation patterns, which can often be strongly focused, are found to be very dependent on rotor operating condition. The acoustic signals exhibit multiple blade-vortex interactions per blade with broad impulsive content at lower speeds, while at higher speeds, they exhibit fewer interactions per blade, with much sharper, higher amplitude acoustic signals. Moderate-amplitude BVI acoustic signals measured under the aft retreating quadrant of the rotor are shown to originate from the retreating side of the rotor.

  5. Signal Analysis of Helicopter Blade-Vortex-Interaction Acoustic Noise Data

    NASA Technical Reports Server (NTRS)

    Rogers, James C.; Dai, Renshou

    1998-01-01

    Blade-Vortex-Interaction (BVI) produces annoying high-intensity impulsive noise. NASA Ames collected several sets of BVI noise data during in-flight and wind tunnel tests. The goal of this work is to extract the essential features of the BVI signals from the in-flight data and examine the feasibility of extracting those features from BVI noise recorded inside a large wind tunnel. BVI noise generating mechanisms and BVI radiation patterns are considered and a simple mathematical-physical model is presented. It allows the construction of simple synthetic BVI events that are comparable to free flight data. The boundary effects of the wind tunnel floor and ceiling are identified and more complex synthetic BVI events are constructed to account for features observed in the wind tunnel data. It is demonstrated that improved recording of BVI events can be attained by changing the geometry of the rotor hub, floor, ceiling and microphone. The Euclidean distance measure is used to align BVI events from each blade and improved BVI signals are obtained by time-domain averaging the aligned data. The differences between BVI events for individual blades are then apparent. Removal of wind tunnel background noise by optimal Wiener-filtering is shown to be effective provided representative noise-only data have been recorded. Elimination of wind tunnel reflections by cepstral and optimal filtering deconvolution is examined. It is seen that the cepstral method is not applicable but that a pragmatic optimal filtering approach gives encouraging results. Recommendations for further work include: altering measurement geometry, real-time data observation and evaluation, examining reflection signals (particularly those from the ceiling) and performing further analysis of expected BVI signals for flight conditions of interest so that microphone placement can be optimized for each condition.
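
    The align-then-average step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes equal-length event records, uses the first event as the alignment reference, and treats time shifts as circular. The function names `best_shift` and `align_and_average` are hypothetical.

```python
def best_shift(reference, event, max_shift):
    """Return the circular shift of `event` that minimises the Euclidean
    distance to `reference` (squared distance is compared, which has the
    same minimiser)."""
    n = len(reference)
    best, best_d2 = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        d2 = sum((reference[i] - event[(i + s) % n]) ** 2 for i in range(n))
        if d2 < best_d2:
            best, best_d2 = s, d2
    return best


def align_and_average(events, max_shift):
    """Align every event to the first one, then average sample by sample."""
    reference = events[0]
    n = len(reference)
    aligned = []
    for ev in events:
        s = best_shift(reference, ev, max_shift)
        aligned.append([ev[(i + s) % n] for i in range(n)])
    return [sum(column) / len(aligned) for column in zip(*aligned)]


if __name__ == "__main__":
    pulse = [0, 0, 1, 3, 1, 0, 0, 0]
    delayed = [0, 0, 0, 0, 1, 3, 1, 0]  # the same pulse, two samples later
    print(best_shift(pulse, delayed, 3))
    print(align_and_average([pulse, delayed], 3))
```

    Averaging without alignment would smear the impulse; aligning first preserves its peak while uncorrelated background noise averages down.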

  6. The effects of vortex modeling on blade-vortex interaction noise prediction

    NASA Technical Reports Server (NTRS)

    Gallman, Judith M.; Tung, Chee; Low, Scott L.

    1995-01-01

    The use of a blade vortex interaction noise prediction scheme, based on CAMRAD/JA, FPR and RAPP, quantifies the effects of errors and assumptions in the modeling of the helicopter's shed vortex on the acoustic predictions. CAMRAD/JA computes the wake geometry and inflow angles that are used in FPR to solve for the aerodynamic surface pressures. RAPP uses these surface pressures to predict the acoustic pressure. Both CAMRAD/JA and FPR utilize the Biot-Savart Law to determine the influence of the vortical velocities on the blade loading and both codes use an algebraic vortex model for the solid body rotation of the vortex core. Large changes in the specification of the vortex core size do not change the inplane wake geometry calculated by CAMRAD/JA and only slightly affect the out-of-plane wake geometry. However, the aerodynamic surface pressure calculated by FPR changes in both magnitude and character with small changes to the core size used by the FPR calculations. This in turn affects the acoustic predictions. Shifting the CAMRAD/JA wake geometry away from the rotor plane by 1/4 chord produces drastic changes in the acoustic predictions indicating that the prediction of acoustic pressure is extremely sensitive to the miss distance between the vortex and the blade and that this distance must be calculated as accurately as possible for acceptable noise predictions. The inclusion or exclusion of a vortex in the FPR-RAPP calculation allows for the determination of the relative importance of that vortex as a BVI noise source.
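
    To see why core size and miss distance matter so much, consider a simple algebraic swirl-velocity profile. The sketch below uses the Scully-type form v = Gamma*r / (2*pi*(r_c^2 + r^2)) purely as an illustrative choice; the abstract does not state which algebraic model CAMRAD/JA and FPR use. The function name and sample values are assumptions.

```python
import math


def swirl_velocity(gamma, r, core_radius):
    """Tangential velocity of a 2-D vortex with an algebraic (Scully-type) core.

    Behaves like solid-body rotation for r << core_radius and like a
    potential vortex, gamma / (2*pi*r), for r >> core_radius.
    """
    return gamma * r / (2.0 * math.pi * (core_radius ** 2 + r ** 2))


if __name__ == "__main__":
    gamma = 1.0  # circulation, arbitrary units (sample value)
    for core_radius in (0.05, 0.10, 0.20):
        for miss_distance in (0.05, 0.25):
            v = swirl_velocity(gamma, miss_distance, core_radius)
            print(core_radius, miss_distance, round(v, 4))
```

    With this profile the induced velocity changes strongly with core radius when the miss distance is comparable to the core size, and much less so when the vortex passes far from the blade.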

  7. New techniques for experimental generation of two-dimensional blade-vortex interaction at low Reynolds numbers

    NASA Technical Reports Server (NTRS)

    Booth, E., Jr.; Yu, J. C.

    1986-01-01

    An experimental investigation of two-dimensional blade vortex interaction was conducted at NASA Langley Research Center. The first phase was a flow visualization study to document the approach process of a two-dimensional vortex as it encountered a loaded blade model. To accomplish the flow visualization study, a method for generating two-dimensional vortex filaments was required. The numerical study used to define a new vortex generation process and the use of this process in the flow visualization study were documented. Additionally, photographic techniques and data analysis methods used in the flow visualization study are examined.

  8. The location of acoustic blade-vortex interaction - A further step toward an understanding of helicopter noise

    NASA Astrophysics Data System (ADS)

    Heller, Hanno; Splettstoesser, Wolf; Schultz, Klaus J.

    1991-02-01

    An ongoing DLR program to determine the sites of helicopter-rotor blade-vortex interactions (BVIs) by means of wind-tunnel experiments is described, and typical results are presented in graphs. A 40-percent-scale model of the BO-105 main rotor is mounted in the main test section of the German-Dutch Wind Tunnel so as to permit undisturbed measurements of the downward-directed acoustic field with a microphone array, and a novel iterative procedure is used to estimate the BVI source regions. This procedure has been validated by comparing the predicted source regions with (1) direct measurements using a rotor model equipped with pressure sensors and (2) the predictions of a three-dimensional aerodynamic blade-tip-vortex wake model.

  9. Effects of a trailing edge flap on the aerodynamics and acoustics of rotor blade-vortex interactions

    NASA Technical Reports Server (NTRS)

    Charles, B. D.; Tadghighi, H.; Hassan, A. A.

    1992-01-01

    The use of a trailing edge flap on a helicopter rotor has been numerically simulated to determine if such a device can mitigate the acoustics of blade vortex interactions (BVI). The numerical procedure employs CAMRAD/JA, a lifting-line helicopter rotor trim code, in conjunction with RFS2, an unsteady transonic full-potential flow solver, and WOPWOP, an acoustic model based on Farassat's formulation 1A. The codes were modified to simulate trailing edge flap effects. The CAMRAD/JA code was used to compute the far wake inflow effects and the vortex wake trajectories and strengths which are utilized by RFS2 to predict the blade surface pressure variations. These pressures were then analyzed using WOPWOP to determine the high frequency acoustic response at several fixed observer locations below the rotor disk. Comparisons were made with different flap deflection amplitudes and rates to assess flap effects on BVI. Numerical experiments were carried out using a one-seventh scale AH-1G rotor system for flight conditions simulating BVI encountered during low speed descending flight with and without flaps. Predicted blade surface pressures and acoustic sound pressure levels obtained have shown good agreement with the baseline no-flap test data obtained in the DNW wind tunnel. Numerical results indicate that the use of flaps is beneficial in reducing BVI noise.

  10. Flow structure generated by perpendicular blade-vortex interaction and implications for helicopter noise prediction. Volume 1: Measurements

    NASA Technical Reports Server (NTRS)

    Wittmer, Kenneth S.; Devenport, William J.

    1996-01-01

    The perpendicular interaction of a streamwise vortex with an infinite span helicopter blade was modeled experimentally in incompressible flow. Three-component velocity and turbulence measurements were made using a sub-miniature four sensor hot-wire probe. Vortex core parameters (radius, peak tangential velocity, circulation, and centerline axial velocity deficit) were determined as functions of blade-vortex separation, streamwise position, blade angle of attack, vortex strength, and vortex size. The downstream development of the flow shows that the interaction of the vortex with the blade wake is the primary cause of the changes in the core parameters. The blade sheds negative vorticity into its wake as a result of the induced angle of attack generated by the passing vortex. Instability in the vortex core due to its interaction with this negative vorticity region appears to be the catalyst for the magnification of the size and intensity of the turbulent flowfield downstream of the interaction. In general, the core radius increases while peak tangential velocity decreases with the effect being greater for smaller separations. These effects are largely independent of blade angle of attack; and if these parameters are normalized on their undisturbed values, then the effects of the vortex strength appear much weaker. Two theoretical models were developed to aid in extending the results to other flow conditions. An empirical model was developed for core parameter prediction which has some rudimentary physical basis, implying usefulness beyond a simple curve fit. An inviscid flow model was also created to estimate the vorticity shed by the interaction blade, and to predict the early stages of its incorporation into the interacting vortex.

  11. Evaluation of helicopter noise due to blade-vortex interaction for five tip configurations. [conducted in the Langley V/STOL tunnel]

    NASA Technical Reports Server (NTRS)

    Hoad, D. R.

    1979-01-01

    The effect of tip shape modification on blade vortex interaction induced helicopter blade slap noise was investigated. Simulated flight and descent velocities which have been shown to produce blade slap were tested. Aerodynamic performance parameters of the rotor system were monitored to ensure properly matched flight conditions among the tip shapes. The tunnel was operated in the open throat configuration with treatment to improve the acoustic characteristics of the test chamber. Four promising tips were used along with a standard square tip as a baseline configuration. A detailed acoustic evaluation on the same rotor system of the relative applicability of the various tip configurations for blade slap noise reduction is provided.

  12. Perpendicular blade vortex interaction and its implications for helicopter noise prediction: Wave-number frequency spectra in a trailing vortex for BWI noise prediction

    NASA Technical Reports Server (NTRS)

    Devenport, William J.; Glegg, Stewart A. L.

    1993-01-01

    Perpendicular blade vortex interactions are a common occurrence in helicopter rotor flows. Under certain conditions they produce a substantial proportion of the acoustic noise. However, the mechanism of noise generation is not well understood. Specifically, turbulence associated with the trailing vortices shed from the blade tips appears insufficient to account for the noise generated. The hypothesis that the first perpendicular interaction experienced by a trailing vortex alters its turbulence structure in such a way as to increase the acoustic noise generated by subsequent interactions is examined. To investigate this hypothesis a two-part investigation was carried out. In the first part, experiments were performed to examine the behavior of a streamwise vortex as it passed over and downstream of a spanwise blade in incompressible flow. Blade vortex separations between +/- one eighth chord were studied at a chord Reynolds number of 200,000. Three-component velocity and turbulence measurements were made in the flow from 4 chord lengths upstream to 15 chord lengths downstream of the blade using miniature 4-sensor hot wire probes. These measurements show that the interaction of the vortex with the blade and its wake causes the vortex core to lose circulation and diffuse much more rapidly than it otherwise would. Core radius increases and peak tangential velocity decreases with distance downstream of the blade. True turbulence levels within the core are much larger downstream than upstream of the blade. The net result is a much larger and more intense region of turbulent flow than that presented by the original vortex and thus, by implication, a greater potential for generating acoustic noise. In the second part, the turbulence measurements described above were used to derive the necessary inputs to a Blade Wake Interaction (BWI) noise prediction scheme. This resulted in significantly improved agreement between measurements and calculations of the BWI noise spectrum, especially for the spectral peak at low frequencies, which previously was poorly predicted.

  13. A parametric study of blade vortex interaction noise for two, three, and four-bladed model rotors at moderate tip speeds: Theory and experiment

    NASA Technical Reports Server (NTRS)

    Leighton, K. P.; Harris, W. L.

    1984-01-01

    An investigation of blade slap due to blade vortex interaction (BVI) has been conducted. This investigation consisted of an examination of BVI blade slap for two, three, and four-bladed model rotors at tip Mach numbers ranging from 0.20 to 0.50. Blade slap contours have been obtained for each configuration tested. Differences in blade slap contours, peak sound pressure level, and directivity for each configuration tested are noted. Additional fundamental differences, such as multiple interaction BVI, are observed and occur for only specific rotor blade configurations. The effect of increasing the Mach number on the BVI blade slap for various rotor blade combinations has been quantified. A peak blade slap Mach number scaling law is proposed. Comparison of measured BVI blade slap with theory is made.

  14. Numerical simulation and validation of helicopter blade-vortex interaction using coupled CFD/CSD and three levels of aerodynamic modeling

    NASA Astrophysics Data System (ADS)

    Amiraux, Mathieu

    Rotorcraft Blade-Vortex Interaction (BVI) remains one of the most challenging flow phenomena to simulate numerically. Over the past decade, the HART-II rotor test and its extensive experimental dataset have been a major database for validation of CFD codes. Its strong BVI signature, with high levels of intrusive noise and vibrations, makes it a difficult test for computational methods. The main challenge is to accurately capture and preserve the vortices which interact with the rotor, while predicting correct blade deformations and loading. This doctoral dissertation presents the application of a coupled CFD/CSD methodology to the problem of helicopter BVI and compares three levels of fidelity for aerodynamic modeling: a hybrid lifting-line/free-wake (wake coupling) method, with modified compressible unsteady model; a hybrid URANS/free-wake method; and a URANS-based wake capturing method, using multiple overset meshes to capture the entire flow field. To further increase numerical correlation, three helicopter fuselage models are implemented in the framework. The first is a high resolution 3D GPU panel code; the second is an immersed boundary based method, with 3D elliptic grid adaption; the last one uses a body-fitted, curvilinear fuselage mesh. The main contribution of this work is the implementation and systematic comparison of multiple numerical methods to perform BVI modeling. The trade-offs between solution accuracy and computational cost are highlighted for the different approaches. Various improvements have been made to each code to enhance physical fidelity, while advanced technologies, such as GPU computing, have been employed to increase efficiency. The resulting numerical setup covers all aspects of the simulation, creating a truly multi-fidelity and multi-physics framework. Overall, the wake capturing approach showed the best BVI phasing correlation and good blade deflection predictions, with slightly under-predicted aerodynamic loading magnitudes. However, it proved to be much more expensive than the other two methods. Wake coupling with RANS solver had very good loading magnitude predictions, and therefore good acoustic intensities, with acceptable computational cost. The lifting-line based technique often had over-predicted aerodynamic levels, due to the degree of empiricism of the model, but its very short run-times, thanks to GPU technology, make it a very attractive approach.

  15. Analysis of helicopter blade vortex structure by laser velocimetry

    NASA Astrophysics Data System (ADS)

    Boutier, A.; Lefèvre, J.; Micheli, F.

    1996-05-01

    In descent flight, helicopter external noise is mainly generated by the Blade Vortex Interaction (BVI). To understand the dynamics of this phenomenon, the vortex must be characterized before its interaction with the blade, which means that its viscous core radius, its strength and its distance to the blade have to be determined by non-intrusive measurement techniques. As part of the HART program (Higher Harmonic Control Aeroacoustic Rotor Test, jointly conducted by US Army, NASA, DLR, DNW and ONERA), a series of tests have been made in the German-Dutch Wind Tunnel (DNW) on a helicopter rotor with 2 m long blades, rotating at 1040 rpm; several flight configurations, with an advance ratio of 0.15 and a shaft angle of 5.3°, have been studied with different higher harmonic blade pitch angles superposed on the conventional one (corresponding to the baseline case). The flow on the retreating side has been analyzed with a specially designed 3D laser velocimeter, and, simultaneously, the blade tip attitude has been determined in order to get the blade-vortex miss distance, which is a crucial parameter in the noise reduction. A 3D laser velocimeter, in backscatter mode with a working distance of 5 m, was installed on a platform 9 m high, and flow seeding with submicron incense smoke was achieved in the settling chamber using a remotely controlled displacement device. Acquisition of instantaneous velocity vectors by an IFA 750 yielded mean velocity and turbulence maps across the vortex as well as the vortex position, intensity and viscous radius. The blade tip attitude (altitude, jitter, angle of incidence) was recorded by the TART method (Target Attitude in Real Time), which makes use of a CCD camera on which is formed the image of two retroreflecting targets attached to the blade tip and lighted by a flash lamp. In addition to the mean values of the aforementioned quantities, spectra of their fluctuations have been established up to 8 Hz.

  16. Noise Generation of Blade-Vortex Resonance

    NASA Astrophysics Data System (ADS)

    LEUNG, R. C. K.; SO, R. M. C.

    2001-08-01

    A numerical study of the aerodynamic noise generated when an airfoil/blade in a uniform flow is excited by an oncoming vortical flow is reported. The vortical flow is modelled by a series of flow-convected discrete vortices representative of a Karman vortex street. Such noise generation problems due to fluid-blade interaction occur in helicopter rotor and turbomachinery blades. Interactions with both rigid and elastic airfoils/blades are considered. Under a vortical excitation, aerodynamic resonance of the airfoil/blade at certain excitation frequencies is found to occur, and loading noise is generated due to the fluctuations of the aerodynamic loading on the airfoil/blade. For an elastic blade, due to the occurrence of structural resonance incited by the flow-induced vibration of the airfoil/blade, a stronger loading noise is generated. The associated thickness effect due to the airfoil/blade vibration is extremely weak. The magnitude of the noise was found to depend on the frequency of the oncoming vortical flow and the geometry and rigidity of the blade.

  17. Rotor system having alternating length rotor blades for reducing blade-vortex interaction (BVI) noise

    NASA Technical Reports Server (NTRS)

    Moffitt, Robert C. (Inventor); Visintainer, Joseph A. (Inventor)

    1997-01-01

    A rotor system (4) having odd and even blade assemblies (O.sub.b, E.sub.b) mounting to and rotating with a rotor hub assembly (6) wherein the odd blade assemblies (O.sub.b) define a radial length R.sub.O, and the even blade assemblies (E.sub.b) define a radial length R.sub.E and wherein the radial length R.sub.E is between about 70% to about 95% of the radial length R.sub.O. Other embodiments of the invention are directed to a Variable Diameter Rotor system (4) which may be configured for operating in various operating modes for optimizing aerodynamic and acoustic performance. The Variable Diameter Rotor system (4) includes odd and even blade assemblies (O.sub.b, E.sub.b) having inboard and outboard blade sections (10, 12) wherein the outboard blade sections (12) telescopically mount to the inboard blade sections (10). The outboard blade sections (12) are positioned with respect to the inboard blade sections (10) such that the radial length R.sub.E of the even blade assemblies (E.sub.b) is equal to the radial length R.sub.O of the odd blade assemblies (O.sub.b) in a first operating mode, and such that the radial length R.sub.E is between about 70% to about 95% of the length R.sub.O in a second operating mode.

  18. Parallel Vegetation Stripe Formation Through Hydrologic Interactions

    NASA Astrophysics Data System (ADS)

    Cheng, Yiwei; Stieglitz, Marc; Turk, Greg; Engel, Victor

    2010-05-01

    It has long been a challenge to theoretical ecologists to describe vegetation pattern formations such as the "tiger bush" stripes and "leopard bush" spots in Niger, and the regular maze patterns often observed in bogs in North America and Eurasia. To date, most simulation models focus on reproducing the spot and labyrinthine patterns, and on the vegetation bands which form perpendicular to surface and groundwater flow directions. Various hypotheses have been invoked to explain the formation of vegetation patterns: selective grazing by herbivores, fire, and anisotropic environmental conditions such as slope. Recently, short distance facilitation and long distance competition between vegetation (a.k.a. scale dependent feedback) has been proposed as a generic mechanism for vegetation pattern formation. In this paper, we test the generality of this mechanism by employing an existing, spatially explicit, advection-reaction-diffusion type model to describe the formation of regularly spaced vegetation bands, including those that are parallel to flow direction. Such vegetation patterns are, for example, characteristic of the ridge and slough habitat in the Florida Everglades, which are thought to have formed parallel to the prevailing surface water flow direction. To our knowledge, this is the first time that a simple model encompassing a nutrient accumulation mechanism along with biomass development and flow is used to demonstrate the formation of parallel stripes. We also explore the interactive effects of plant transpiration, slope and anisotropic hydraulic conductivity on the resulting vegetation pattern. Our results highlight the ability of the short distance facilitation and long distance competition mechanism to explain the formation of the different vegetation patterns beyond semi-arid regions. Therefore, we propose that the parallel stripes, like the other periodic patterns observed in both isotropic and anisotropic environments, are self-organized and form as a result of scale dependent feedback. Results from this study improve upon the current understanding of the formation of parallel stripes and provide a more general theoretical framework for future empirical and modeling efforts.

  19. Parallel Mean Shift for Interactive Volume Segmentation

    NASA Astrophysics Data System (ADS)

    Zhou, Fangfang; Zhao, Ying; Ma, Kwan-Liu

    In this paper we present a parallel dynamic mean shift algorithm based on path transmission for medical volume data segmentation. The algorithm first translates the volume data into a joint position-color feature space subdivided uniformly by bandwidths, and then clusters points in feature space in parallel by iteratively finding their peak points. Over iterations it improves the convergence rate by dynamically updating data points via path transmission and reduces the number of data points by collapsing overlapping points into one point. The GPU implementation of the algorithm can segment a 256x256x256 volume in 6 seconds using an NVIDIA GeForce 8800 GTX card for interactive processing, which is hundreds of times faster than its CPU implementation. We also introduce an interactive interface to segment volume data based on this GPU implementation. This interface not only provides the user with the capability to specify segmentation resolution, but also allows the user to operate on the segmented tissues and create desired visualization results.
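
    As a rough illustration of the peak-finding step this abstract describes, the sketch below runs a plain flat-kernel mean shift on one-dimensional points. It is not the authors' path-transmission or GPU algorithm; the function names `mean_shift_1d` and `label_modes`, the bandwidth, and the sample data are all assumptions chosen for illustration.

```python
def mean_shift_1d(points, bandwidth, iters=50, tol=1e-9):
    """Shift each point toward the mean of the original points within `bandwidth`.

    Hypothetical helper: a flat (uniform) kernel in one dimension.
    Returns the converged position (mode estimate) for every input point.
    """
    modes = list(points)
    for _ in range(iters):
        largest_move = 0.0
        shifted = []
        for m in modes:
            # Neighbours of the current estimate inside the kernel window.
            window = [p for p in points if abs(p - m) <= bandwidth]
            centre = sum(window) / len(window)
            largest_move = max(largest_move, abs(centre - m))
            shifted.append(centre)
        modes = shifted
        if largest_move < tol:
            break
    return modes


def label_modes(modes, merge_tol=1e-3):
    """Collapse modes that coincide (within merge_tol) into cluster labels."""
    centres, labels = [], []
    for m in modes:
        for i, c in enumerate(centres):
            if abs(c - m) <= merge_tol:
                labels.append(i)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return centres, labels


# Two well-separated groups of toy "feature" values.
data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
centres, labels = label_modes(mean_shift_1d(data, bandwidth=1.0))
```

    Each point's update depends only on the fixed input data, so the inner loop is independent per point; that independence is what makes the method a natural fit for data-parallel hardware.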

  20. Parallelized Stochastic Cutoff Method for Long-Range Interacting Systems

    NASA Astrophysics Data System (ADS)

    Endo, Eishin; Toga, Yuta; Sasaki, Munetaka

    2015-07-01

    We present a method of parallelizing the stochastic cutoff (SCO) method, which is a Monte-Carlo method for long-range interacting systems. After interactions are eliminated by the SCO method, we subdivide a lattice into noninteracting interpenetrating sublattices. This subdivision enables us to parallelize the Monte-Carlo calculation in the SCO method. Such subdivision is found by numerically solving the vertex coloring of a graph created by the SCO method. We use an algorithm proposed by Kuhn and Wattenhofer to solve the vertex coloring by parallel computation. This method was applied to a two-dimensional magnetic dipolar system on an L × L square lattice to examine its parallelization efficiency. The result showed that, in the case of L = 2304, the speed of computation increased about 102 times by parallel computation with 288 processors.
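
    The subdivision step can be sketched as a vertex-coloring problem: sites that share a color have no remaining interaction between them and can be updated concurrently. The sketch below uses a simple serial greedy coloring, not the Kuhn and Wattenhofer parallel algorithm named in the abstract, and a nearest-neighbour periodic lattice stands in for the interaction graph left after the SCO step. All function names are hypothetical.

```python
def periodic_square_lattice(size):
    """Adjacency of a size x size periodic lattice with nearest-neighbour bonds.

    Illustrative stand-in for the graph of interactions that survive the cutoff.
    """
    adjacency = {}
    for i in range(size):
        for j in range(size):
            adjacency[(i, j)] = {
                ((i + 1) % size, j),
                ((i - 1) % size, j),
                (i, (j + 1) % size),
                (i, (j - 1) % size),
            }
    return adjacency


def greedy_coloring(adjacency):
    """Give each node the smallest color not used by an already-colored neighbour."""
    colour = {}
    for node in sorted(adjacency):
        used = {colour[n] for n in adjacency[node] if n in colour}
        c = 0
        while c in used:
            c += 1
        colour[node] = c
    return colour


def sublattices(colour):
    """Group nodes by color; each group is a set of mutually non-interacting sites."""
    groups = {}
    for node, c in colour.items():
        groups.setdefault(c, []).append(node)
    return groups


adjacency = periodic_square_lattice(4)
colour = greedy_coloring(adjacency)
groups = sublattices(colour)
```

    For this toy lattice the greedy pass recovers the familiar two-color checkerboard. A Monte Carlo sweep would then visit one color group at a time and update all sites in that group in parallel.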

  1. Interactive Imaging Science on Parallel Computers: Getting Immediate Results

    SciTech Connect

    Perrine, Kenneth A.; Jones, Donald R.

    2003-04-01

    Gigapixel-size images are used in calculations on parallel machines using the Pacific Northwest National Laboratory (PNNL) Parallel Computational Environment for Imaging Science (PiCEIS). The PiCEIS image browser allows the user to view real-time images as calculations are performed. The user can interact with the images, assign regions of interest to accelerate feedback, and alter algorithm parameters. The images may be displayed on an X11 terminal or parallel compositing hardware. The fast feedback and interactive features available within the image browser component of PiCEIS are valuable tools for imaging science.

  2. An Interactive Parallel Visualization Framework for Distributed Data

    SciTech Connect

    Perrine, Kenneth A.; Jones, Donald R.; Hochschild, Peter; Swetz, Richard A.

    2002-01-20

    A framework for parallel visualization at Pacific Northwest National Laboratory (PNNL) is being developed that utilizes the IBM Scaleable Graphics Engine (SGE) and IBM SP parallel computers. The SGE allows disjoint regions of pixel data to be transferred simultaneously from multiple compute nodes into a unified frame buffer. The joined graphics data is displayed on monitors attached to the SGE. Three parallel applications have been developed that write pixel data directly to local buffers and transfer the buffers to the SGE. A library is being developed to allow OpenGL applications to run in parallel and utilize the SGE. The library and SGE hardware will be an interactive framework for parallel visualization applications.

  3. On the Vacuum-Interaction of Two Parallel Cosmic Strings

    NASA Astrophysics Data System (ADS)

    Bordag, M.

    Cosmic strings are well known solutions of the Einstein equations. In classical physics there is no interaction between such strings. In quantum physics there is an interaction due to vacuum fluctuations like the well known Casimir effect. The interaction energy is calculated in the case of two parallel cosmic strings and shows an attractive force between them.

  4. An interactive parallel programming environment applied in atmospheric science

    NASA Technical Reports Server (NTRS)

    vonLaszewski, G.

    1996-01-01

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  5. An interactive parallel programming environment applied in atmospheric science

    SciTech Connect

    Laszewski, G. von

    1996-12-31

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  6. Parallel Graphics and Interactivity with the Scaleable Graphics Engine

    SciTech Connect

    Perrine, Kenneth A.; Jones, Donald R.

    2001-11-10

    A parallel rendering environment is being developed to utilize the IBM Scaleable Graphics Engine (SGE), a hardware frame buffer for parallel computers. Goals of this software development effort include finding efficient ways of producing and displaying graphics generated on SP nodes and of assisting programmers in adapting or creating scientific simulation applications to use the SGE. Four software development phases that utilize the SGE are discussed: tunneling, SMP Rendering, graphics API development using an OpenGL API implementation which utilizes the SGE in the parallel environment, and additions to the SGE-enabled OpenGL API implementation that uses threads. The SGE's ability to accept pixel data from multiple nodes simultaneously makes it a viable tool for use. With the performance observed in the test applications and the performance optimizations gained, programmers writing applications for IBM SPs and Linux clusters will be able to support high-speed output of graphics and be able to interact with data.

  7. IPython: components for interactive and parallel computing across disciplines. (Invited)

    NASA Astrophysics Data System (ADS)

    Perez, F.; Bussonnier, M.; Frederic, J. D.; Froehle, B. M.; Granger, B. E.; Ivanov, P.; Kluyver, T.; Patterson, E.; Ragan-Kelley, B.; Sailer, Z.

    2013-12-01

    Scientific computing is an inherently exploratory activity that requires constantly cycling between code, data and results, each time adjusting the computations as new insights and questions arise. To support such a workflow, good interactive environments are critical. The IPython project (http://ipython.org) provides a rich architecture for interactive computing with: 1. Terminal-based and graphical interactive consoles. 2. A web-based Notebook system with support for code, text, mathematical expressions, inline plots and other rich media. 3. Easy to use, high performance tools for parallel computing. Despite its roots in Python, the IPython architecture is designed in a language-agnostic way to facilitate interactive computing in any language. This allows users to mix Python with Julia, R, Octave, Ruby, Perl, Bash and more, as well as to develop native clients in other languages that reuse the IPython clients. In this talk, I will show how IPython supports all stages in the lifecycle of a scientific idea: 1. Individual exploration. 2. Collaborative development. 3. Production runs with parallel resources. 4. Publication. 5. Education. In particular, the IPython Notebook provides an environment for "literate computing" with a tight integration of narrative and computation (including parallel computing). These Notebooks are stored in a JSON-based document format that provides an "executable paper": notebooks can be version controlled, exported to HTML or PDF for publication, and used for teaching.

  8. Framework for Interactive Parallel Dataset Analysis on the Grid

    SciTech Connect

    Alexander, David A.; Ananthan, Balamurali; Johnson, Tony; Serbo, Victor; /SLAC

    2007-01-10

    We present a framework for use at a typical Grid site to facilitate custom interactive parallel dataset analysis targeting terabyte-scale datasets of the type typically produced by large multi-institutional science experiments. We summarize the needs for interactive analysis and show a prototype solution that satisfies those needs. The solution consists of a desktop client tool and a set of Web Services that allow scientists to sign onto a Grid site, compose analysis script code to carry out physics analysis on datasets, distribute the code and datasets to worker nodes, collect the results back to the client, and construct professional-quality visualizations of the results.

  9. Investigation of helicopter rotor blade/wake interactive impulsive noise

    NASA Technical Reports Server (NTRS)

    Miley, S. J.; Hall, G. F.; Vonlavante, E.

    1987-01-01

    An analysis of the Tip Aerodynamic/Aeroacoustic Test (TAAT) data was performed to identify possible aerodynamic sources of blade/vortex interaction (BVI) impulsive noise. The identification is based on correlation of measured blade pressure time histories with predicted blade/vortex intersections for the flight condition(s) where impulsive noise was detected. Due to the location of the recording microphones, only noise signatures associated with the advancing blade were available, and the analysis was accordingly restricted to the first and second azimuthal quadrants. The results show that the blade tip region is operating transonically in the azimuthal range where previous BVI experiments indicated the impulsive noise to be. No individual blade/vortex encounter is identifiable in the pressure data; however, there is indication of multiple intersections in the roll-up region which could be the origin of the noise. Discrete blade/vortex encounters are indicated in the second quadrant; however, if impulsive noise were produced here, the directivity pattern would be such that it was not recorded by the microphones. It is demonstrated that the TAAT data base is a valuable resource in the investigation of rotor aerodynamic/aeroacoustic behavior.

  10. Nanoparticle-Target Interactions Parallel Antibody-Protein Interactions

    PubMed Central

    Koh, Isaac; Hong, Rui; Weissleder, Ralph; Josephson, Lee

    2009-01-01

    Magnetic particles can act as magnetic relaxation switches (MRSw's) when they bind to target analytes, and switch between their dispersed and aggregated states resulting in changes in the spin-spin relaxation time (T2) of their surrounding water protons. Both nanoparticles (NPs, 10-100 nm) and micron-sized particles (MPs) have been employed as MRSw's, to sense drugs, metabolites, oligonucleotides, proteins, bacteria and mammalian cells. To better understand how NPs or MPs interact with targets, we employed as a molecular recognition system the reaction between the Tag peptide of the influenza virus hemagglutinin and a monoclonal antibody to that peptide (anti-Tag). To obtain targets of different size and valency, we attached the Tag peptide to BSA (Mw = 65000 Daltons, diameter = 8 nm) and to Latex spheres (diameter = 900 nm). To obtain magnetic probes of very different sizes, anti-Tag was conjugated to 40 nm NPs and 1 μm MPs. MP and NP probes reacted with Tag peptide targets in a manner similar to antibody/antigen reactions in solution, exhibiting so-called prozone effects. MPs detected all types of targets with higher sensitivity than NPs with targets of higher valency being better detected than those of lower valency. The Tag/anti-tag recognition system can be used to synthesize combinations of molecular targets and magnetic probes, to more fully understand the aggregation reaction that occurs when probes bind targets in solution and the ensuing changes in water relaxation times that result. PMID:19323458

  11. Parallel algorithms for interactive manipulation of digital terrain models

    NASA Technical Reports Server (NTRS)

    Davis, E. W.; Mcallister, D. F.; Nagaraj, V.

    1988-01-01

    Interactive three-dimensional graphics applications, such as terrain data representation and manipulation, require extensive arithmetic processing. Massively parallel machines are attractive for this application since they offer high computational rates, and grid connected architectures provide a natural mapping for grid based terrain models. Presented here are algorithms for data movement on the massive parallel processor (MPP) in support of pan and zoom functions over large data grids. It is an extension of earlier work that demonstrated real-time performance of graphics functions on grids that were equal in size to the physical dimensions of the MPP. When the dimensions of a data grid exceed the processing array size, data is packed in the array memory. Windows of the total data grid are interactively selected for processing. Movement of packed data is needed to distribute items across the array for efficient parallel processing. Execution time for data movement was found to exceed that for arithmetic aspects of graphics functions. Performance figures are given for routines written in MPP Pascal.

  12. Long-range interactions and parallel scalability in molecular simulations

    NASA Astrophysics Data System (ADS)

    Patra, Michael; Hyvönen, Marja T.; Falck, Emma; Sabouri-Ghomi, Mohsen; Vattulainen, Ilpo; Karttunen, Mikko

    2007-01-01

    Typical biomolecular systems such as cellular membranes, DNA, and protein complexes are highly charged. Thus, efficient and accurate treatment of electrostatic interactions is of great importance in computational modeling of such systems. We have employed the GROMACS simulation package to perform extensive benchmarking of different commonly used electrostatic schemes on a range of computer architectures (Pentium-4, IBM Power 4, and Apple/IBM G5) for single processor and parallel performance up to 8 nodes—we have also tested the scalability on four different networks, namely Infiniband, GigaBit Ethernet, Fast Ethernet, and nearly uniform memory architecture, i.e. communication between CPUs is possible by directly reading from or writing to other CPUs' local memory. It turns out that the particle-mesh Ewald method (PME) performs surprisingly well and offers competitive performance unless parallel runs on PC hardware with older network infrastructure are needed. Lipid bilayers of sizes 128, 512 and 2048 lipid molecules were used as the test systems representing typical cases encountered in biomolecular simulations. Our results enable an accurate prediction of computational speed on most current computing systems, both for serial and parallel runs. These results should be helpful in, for example, choosing the most suitable configuration for a small departmental computer cluster.

  13. A multimodal parallel architecture: A cognitive framework for multimodal interactions.

    PubMed

    Cohn, Neil

    2016-01-01

    Human communication is naturally multimodal, and substantial focus has examined the semantic correspondences in speech-gesture and text-image relationships. However, visual narratives, like those in comics, provide an interesting challenge to multimodal communication because the words and/or images can guide the overall meaning, and both modalities can appear in complicated "grammatical" sequences: sentences use a syntactic structure and sequential images use a narrative structure. These dual structures create complexity beyond that typically addressed by theories of multimodality where only a single form uses combinatorial structure, and also pose challenges for models of the linguistic system that focus on single modalities. This paper outlines a broad theoretical framework for multimodal interactions by expanding on Jackendoff's (2002) parallel architecture for language. Multimodal interactions are characterized in terms of their component cognitive structures: whether a particular modality (verbal, bodily, visual) is present, whether it uses a grammatical structure (syntax, narrative), and whether it "dominates" the semantics of the overall expression. Altogether, this approach integrates multimodal interactions into an existing framework of language and cognition, and characterizes interactions between varying complexity in the verbal, bodily, and graphic domains. The resulting theoretical model presents an expanded consideration of the boundaries of the "linguistic" system and its involvement in multimodal interactions, with a framework that can benefit research on corpus analyses, experimentation, and the educational benefits of multimodality. PMID:26491835

  14. Parallel Force Assay for Protein-Protein Interactions

    PubMed Central

    Aschenbrenner, Daniela; Pippig, Diana A.; Klamecka, Kamila; Limmer, Katja; Leonhardt, Heinrich; Gaub, Hermann E.

    2014-01-01

    Quantitative proteome research is greatly promoted by high-resolution parallel format assays. A characterization of protein complexes based on binding forces offers an unparalleled dynamic range and allows for the effective discrimination of non-specific interactions. Here we present a DNA-based Molecular Force Assay to quantify protein-protein interactions, namely the bond between different variants of GFP and GFP-binding nanobodies. We present different strategies to adjust the maximum sensitivity window of the assay by influencing the binding strength of the DNA reference duplexes. The binding of the nanobody Enhancer to the different GFP constructs is compared at high sensitivity of the assay. Whereas the binding strength to wild type and enhanced GFP are equal within experimental error, stronger binding to superfolder GFP is observed. This difference in binding strength is attributed to alterations in the amino acids that form contacts according to the crystal structure of the initial wild type GFP-Enhancer complex. Moreover, we outline the potential for large-scale parallelization of the assay. PMID:25546146

  15. Highly parallel characterization of IgG Fc binding interactions.

    PubMed

    Boesch, Austin W; Brown, Eric P; Cheng, Hao D; Ofori, Maame Ofua; Normandin, Erica; Nigrovic, Peter A; Alter, Galit; Ackerman, Margaret E

    2014-01-01

    Because the variable ability of the antibody constant (Fc) domain to recruit innate immune effector cells and complement is a major factor in antibody activity in vivo, convenient means of assessing these binding interactions is of high relevance to the development of enhanced antibody therapeutics, and to understanding the protective or pathogenic antibody response to infection, vaccination, and self. Here, we describe a highly parallel microsphere assay to rapidly assess the ability of antibodies to bind to a suite of antibody receptors. Fc and glycan binding proteins such as FcγR and lectins were conjugated to coded microspheres and the ability of antibodies to interact with these receptors was quantified. We demonstrate qualitative and quantitative assessment of binding preferences and affinities across IgG subclasses, Fc domain point mutants, and antibodies with variant glycosylation. This method can serve as a rapid proxy for biophysical methods that require substantial sample quantities, high-end instrumentation, and serial analysis across multiple binding interactions, thereby offering a useful means to characterize monoclonal antibodies, clinical antibody samples, and antibody mimics, or alternatively, to investigate the binding preferences of candidate Fc receptors. PMID:24927273

  16. Social interaction shapes babbling: Testing parallels between birdsong and speech

    NASA Astrophysics Data System (ADS)

    Goldstein, Michael H.; King, Andrew P.; West, Meredith J.

    2003-06-01

    Birdsong is considered a model of human speech development at behavioral and neural levels. Few direct tests of the proposed analogs exist, however. Here we test a mechanism of phonological development in human infants that is based on social shaping, a selective learning process first documented in songbirds. By manipulating mothers' reactions to their 8-month-old infants' vocalizations, we demonstrate that phonological features of babbling are sensitive to nonimitative social stimulation. Contingent, but not noncontingent, maternal behavior facilitates more complex and mature vocal behavior. Changes in vocalizations persist after the manipulation. The data show that human infants use social feedback, facilitating immediate transitions in vocal behavior. Social interaction creates rapid shifts to developmentally more advanced sounds. These transitions mirror the normal development of speech, supporting the predictions of the avian social shaping model. These data provide strong support for a parallel in function between vocal precursors of songbirds and infants. Because imitation is usually considered the mechanism for vocal learning in both taxa, the findings introduce social shaping as a general process underlying the development of speech and song.

  17. Vortex-Surface Interaction Noise: a Compendium of Worked Examples

    NASA Astrophysics Data System (ADS)

    ABOU-HUSSEIN, H.; DEBENEDICTIS, A.; HARRISON, N.; KIM, M.; RODRIGUES, M. A.; ZAGADOU, F.; HOWE, M. S.

    2002-05-01

    Students attending a graduate course on the Theory of Vortex Sound given recently at Boston University were required to investigate the low Mach number unsteady flow and the accompanying acoustic radiation for a selection of idealized flow-structure interactions. These included linear and non-linear parallel blade-vortex interactions for two-dimensional airfoils, and for finite span airfoils of variable chord; interactions between line vortices and surface projections from a plane wall; bluff-body interactions involving line and ring vortices impinging on circular cylindrical and spherical bodies, and vortex motion in the neighborhood of a wall aperture. In all cases, the effective source region was localized in either two or three dimensions, and could be regarded as acoustically compact, and the sound was calculated by routine numerical methods using the theory of compact Green functions. The results are collected together in this paper as a compendium of canonical solutions that provide qualitative and quantitative insight into the mechanisms responsible for sound production, and a database that can be used to validate predictions of more generally applicable numerical schemes.

  18. Bayesian seismic tomography by parallel interacting Markov chains

    NASA Astrophysics Data System (ADS)

    Gesret, Alexandrine; Bottero, Alexis; Romary, Thomas; Noble, Mark; Desassis, Nicolas

    2014-05-01

    The velocity field estimated by first arrival traveltime tomography is commonly used as a starting point for further seismological, mineralogical, tectonic or similar analysis. To interpret the results quantitatively, the tomography uncertainty values and their spatial distribution are required. The estimated velocity model is obtained through inverse modeling by minimizing an objective function that compares observed and computed traveltimes. This step is often performed by gradient-based optimization algorithms. The major drawback of such local optimization schemes, beyond the possibility of being trapped in a local minimum, is that they do not account for the multiple possible solutions of the inverse problem. They are therefore unable to assess the uncertainties linked to the solution. Within a Bayesian (probabilistic) framework, solving the tomography inverse problem aims at estimating the posterior probability density function of the velocity model using a global sampling algorithm. Markov chain Monte Carlo (MCMC) methods are known to produce samples of virtually any distribution. In such a Bayesian inversion, the total number of simulations we can afford depends strongly on the computational cost of the forward model. Although fast algorithms have recently been developed for computing first arrival traveltimes of seismic waves, a complete exploration of the posterior distribution of the velocity model is rarely achieved, especially when it is high dimensional and/or multimodal. In the latter case, the chain may even remain stuck in one of the modes. To improve the mixing properties of classical single-chain MCMC, we propose to make several Markov chains at different temperatures interact. This method can make efficient use of large CPU clusters without increasing the global computational cost with respect to classical MCMC, and is therefore particularly suited to Bayesian inversion. The exchanges between the chains allow a precise sampling of the high probability zones of the model space while preventing the chains from becoming stuck in a probability maximum. This approach thus supplies a robust way to analyze the tomography imaging uncertainties. The interacting MCMC approach is illustrated on two synthetic examples of tomography of calibration shots such as those encountered in induced microseismic studies. In the second application, a wavelet-based model parameterization is presented that significantly reduces the dimension of the problem, thus making the algorithm efficient even for a complex velocity model.
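
    The idea of interacting chains at different temperatures can be illustrated with a minimal parallel-tempering sketch. This is not the authors' implementation: the one-dimensional bimodal target `log_posterior`, the temperature ladder, the Gaussian proposal width and the swap schedule are all assumptions chosen for illustration, standing in for a real tomography posterior.

```python
import math
import random


def log_posterior(x):
    """Toy bimodal target (an assumption, not a tomography posterior).

    Unnormalized mixture of two unit-variance Gaussians centred at -4 and +4.
    """
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def parallel_tempering(n_steps, temperatures, step=1.0, seed=0):
    """Run one Metropolis chain per temperature and let neighbours swap states.

    temperatures[0] is expected to be 1.0; only that chain samples the
    untempered target, so only its states are returned.
    """
    rng = random.Random(seed)
    states = [0.0 for _ in temperatures]
    samples = []
    for _ in range(n_steps):
        # Within-chain Metropolis update on the tempered target p(x)^(1/T).
        for i, temp in enumerate(temperatures):
            proposal = states[i] + rng.gauss(0.0, step)
            log_ratio = (log_posterior(proposal) - log_posterior(states[i])) / temp
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            if math.log(1.0 - rng.random()) < log_ratio:
                states[i] = proposal
        # Propose exchanging the states of one random pair of adjacent chains.
        i = rng.randrange(len(temperatures) - 1)
        beta_i = 1.0 / temperatures[i]
        beta_j = 1.0 / temperatures[i + 1]
        log_swap = (beta_i - beta_j) * (
            log_posterior(states[i + 1]) - log_posterior(states[i])
        )
        if math.log(1.0 - rng.random()) < log_swap:
            states[i], states[i + 1] = states[i + 1], states[i]
        samples.append(states[0])
    return samples


samples = parallel_tempering(20000, [1.0, 4.0, 16.0])
```

    In this sketch the hot chains cross the low-probability region between the modes easily, and the swaps pass those states down to the chain at temperature 1, which is the mechanism the abstract describes for avoiding a chain stuck in a single mode. Each chain's update is independent between swaps, so in a cluster setting the chains could be run on separate processors.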

  19. Highly parallel measurements of interaction kinetic constants with a microfabricated optomechanical device

    NASA Astrophysics Data System (ADS)

    Bates, Steven R.; Quake, Stephen R.

    2009-08-01

    We used mechanical trapping of molecular interactions to demonstrate a highly parallel approach to measure the kinetics of biomolecular interactions. This approach consumes 25 fmol of material per measurement and permits 320 measurements in a single experiment. We measured association and dissociation curves for the interactions of 6-His and T7 epitope tags with their antibodies, from which we determined the off rates, on rates, and dissociation constants.

  20. Solar wind interaction with Venus and Mars in a parallel hybrid code

    NASA Astrophysics Data System (ADS)

    Jarvinen, Riku; Sandroos, Arto

    2013-04-01

    We discuss the development and applications of a new parallel hybrid simulation, where ions are treated as particles and electrons as a charge-neutralizing fluid, for the interaction between the solar wind and Venus and Mars. The new simulation code under construction is based on the algorithm of the sequential global planetary hybrid model developed at the Finnish Meteorological Institute (FMI) and on the Corsair parallel simulation platform, also developed at the FMI. The FMI's sequential hybrid model has been used for studies of plasma interactions of several unmagnetized and weakly magnetized celestial bodies for more than a decade. In particular, the model has been used to interpret in situ particle and magnetic field observations from the plasma environments of Mars, Venus and Titan. Corsair is an open source MPI (Message Passing Interface) particle and mesh simulation platform, aimed mainly at simulations of diffusive shock acceleration in the solar corona and interplanetary space, which is now also being extended for global planetary hybrid simulations. In this presentation we discuss challenges and strategies of parallelizing a legacy simulation code as well as possible applications and prospects of a scalable parallel hybrid model for the solar wind interactions of Venus and Mars.

  1. Parallel BDD-based monolithic approach for acoustic fluid-structure interaction

    NASA Astrophysics Data System (ADS)

    Minami, Satsuki; Kawai, Hiroshi; Yoshimura, Shinobu

    2012-12-01

    Parallel BDD-based monolithic algorithms for acoustic fluid-structure interaction problems are developed. In a previous study, two schemes, NN-I + CGC-FULL and NN-I + CGC-DIAG, have been proven to be efficient among several BDD-type schemes for one processor. Thus, the parallelization of these schemes is discussed in the present study. These BDD-type schemes consist of the operations of the Schur complement matrix-vector (Sv) product, Neumann-Neumann (NN) preconditioning, and the coarse problem. In the present study, the Sv product and NN preconditioning are parallelized for both schemes, and the parallel implementation of the solid and fluid parts of the coarse problem is considered for NN-I + CGC-DIAG. The results of numerical experiments indicate that both schemes exhibit performances that are almost as good as those of single solid and fluid analyses in the Sv product and NN preconditioning. Moreover, NN-I + CGC-DIAG appears to become more efficient as the problem size becomes large due to the parallel calculation of the coarse problem.

  2. A Theory of Interactive Parallel Processing: New Capacity Measures and Predictions for a Response Time Inequality Series

    ERIC Educational Resources Information Center

    Townsend, James T.; Wenger, Michael J.

    2004-01-01

    The authors present a theory of stochastic interactive parallel processing with special emphasis on channel interactions and their relation to system capacity. The approach is based both on linear systems theory augmented with stochastic elements and decisional operators and on a metatheory of parallel channels' dependencies that incorporates

  3. Interactions between glide dislocations and parallel interfacial dislocations in nanoscale strained layers

    SciTech Connect

    Akasheh, F.; Zbib, H. M.; Hirth, J. P.; Hoagland, R. G.; Misra, A.

    2007-08-01

    Plastic deformation in nanoscale multilayered structures is thought to proceed by the successive propagation of single dislocation loops at the interfaces. Based on this view, we simulate the effect of predeposited interfacial dislocation on the stress (channeling stress) needed to propagate a new loop parallel to existing loops. Single interfacial dislocations as well as finite parallel arrays are considered in the computation. When the gliding dislocation and the predeposited interfacial array have collinear Burgers vectors, the channeling stress increases monotonically as the density of dislocations in the array increases. In the case when their Burgers vectors are inclined at 60°, a regime of perfect plasticity is observed which can be traced back to an instability in the flow stress arising from the interaction between the glide dislocation and a single interfacial dislocation dipole. This interaction leads to a tendency for dislocations of alternating Burgers vectors to propagate during deformation, leading to nonuniform arrays. Inclusion of these parallel interactions in the analysis improves the strength predictions as compared with the measured strength of a Cu-Ni multilayered system in the regime where isolated glide dislocation motion controls flow, but does not help to explain the observed strength saturation when the individual layer thickness is in the few nanometer range.

  4. Parallel implementation of three-dimensional molecular dynamic simulation for laser-cluster interaction

    SciTech Connect

    Holkundkar, Amol R.

    2013-11-15

    The objective of this article is to report the parallel implementation of the 3D molecular dynamic simulation code for laser-cluster interactions. The benchmarking of the code has been done by comparing the simulation results with some of the experiments reported in the literature. Scaling laws for the computational time are established by varying the number of processor cores and number of macroparticles used. The capabilities of the code are highlighted by implementing various diagnostic tools. To study the dynamics of the laser-cluster interactions, the executable version of the code is available from the author.

  5. Parallel processing

    SciTech Connect

    Krishnamurthy, E.V.

    1989-01-01

    This book provides an introduction to the fundamental principles and practice of parallel processing. After a general introduction to the many facets of parallelism, the first part of the book is devoted to the development of a coherent theoretical framework. Particular attention is paid to the modeling, semantics and complexity of interacting parallel processes. The second part of the book considers the more practical aspects such as parallel processor architecture, parallel and distributed programming, and concurrent transaction handling in databases.

  6. Engineering of parallel plasmonic-photonic interactions for on-chip refractive index sensors

    NASA Astrophysics Data System (ADS)

    Lin, Linhan; Zheng, Yuebing

    2015-07-01

    Ultra-narrow linewidth in the extinction spectrum of noble metal nanoparticle arrays induced by the lattice plasmon resonances (LPRs) is of great significance for applications in plasmonic lasers and plasmonic sensors. However, the challenge of sustaining LPRs in an asymmetric environment greatly restricts their practical applications, especially for high-performance on-chip plasmonic sensors. Herein, we fully study the parallel plasmonic-photonic interactions in both the Au nanodisk arrays (NDAs) and the core/shell SiO2/Au nanocylinder arrays (NCAs). Different from the dipolar interactions in the conventionally studied orthogonal coupling, the horizontal propagating electric field introduces the out-of-plane "hot spots" and results in electric field delocalization. Through controlling the aspect ratio to manipulate the "hot spot" distributions of the localized surface plasmon resonances (LSPRs) in the NCAs, we demonstrate a high-performance refractive index sensor with a wide dynamic range of refractive indexes ranging from 1.0 to 1.5. Both high figure of merit (FOM) and high signal-to-noise ratio (SNR) can be maintained under these detectable refractive indices. Furthermore, the electromagnetic field distributions confirm that the high FOM in the wide dynamic range is attributed to the parallel coupling between the superstrate diffraction orders and the height-induced LSPR modes. Our study on the near-field "hot-spot" engineering and far-field parallel coupling paves the way towards improved understanding of the parallel LPRs and the design of high-performance on-chip refractive index sensors. Electronic supplementary information (ESI) available. See DOI: 10.1039/c5nr03159a

  7. Formation of electron kappa distributions due to interactions with parallel propagating whistler waves

    SciTech Connect

    Tao, X.; Lu, Q. (Mengcheng National Geophysical Observatory, School of Earth and Space Sciences, University of Science and Technology of China, Hefei, Anhui 230026)

    2014-02-15

    In space plasmas, charged particles are frequently observed to possess a high-energy tail, which is often modeled by a kappa-type distribution function. In this work, the formation of the electron kappa distribution during the generation of parallel propagating whistler waves is investigated using fully nonlinear particle-in-cell (PIC) simulations. Previous research concluded that the bi-Maxwellian character of electron distributions is preserved in PIC simulations. We now demonstrate that for interactions between electrons and parallel propagating whistler waves, a non-Maxwellian high-energy tail can be formed, and a kappa distribution can be used to fit the electron distribution in the time-asymptotic limit. The κ-parameter is found to decrease with increasing initial temperature anisotropy or decreasing ratio of electron plasma frequency to cyclotron frequency. The results might be helpful for understanding the origin of electron kappa distributions observed in space plasmas.

  8. Engineering of parallel plasmonic-photonic interactions for on-chip refractive index sensors.

    PubMed

    Lin, Linhan; Zheng, Yuebing

    2015-07-28

    Ultra-narrow linewidth in the extinction spectrum of noble metal nanoparticle arrays induced by the lattice plasmon resonances (LPRs) is of great significance for applications in plasmonic lasers and plasmonic sensors. However, the challenge of sustaining LPRs in an asymmetric environment greatly restricts their practical applications, especially for high-performance on-chip plasmonic sensors. Herein, we fully study the parallel plasmonic-photonic interactions in both the Au nanodisk arrays (NDAs) and the core/shell SiO2/Au nanocylinder arrays (NCAs). Different from the dipolar interactions in the conventionally studied orthogonal coupling, the horizontal propagating electric field introduces the out-of-plane "hot spots" and results in electric field delocalization. Through controlling the aspect ratio to manipulate the "hot spot" distributions of the localized surface plasmon resonances (LSPRs) in the NCAs, we demonstrate a high-performance refractive index sensor with a wide dynamic range of refractive indexes ranging from 1.0 to 1.5. Both high figure of merit (FOM) and high signal-to-noise ratio (SNR) can be maintained under these detectable refractive indices. Furthermore, the electromagnetic field distributions confirm that the high FOM in the wide dynamic range is attributed to the parallel coupling between the superstrate diffraction orders and the height-induced LSPR modes. Our study on the near-field "hot-spot" engineering and far-field parallel coupling paves the way towards improved understanding of the parallel LPRs and the design of high-performance on-chip refractive index sensors. PMID:26133011

  9. Propeller tip vortex interactions

    NASA Technical Reports Server (NTRS)

    Johnston, Robert T.; Sullivan, John P.

    1990-01-01

    Propeller wakes interacting with aircraft aerodynamic surfaces are a source of noise and vibration. For this reason, flow visualization work on the motion of the helical tip vortex over a wing and through the second stage of a counterrotation propeller (CRP) has been pursued. Initially, work was done on the motion of a propeller helix as it passes over the center of a 9.0 aspect ratio wing. The propeller tip vortex experiences significant spanwise displacements when passing across a lifting wing. A stationary propeller blade or stator was installed behind the rotating propeller to model the blade vortex interaction in a CRP. The resulting vortex interaction was found to depend on the relative vortex strengths and vortex sign.

  10. Nice Guys Finish Fast and Bad Guys Finish Last: Facilitatory vs. Inhibitory Interaction in Parallel Systems

    PubMed Central

    Eidels, Ami; Houpt, Joseph W.; Altieri, Nicholas; Pei, Lei; Townsend, James T.

    2011-01-01

    Systems Factorial Technology is a powerful framework for investigating the fundamental properties of human information processing such as architecture (i.e., serial or parallel processing) and capacity (how processing efficiency is affected by increased workload). The Survivor Interaction Contrast (SIC) and the Capacity Coefficient are effective measures in determining these underlying properties, based on response-time data. Each of the different architectures, under the assumption of independent processing, predicts a specific form of the SIC along with some range of capacity. In this study, we explored SIC predictions of discrete-state (Markov process) and continuous-state (Linear Dynamic) models that allow for certain types of cross-channel interaction. The interaction can be facilitatory or inhibitory: one channel can either facilitate or slow down processing in its counterpart. Despite the relative generality of these models, the combination of the architecture-oriented and capacity-oriented analyses provides for precise identification of the underlying system. PMID:21516183

  11. Parallel algorithms and applications of configuration-interaction shell-model code BIGSTICK

    NASA Astrophysics Data System (ADS)

    Krastev, Plamen; Johnson, Calvin; Ormand, Erich

    2010-11-01

    The nuclear shell model, together with two- and three-body interactions, is a powerful tool for gaining insight into the properties of light nuclei. The aid of advanced computer resources is of major importance in such calculations. We report on the latest developments and applications of the configuration-interaction shell-model code BIGSTICK, an efficient parallel on-the-fly code which solves the nuclear many-body problem with both two- and three-body interactions. The US Department of Energy supported this investigation through Contract Nos. DE-FG02-96ER40985 and DE-FC02-09ER41587 and through Subcontract No. B576152 of the Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344.

  12. Orbital-based insights into parallel-displaced and twisted conformations in π-π interactions.

    PubMed

    Lutz, Patricia B; Bayse, Craig A

    2013-06-21

    Dispersion and electrostatics are known to stabilize π-π interactions, but the preference for parallel-displaced (PD) and/or twisted (TW) over sandwiched (S) conformations is not well understood. Orbital interactions are generally believed to play little to no role in π-stacking. However, orbital analysis of the dimers of benzene, pyridine, cytosine and several polyaromatic hydrocarbons demonstrates that PD and/or TW structures convert one or more π-type dimer MOs with out-of-phase or antibonding inter-ring character at the S stack to in-phase or bonding in the PD/TW stack. This change in dimer MO character can be described in terms of a qualitative stack bond order (SBO) defined as the difference between the number of occupied in-phase/bonding and out-of-phase/antibonding inter-ring π-type MOs. The concept of an SBO is introduced here in analogy to the bond order in molecular orbital theory. Thus, whereas the SBO of the S structure is zero, parallel displacement or twisting the stack results in a non-zero SBO and overall bonding character. The shift in bonding/antibonding character found at optimal PD/TW structures maximizes the inter-ring density, as measured by intermolecular Wiberg bond indices (WBIs). Values of WBIs calculated as a function of the parallel-displacement are found to correlate with the dispersion and other contributions to the π-π interaction energy determined by the highly accurate density-fitting DFT symmetry adapted perturbation theory (DF-DFT-SAPT) method. These DF-DFT-SAPT calculations also suggest that the dispersion and other contributions are maximized at the PD conformation rather than the S when conducted on a potential energy curve where the inter-ring distance is optimized at fixed slip distances. From the results of this study, we conclude that descriptions of the qualitative manner in which orbitals interact within π-stacking interactions can supplement high-level calculations of the interaction energy and provide an intuitive tool for applications to crystal design, molecular recognition and other fields where non-covalent interactions are important. PMID:23665910

  13. Dynamical interaction effects on an electric dipole moving parallel to a flat solid surface

    SciTech Connect

    Villo-Perez, Isidro; Abril, Isabel; Garcia-Molina, Rafael; Arista, Nestor R.

    2005-05-15

    The interaction experienced by a fast electric dipole moving parallel and close to a flat solid surface is studied using the dielectric formalism. Analytical expressions for the force acting on the dipole, for random and for particular orientations, are obtained. Several features related to the dynamical effects on the induced forces are discussed, and numerical values are obtained for the different cases. The calculated energy loss of the electric dipole provides useful estimations which could be of interest for small-angle scattering experiments using polar molecules.

  14. Parallel on-the-fly configuration-interaction shell-model code

    NASA Astrophysics Data System (ADS)

    Ormand, William; Johnson, Calvin; Krastev, Plamen

    2009-10-01

    Configuration-interaction shell-model codes generally rely on computing and storing the full many-body Hamiltonian matrix, which, while sparse, nonetheless pushes computational memory demands, especially when the number of basis states approaches 10^8 and beyond. On-the-fly algorithms mitigate the memory burden by factorizing both the basis and the Hamiltonian. We describe BIGSTICK, an efficient on-the-fly code designed for large-scale parallel operation with both two- and three-body interactions. We present algorithm developments utilizing MPI, OpenMP, and hybrid schemes. Prepared by LLNL under Contract DE-AC52-07NA27344. Support from U.S. DOE/SC/NP (Work Proposal No. SCW0498) and U.S. DOE Grants DE-FG02-03ER41272 and DE-FC02-09ER41587 is acknowledged.

  15. Efficient Parallel Analysis of Shell-fluid Interaction Problem by Using Monolithic Method Based on Consistent Pressure Poisson Equation

    NASA Astrophysics Data System (ADS)

    Ishihara, Daisuke; Kanei, Shigeo; Yoshimura, Shinobu; Horie, Tomoyoshi

    In this paper, a parallel monolithic method for shell-fluid interaction based on the consistent Pressure Poisson Equation (PPE) is developed and its parallel computational efficiency is demonstrated. The Conjugate Gradient (CG) method without any preconditioner works well to solve the consistent PPE, even though the coefficient matrix of the original coupled equation system becomes ill-conditioned due to (a) the inhomogeneity of submatrix elements between the fluid and the structure and (b) the ill-conditioned submatrix of shell structure. Thus our parallel monolithic method using the consistent PPE and the CG method without any preconditioner is efficient for parallel analyses of shell-fluid interaction problems. The present parallel solution procedure is based on the mesh decomposition. To demonstrate the performances of the developed method, it is applied to simulate the vibration of an elastic plate situated in the wake of a rectangular prism and a flapping elastic wing in quiescent fluid.

  16. Electromagnetic semitransparent δ-function plate: Casimir interaction energy between parallel infinitesimally thin plates

    NASA Astrophysics Data System (ADS)

    Parashar, Prachi; Milton, Kimball A.; Shajesh, K. V.; Schaden, M.

    2012-10-01

    We derive boundary conditions for electromagnetic fields on a δ-function plate. The optical properties of such a plate are shown to necessarily be anisotropic in that they only depend on the transverse properties of the plate. We unambiguously obtain the boundary conditions for a perfectly conducting δ-function plate in the limit of infinite dielectric response. We show that a material does not “optically vanish” in the thin-plate limit. The thin-plate limit of a plasma slab of thickness d with plasma frequency ω_p^2 = ζ_p/d reduces to a δ-function plate for frequencies (ω = iζ) satisfying ζd ≪ ζ_p d ≪ 1. We show that the Casimir interaction energy between two parallel perfectly conducting δ-function plates is the same as that for parallel perfectly conducting slabs. Similarly, we show that the interaction energy between an atom and a perfect electrically conducting δ-function plate is the usual Casimir-Polder energy, which is verified by considering the thin-plate limit of dielectric slabs. The “thick” and “thin” boundary conditions considered by Bordag are found to be identical in the sense that they lead to the same electromagnetic fields.

  17. Parallel changes of taxonomic interaction networks in lacustrine bacterial communities induced by a polymetallic perturbation

    PubMed Central

    Laplante, Karine; Boutin, Sébastien; Derome, Nicolas

    2013-01-01

    Heavy metals released by anthropogenic activities such as mining trigger profound changes to bacterial communities. In this study we used 16S SSU rRNA gene high-throughput sequencing to characterize the impact of a polymetallic perturbation and other environmental parameters on taxonomic networks within five lacustrine bacterial communities from sites located near Rouyn-Noranda, Quebec, Canada. The results showed that community equilibrium was disturbed in terms of both diversity and structure. Moreover, heavy metals, especially cadmium combined with water acidity, induced parallel changes among sites via the selection of resistant OTUs (Operational Taxonomic Unit) and taxonomic dominance perturbations favoring the Alphaproteobacteria. Furthermore, under a similar selective pressure, covariation trends between phyla revealed conservation and parallelism within interphylum interactions. Our study sheds light on the importance of analyzing communities not only from a phylogenetic perspective but also including a quantitative approach to provide significant insights into the evolutionary forces that shape the dynamic of the taxonomic interaction networks in bacterial communities. PMID:23789031

  18. A Force-Based, Parallel Assay for the Quantification of Protein-DNA Interactions

    PubMed Central

    Limmer, Katja; Pippig, Diana A.; Aschenbrenner, Daniela; Gaub, Hermann E.

    2014-01-01

    Analysis of transcription factor binding to DNA sequences is of utmost importance to understand the intricate regulatory mechanisms that underlie gene expression. Several techniques exist that quantify DNA-protein affinity, but they are either very time-consuming or suffer from possible misinterpretation due to complicated algorithms or approximations like many high-throughput techniques. We present a more direct method to quantify DNA-protein interaction in a force-based assay. In contrast to single-molecule force spectroscopy, our technique, the Molecular Force Assay (MFA), parallelizes force measurements so that it can test one or multiple proteins against several DNA sequences in a single experiment. The interaction strength is quantified by comparison to the well-defined rupture stability of different DNA duplexes. As a proof-of-principle, we measured the interaction of the zinc finger construct Zif268/NRE against six different DNA constructs. We could show the specificity of our approach and quantify the strength of the protein-DNA interaction. PMID:24586920

  19. Determination of interaction forces between parallel dislocations by the evaluation of J integrals of plane elasticity

    NASA Astrophysics Data System (ADS)

    Lubarda, Vlado A.

    2015-05-01

    The Peach-Koehler expressions for the glide and climb components of the force exerted on a straight dislocation in an infinite isotropic medium by another straight dislocation are derived by evaluating the plane and antiplane strain versions of J integrals around the center of the dislocation. After expressing the elastic fields as the sums of elastic fields of each dislocation, the energy momentum tensor is decomposed into three parts. It is shown that only one part, involving mixed products from the two dislocation fields, makes a nonvanishing contribution to J integrals and the corresponding dislocation forces. Three examples are considered, with dislocations on parallel or intersecting slip planes. For two edge dislocations on orthogonal slip planes, there are two equilibrium configurations in which the glide and climb components of the dislocation force simultaneously vanish. The interactions between two different types of screw dislocations and a nearby circular void, as well as between parallel line forces in an infinite or semi-infinite medium, are then evaluated.

  20. Determination of interaction forces between parallel dislocations by the evaluation of J integrals of plane elasticity

    NASA Astrophysics Data System (ADS)

    Lubarda, Vlado A.

    2016-03-01

    The Peach-Koehler expressions for the glide and climb components of the force exerted on a straight dislocation in an infinite isotropic medium by another straight dislocation are derived by evaluating the plane and antiplane strain versions of J integrals around the center of the dislocation. After expressing the elastic fields as the sums of elastic fields of each dislocation, the energy momentum tensor is decomposed into three parts. It is shown that only one part, involving mixed products from the two dislocation fields, makes a nonvanishing contribution to J integrals and the corresponding dislocation forces. Three examples are considered, with dislocations on parallel or intersecting slip planes. For two edge dislocations on orthogonal slip planes, there are two equilibrium configurations in which the glide and climb components of the dislocation force simultaneously vanish. The interactions between two different types of screw dislocations and a nearby circular void, as well as between parallel line forces in an infinite or semi-infinite medium, are then evaluated.

  1. Use of Hilbert Curves in Parallelized CUDA code: Interaction of Interstellar Atoms with the Heliosphere

    NASA Astrophysics Data System (ADS)

    Destefano, Anthony; Heerikhuisen, Jacob

    2015-04-01

    Fully 3D particle simulations can be computationally expensive and memory-intensive, especially when high-resolution grid cells are required. The problem becomes further complicated when parallelization is needed. In this work we focus on computational methods to solve these difficulties. Hilbert curves are used to map the 3D particle space to the 1D contiguous memory space. This method of organization helps minimize cache misses on the GPU and yields a sorted structure that is equivalent to an octree data structure. This type of sorted structure is attractive for use in adaptive mesh implementations because of its logarithmic search time. Implementations using the Message Passing Interface (MPI) library and NVIDIA's parallel computing platform CUDA will be compared, as MPI is commonly used on server nodes with many CPUs. We will also compare static grid structures with adaptive mesh structures. The physical test bed will be a simulation of heavy interstellar atoms interacting with a background plasma, the heliosphere, simulated with a fully consistent coupled MHD/kinetic particle code. It is known that charge exchange is an important factor in space plasmas; specifically, it modifies the structure of the heliosphere itself. We would like to thank the Alabama Supercomputer Authority for the use of their computational resources.
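
    The mapping from grid cells to a one-dimensional Hilbert index can be sketched as follows. This is a simplified two-dimensional CPU illustration in Python, not the 3D CUDA implementation the abstract refers to; the function names `hilbert_index`, `hilbert_point` and `sort_cells_by_hilbert` are hypothetical, and the grid side `n` is assumed to be a power of two. The index computation follows the widely used iterative bit-manipulation formulation of the Hilbert curve.

```python
def _rotate(n, x, y, rx, ry):
    """Rotate/flip a quadrant so the sub-curve has the right orientation."""
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y


def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its 1D index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return d


def hilbert_point(n, d):
    """Inverse mapping: 1D Hilbert index back to the cell (x, y)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rotate(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sort_cells_by_hilbert(n, cells):
    """Order cells so that neighbours along the curve are adjacent in memory."""
    return sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))
```

    Consecutive indices along the curve always correspond to grid cells that share an edge, which is why sorting particles by this key tends to keep spatially nearby particles close together in memory.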

  2. Massively parallel measurements of molecular interaction kinetics on a microfluidic platform

    PubMed Central

    Geertz, Marcel; Shore, David; Maerkl, Sebastian J.

    2012-01-01

    Quantitative biology requires quantitative data. No high-throughput technologies exist capable of obtaining several hundred independent kinetic binding measurements in a single experiment. We present an integrated microfluidic device (k-MITOMI) for the simultaneous kinetic characterization of 768 biomolecular interactions. We applied k-MITOMI to the kinetic analysis of transcription factor (TF)-DNA interactions, measuring the detailed kinetic landscapes of the mouse TF Zif268, and the yeast TFs Tye7p, Yox1p, and Tbf1p. We demonstrated the integrated nature of k-MITOMI by expressing, purifying, and characterizing 27 additional yeast transcription factors in parallel on a single device. Overall, we obtained 2,388 association and dissociation curves of 223 unique molecular interactions with equilibrium dissociation constants ranging from 2 × 10⁻⁶ M to 2 × 10⁻⁹ M, and dissociation rate constants of approximately 6 s⁻¹ to 8.5 × 10⁻³ s⁻¹. Association rate constants were uniform across 3 TF families, ranging from 3.7 × 10⁶ M⁻¹ s⁻¹ to 9.6 × 10⁷ M⁻¹ s⁻¹, and are well below the diffusion limit. We expect that k-MITOMI will contribute to our quantitative understanding of biological systems and accelerate the development and characterization of engineered systems. PMID:23012409
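
    The relation between the rate constants and the equilibrium dissociation constant quoted above can be made concrete with a short sketch. It assumes simple 1:1 binding, which the record does not state explicitly, and the numerical values are hypothetical ones picked from inside the reported ranges, not measurements from the paper.

```python
import math


def dissociation_constant(k_off, k_on):
    """Equilibrium dissociation constant K_d = k_off / k_on, in M."""
    return k_off / k_on


def fraction_bound(t, conc, k_on, k_off):
    """Fraction of sites occupied at time t for 1:1 binding at ligand concentration conc.

    Assumes the ligand is in excess, so conc stays constant during association.
    """
    k_obs = k_on * conc + k_off  # observed relaxation rate, 1/s
    return (k_on * conc / k_obs) * (1.0 - math.exp(-k_obs * t))


# Hypothetical pair of rate constants within the ranges given in the abstract.
k_on = 5.0e6   # M^-1 s^-1
k_off = 5.0    # s^-1
k_d = dissociation_constant(k_off, k_on)  # 1e-6 M
```

    At a ligand concentration equal to `k_d`, `fraction_bound` approaches one half at long times, which is the usual reading of a dissociation constant.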

  3. Numerical investigation of two interacting parallel thruster-plumes and comparison to experiment

    NASA Astrophysics Data System (ADS)

    Grabe, Martin; Holz, André; Ziegenhagen, Stefan; Hannemann, Klaus

    2014-12-01

    Clusters of orbital thrusters are an attractive option to achieve graduated thrust levels and increased redundancy with available hardware, but the heavily under-expanded plumes of chemical attitude control thrusters placed in close proximity will interact, leading to a local amplification of downstream fluxes and of back-flow onto the spacecraft. The interaction of two similar, parallel, axi-symmetric cold-gas model thrusters has recently been studied in the DLR High-Vacuum Plume Test Facility STG under space-like vacuum conditions, employing a Patterson-type impact pressure probe with slot orifice. We reproduce a selection of these experiments numerically, and emphasise that a comparison of numerical results to the measured data is not straightforward. The signal of the probe used in the experiments must be interpreted according to the degree of rarefaction and local flow Mach number, and both vary dramatically throughout the flow-field. We present a procedure to reconstruct the probe signal by post-processing the numerically obtained flow-field data and show that agreement with the experimental results is then improved. Features of the investigated cold-gas thruster plume interaction are discussed on the basis of the numerical results.

  4. Parallel kinetic Monte Carlo simulation framework incorporating accurate models of adsorbate lateral interactions

    NASA Astrophysics Data System (ADS)

    Nielsen, Jens; d'Avezac, Mayeul; Hetherington, James; Stamatakis, Michail

    2013-12-01

    Ab initio kinetic Monte Carlo (KMC) simulations have been successfully applied for over two decades to elucidate the underlying physico-chemical phenomena on the surfaces of heterogeneous catalysts. These simulations necessitate detailed knowledge of the kinetics of elementary reactions constituting the reaction mechanism, and the energetics of the species participating in the chemistry. The information about the energetics is encoded in the formation energies of gas and surface-bound species, and the lateral interactions between adsorbates on the catalytic surface, which can be modeled at different levels of detail. The majority of previous works accounted for only pairwise-additive first nearest-neighbor interactions. More recently, cluster-expansion Hamiltonians incorporating long-range interactions and many-body terms have been used for detailed estimations of catalytic rate [C. Wu, D. J. Schmidt, C. Wolverton, and W. F. Schneider, J. Catal. 286, 88 (2012)]. In view of the increasing interest in accurate predictions of catalytic performance, there is a need for general-purpose KMC approaches incorporating detailed cluster expansion models for the adlayer energetics. We have addressed this need by building on the previously introduced graph-theoretical KMC framework, and we have developed Zacros, a FORTRAN2003 KMC package for simulating catalytic chemistries. To tackle the high computational cost in the presence of long-range interactions we introduce parallelization with OpenMP. We further benchmark our framework by simulating a KMC analogue of the NO oxidation system established by Schneider and co-workers [J. Catal. 286, 88 (2012)]. We show that taking into account only first nearest-neighbor interactions may lead to large errors in the prediction of the catalytic rate, whereas for accurate estimates thereof, one needs to include long-range terms in the cluster expansion.

  5. Parallel kinetic Monte Carlo simulation framework incorporating accurate models of adsorbate lateral interactions.

    PubMed

    Nielsen, Jens; d'Avezac, Mayeul; Hetherington, James; Stamatakis, Michail

    2013-12-14

    Ab initio kinetic Monte Carlo (KMC) simulations have been successfully applied for over two decades to elucidate the underlying physico-chemical phenomena on the surfaces of heterogeneous catalysts. These simulations necessitate detailed knowledge of the kinetics of elementary reactions constituting the reaction mechanism, and the energetics of the species participating in the chemistry. The information about the energetics is encoded in the formation energies of gas and surface-bound species, and the lateral interactions between adsorbates on the catalytic surface, which can be modeled at different levels of detail. The majority of previous works accounted for only pairwise-additive first nearest-neighbor interactions. More recently, cluster-expansion Hamiltonians incorporating long-range interactions and many-body terms have been used for detailed estimations of catalytic rate [C. Wu, D. J. Schmidt, C. Wolverton, and W. F. Schneider, J. Catal. 286, 88 (2012)]. In view of the increasing interest in accurate predictions of catalytic performance, there is a need for general-purpose KMC approaches incorporating detailed cluster expansion models for the adlayer energetics. We have addressed this need by building on the previously introduced graph-theoretical KMC framework, and we have developed Zacros, a FORTRAN2003 KMC package for simulating catalytic chemistries. To tackle the high computational cost in the presence of long-range interactions we introduce parallelization with OpenMP. We further benchmark our framework by simulating a KMC analogue of the NO oxidation system established by Schneider and co-workers [J. Catal. 286, 88 (2012)]. We show that taking into account only first nearest-neighbor interactions may lead to large errors in the prediction of the catalytic rate, whereas for accurate estimates thereof, one needs to include long-range terms in the cluster expansion. PMID:24329081

  6. Parallel kinetic Monte Carlo simulation framework incorporating accurate models of adsorbate lateral interactions

    SciTech Connect

    Nielsen, Jens; d'Avezac, Mayeul; Hetherington, James; Stamatakis, Michail

    2013-12-14

    Ab initio kinetic Monte Carlo (KMC) simulations have been successfully applied for over two decades to elucidate the underlying physico-chemical phenomena on the surfaces of heterogeneous catalysts. These simulations necessitate detailed knowledge of the kinetics of elementary reactions constituting the reaction mechanism, and the energetics of the species participating in the chemistry. The information about the energetics is encoded in the formation energies of gas and surface-bound species, and the lateral interactions between adsorbates on the catalytic surface, which can be modeled at different levels of detail. The majority of previous works accounted for only pairwise-additive first nearest-neighbor interactions. More recently, cluster-expansion Hamiltonians incorporating long-range interactions and many-body terms have been used for detailed estimations of catalytic rate [C. Wu, D. J. Schmidt, C. Wolverton, and W. F. Schneider, J. Catal. 286, 88 (2012)]. In view of the increasing interest in accurate predictions of catalytic performance, there is a need for general-purpose KMC approaches incorporating detailed cluster expansion models for the adlayer energetics. We have addressed this need by building on the previously introduced graph-theoretical KMC framework, and we have developed Zacros, a FORTRAN2003 KMC package for simulating catalytic chemistries. To tackle the high computational cost in the presence of long-range interactions we introduce parallelization with OpenMP. We further benchmark our framework by simulating a KMC analogue of the NO oxidation system established by Schneider and co-workers [J. Catal. 286, 88 (2012)]. We show that taking into account only first nearest-neighbor interactions may lead to large errors in the prediction of the catalytic rate, whereas for accurate estimates thereof, one needs to include long-range terms in the cluster expansion.

  7. 3D magnetospheric parallel hybrid multi-grid method applied to planet-plasma interactions

    NASA Astrophysics Data System (ADS)

    Leclercq, L.; Modolo, R.; Leblanc, F.; Hess, S.; Mancini, M.

    2016-03-01

    We present a new method to exploit multiple refinement levels within a 3D parallel hybrid model, developed to study planet-plasma interactions. This model is based on the hybrid formalism: ions are kinetically treated whereas electrons are considered as an inertia-less fluid. Generally, ions are represented by numerical particles whose size equals the volume of the cells. Particles that leave a coarse grid and subsequently enter a refined region are split into particles whose volume corresponds to the volume of the refined cells. The number of refined particles created from a coarse particle depends on the grid refinement rate. In order to conserve velocity distribution functions and to avoid calculations of average velocities, particles are not coalesced. Moreover, to ensure the constancy of particles' shape function sizes, the hybrid method is adapted to allow refined particles to move within a coarse region. Another innovation of this approach is the method developed to compute grid moments at interfaces between two refinement levels. Indeed, the hybrid method is adapted to accurately account for the special grid structure at the interfaces, avoiding any overlapping grid considerations. Some fundamental test runs were performed to validate our approach (e.g. quiet plasma flow, Alfvén wave propagation). Lastly, we also show a planetary application of the model, simulating the interaction between Jupiter's moon Ganymede and the Jovian plasma.

  8. Rotor-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Schlinker, R. H.; Amiet, R. K.

    1983-01-01

    A theoretical and experimental study was conducted to develop a validated first principles analysis for predicting noise generated by helicopter main-rotor shed vortices interacting with the tail rotor. The generalized prediction procedure requires a knowledge of the incident vortex velocity field, rotor geometry, and rotor operating conditions. The analysis includes compressibility effects, chordwise and spanwise noncompactness, and treats oblique intersections with the blade planform. Assessment of the theory involved conducting a model rotor experiment which isolated the blade-vortex interaction noise from other rotor noise mechanisms. An isolated tip vortex, generated by an upstream semispan airfoil, was convected into the model tail rotor. Acoustic spectra, pressure signatures, and directivity were measured. Since assessment of the acoustic prediction required a knowledge of the vortex properties, blade-vortex intersection angle, intersection station, vortex strength, and vortex core radius were documented. Ingestion of the vortex by the rotor was experimentally observed to generate harmonic noise and impulsive waveforms.

  9. Experimental Studies of the Interaction Between a Parallel Shear Flow and a Directionally-Solidifying Front

    NASA Technical Reports Server (NTRS)

    Zhang, Meng; Maxworthy, Tony

    1999-01-01

    It has long been recognized that flow in the melt can have a profound influence on the dynamics of a solidifying interface and hence the quality of the solid material. In particular, flow affects the heat and mass transfer, and causes spatial and temporal variations in the flow and melt composition. This results in a crystal with nonuniform physical properties. Flow can be generated by buoyancy, expansion or contraction upon phase change, and thermo-soluto capillary effects. In general, these flows cannot be avoided and can have an adverse effect on the stability of the crystal structures. This motivates crystal growth experiments in a microgravity environment, where buoyancy-driven convection is significantly suppressed. However, transient accelerations (g-jitter) caused by the acceleration of the spacecraft can affect the melt, while convection generated from effects other than buoyancy remains important. Rather than bemoan the presence of convection as a source of interfacial instability, Hurle in the 1960s suggested that flow in the melt, either forced or natural convection, might be used to stabilize the interface. Delves considered the imposition of both a parabolic velocity profile and a Blasius boundary layer flow over the interface. He concluded that fast stirring could stabilize the interface to perturbations whose wave vector is in the direction of the fluid velocity. Forth and Wheeler considered the effect of the asymptotic suction boundary layer profile. They showed that the effect of the shear flow was to generate travelling waves parallel to the flow with a speed proportional to the Reynolds number. There have been few quantitative, experimental works reporting on the coupling effect of fluid flow and morphological instabilities. Huang studied plane Couette flow over cells and dendrites. It was found that this flow could greatly enhance the planar stability and even induce the cell-planar transition. A rotating impeller was buried inside the sample cell, driven by an outside rotating magnet, in order to generate the flow. However, it appears that this was not a well-controlled flow and may also have been unsteady. In the present experimental study, we want to study how a forced parallel shear flow in a Hele-Shaw cell interacts with the directionally solidifying crystal interface. The comparison of experimental data shows that the parallel shear flow in a Hele-Shaw cell has a strong stabilizing effect on the planar interface by damping the existing initial perturbations. The flow also shows a stabilizing effect on the cellular interface by slightly reducing the exponential growth rate of cells. The left-right symmetry of cells is broken by the flow with cells tilting toward the incoming flow direction. The tilting angle increases with the velocity ratio. The experimental results are explained through the parallel flow effect on lateral solute transport. The phenomenon of cells tilting against the flow is consistent with the numerical result of Dantzig and Chao.

  10. Large-scale massively parallel atomistic simulations of short pulse laser interaction with metals

    NASA Astrophysics Data System (ADS)

    Wu, Chengping; Zhigilei, Leonid; Computational Materials Group Team

    2014-03-01

    Taking advantage of petascale supercomputing architectures, large-scale massively parallel atomistic simulations (10⁸-10⁹ atoms) are performed to study the microscopic mechanisms of short pulse laser interaction with metals. The results of the simulations reveal a complex picture of highly non-equilibrium processes responsible for material modification and/or ejection. At low laser fluences below the ablation threshold, fast melting and resolidification occur under conditions of extreme heating and cooling rates resulting in surface microstructure modification. At higher laser fluences in the spallation regime, the material is ejected by the relaxation of laser-induced stresses and proceeds through the nucleation, growth and percolation of multiple voids in the sub-surface region of the irradiated target. At a fluence of ~2.5 times the spallation threshold, the top part of the target reaches the conditions for an explosive decomposition into vapor and small droplets, marking the transition to the phase explosion regime of laser ablation. The dynamics of plume formation and the characteristics of the ablation plume are obtained from the simulations and compared with the results of time-resolved plume imaging experiments. Financial support for this work was provided by NSF (DMR-0907247 and CMMI-1301298) and AFOSR (FA9550-10-1-0541). Computational support was provided by the OLCF (MAT048) and XSEDE (TG-DMR110090).

  11. Interaction of a Rectangular Jet with a Flat-Plate Placed Parallel to the Flow

    NASA Technical Reports Server (NTRS)

    Zaman, K. B. M. Q.; Brown, C. A.; Bridges, J. A.

    2013-01-01

    An experimental study is carried out addressing the flowfield and radiated noise from the interaction of a large aspect ratio rectangular jet with a flat plate placed parallel to but away from the direct path of the jet. Sound pressure level spectra exhibit an increase in the noise levels for both the 'reflected' and 'shielded' sides of the plate relative to the free-jet case. Detailed cross-sectional distributions of flowfield properties obtained by hot-wire anemometry are documented for a low subsonic condition. Corresponding mean Mach number distributions obtained by Pitot-probe surveys are presented for high subsonic conditions. In the latter flow regime and for certain relative locations of the plate, a flow resonance accompanied by audible tones is encountered. Under the resonant condition the jet cross-section experiences an 'axis-switching' and flow visualization indicates the presence of an organized 'vortex street'. The trends of the resonant frequency variation with flow parameters exhibit some similarities to, but also marked differences with, corresponding trends of the well-known edgetone phenomenon.

  12. Highly scalable parallel implementation of turbulent collision of aerodynamically interacting cloud droplets

    NASA Astrophysics Data System (ADS)

    Parishani, Hossein; Ayala, Orlando; Wang, Lian-Ping; Rosa, Bogdan; Grabowski, Wojciech

    2011-11-01

    Hybrid direct numerical simulation (HDNS) has advanced our understanding of turbulent collision-coalescence of cloud droplets. In this approach, the background fluid turbulence is simulated by a pseudospectral method and disturbance flows of droplets are treated analytically. To better realize its potential on petascale computers with ~100,000 processors, here we implement and test a parallel implementation using two-dimensional domain decomposition. The purpose is to increase both the range of flow scales and the number of droplets realizable in the simulations, so the dependence of collision statistics on flow Reynolds number and droplet size can be explored. We expect that the 2D domain-decomposition HDNS code can be used to produce statistics of aerodynamically-interacting droplets with Taylor microscale flow Reynolds number R_λ up to ~1000 and a system of O(10⁷) polydisperse droplets. We will present the implementation details as well as results of turbulent collision statistics (e.g., collision kernel, radial distribution function, relative velocity statistics) of sedimenting cloud droplets from our latest high-resolution HDNS. Work supported by NSF and NCAR.

  13. Software tools for developing parallel applications. Part 2: Interactive control and performance tuning

    SciTech Connect

    Brown, J.; Geist, A.; Pancake, C.; Rover, D.

    1997-04-01

    This paper continues the discussion of parallel tool support with an overview of the current state of tools for runtime control and performance tuning. Each is discussed in terms of the programmer needs addressed, the extent to which representative current tools meet those needs, and what new levels of tool support are important if parallel computing is to become more widespread.

  14. Parallelization of the Flow Field Dependent Variation Scheme for Solving the Triple Shock/Boundary Layer Interaction Problem

    NASA Technical Reports Server (NTRS)

    Schunk, Richard Gregory; Chung, T. J.

    2001-01-01

    A parallelized version of the Flowfield Dependent Variation (FDV) Method is developed to analyze a problem of current research interest, the flowfield resulting from a triple shock/boundary layer interaction. Such flowfields are often encountered in the inlets of high speed air-breathing vehicles including the NASA Hyper-X research vehicle. In order to resolve the complex shock structure and to provide adequate resolution for boundary layer computations of the convective heat transfer from surfaces inside the inlet, models containing over 500,000 nodes are needed. Efficient parallelization of the computation is essential to achieving results in a timely manner. Results from a parallelization scheme, based upon multi-threading, as implemented on multiple processor supercomputers and workstations are presented.

  15. Parallelization of the Flow Field Dependent Variation Scheme for Solving the Triple Shock/Boundary Layer Interaction Problem

    NASA Technical Reports Server (NTRS)

    Schunk, Greg; Chung, T. J.

    1999-01-01

    A parallelized version of the Flowfield Dependent Variation (FDV) Method is developed to analyze a problem of current research interest, the flowfield resulting from a triple shock/boundary layer interaction. Such flowfields are often encountered in the inlets of high speed air-breathing vehicles including NASA's Hyper-X. In order to resolve the complex shock structure and to provide adequate resolution for boundary layer computations of the convective heat transfer from surfaces inside the inlet, models containing over 500,000 nodes are needed. Efficient parallelization of the computation is essential to obtaining the results in a timely manner. Results from different parallelization schemes, based upon multi-threading and message passing, as implemented on multiple processor supercomputers and on distributed workstations are compared.

  16. Parallel diffusion of energetic particles interacting with noisy reduced MHD turbulence

    NASA Astrophysics Data System (ADS)

    Reimer, A.; Shalchi, A.

    2016-03-01

    We investigate analytically parallel diffusion in noisy reduced magnetohydrodynamic (NRMHD) turbulence. We employ different theories such as quasi-linear theory, second-order quasi-linear theory, and the weakly non-linear theory to compute the parallel diffusion coefficient. Our analytical findings are compared with test-particle simulations performed previously. We demonstrate systematically that quasi-linear theory does not work for the turbulence model considered here because it provides an infinite parallel diffusion coefficient. The second-order theory, on the other hand, provides a finite parallel mean free path which is, however, too large. Only by using the weakly non-linear theory can we reproduce the simulations and, thus, we conclude that resonance broadening due to perpendicular diffusion is an important effect when it comes to particle transport along the mean field in NRMHD turbulence.

  17. Request queues for interactive clients in a shared file system of a parallel computing system

    DOEpatents

    Bent, John M.; Faibish, Sorin

    2015-08-18

    Interactive requests are processed from users of log-in nodes. A metadata server node is provided for use in a file system shared by one or more interactive nodes and one or more batch nodes. The interactive nodes comprise interactive clients to execute interactive tasks and the batch nodes execute batch jobs for one or more batch clients. The metadata server node comprises a virtual machine monitor; an interactive client proxy to store metadata requests from the interactive clients in an interactive client queue; a batch client proxy to store metadata requests from the batch clients in a batch client queue; and a metadata server to store the metadata requests from the interactive client queue and the batch client queue in a metadata queue based on an allocation of resources by the virtual machine monitor. The metadata requests can be prioritized, for example, based on one or more of a predefined policy and predefined rules.
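
    The queueing arrangement described in this record can be sketched in a few lines. This is only an illustration of the idea, not the patented implementation: the class name `MetadataServerSketch`, the per-round shares, and the round-based policy are all assumptions standing in for the "allocation of resources by the virtual machine monitor" and the "predefined policy" mentioned above.

```python
from collections import deque


class MetadataServerSketch:
    """Two client-side queues feeding one metadata queue (hypothetical model)."""

    def __init__(self, interactive_share=3, batch_share=1):
        self.interactive_queue = deque()  # filled by the interactive client proxy
        self.batch_queue = deque()        # filled by the batch client proxy
        self.metadata_queue = deque()     # consumed by the metadata server
        # Assumed policy: requests admitted per scheduling round from each queue.
        self.interactive_share = interactive_share
        self.batch_share = batch_share

    def submit_interactive(self, request):
        self.interactive_queue.append(request)

    def submit_batch(self, request):
        self.batch_queue.append(request)

    def schedule_round(self):
        """Move up to the configured share of requests from each queue."""
        for source, share in ((self.interactive_queue, self.interactive_share),
                              (self.batch_queue, self.batch_share)):
            for _ in range(share):
                if not source:
                    break
                self.metadata_queue.append(source.popleft())
```

    With the default shares, interactive requests are admitted three times as fast as batch requests whenever both queues are backed up, which is one simple way to keep log-in nodes responsive.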

  18. LPIC++ a parallel one-dimensional relativistic electromagnetic Particle-In-Cell code for simulating laser-plasma-interaction

    NASA Astrophysics Data System (ADS)

    Pfund, R. E. W.; Lichters, R.; Meyer-ter-Vehn, J.

    1998-02-01

    We report on a recently developed electromagnetic relativistic 1D3V (one spatial, three velocity dimensions) Particle-In-Cell code for simulating laser-plasma interaction at normal and oblique incidence. The code is written in C++ and easy to extend. The data structure is characterized by the use of chained lists for the grid cells as well as particles belonging to one cell. The parallel version of the code is based on PVM. It splits the grid into several spatial domains each belonging to one processor. Since particles can cross boundaries of cells as well as domains, the processor loads will generally change in time. This is counteracted by adjusting the domain sizes dynamically, for which the use of chained lists has proven to be very convenient. Moreover, an option for restarting the simulation from intermediate stages of the time evolution has been implemented even in the parallel version. The code will be published and distributed freely.
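
    The dynamic adjustment of domain sizes mentioned in this abstract can be illustrated as follows. The sketch is in Python rather than the C++ of LPIC++, and the function `balance_domains` and its prefix-sum strategy are assumptions made for illustration; the record does not say how the code chooses its domain boundaries.

```python
from bisect import bisect_left
from itertools import accumulate


def balance_domains(particles_per_cell, n_domains):
    """Split a 1D grid into contiguous domains with roughly equal particle counts.

    Returns n_domains + 1 cell indices; domain k covers cells
    bounds[k] .. bounds[k + 1] - 1.
    """
    cumulative = list(accumulate(particles_per_cell))
    total = cumulative[-1]
    bounds = [0]
    for k in range(1, n_domains):
        target = k * total / n_domains
        # First cell at which the running particle count reaches the target;
        # the domain boundary is placed just after that cell.
        bounds.append(bisect_left(cumulative, target) + 1)
    bounds.append(len(particles_per_cell))
    return bounds
```

    Calling the function again after particles have moved gives new boundaries, so cells near a boundary change owner. Linked lists of cells make that hand-over cheap, which appears to be the convenience the abstract refers to.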

  19. Interacting parallel pathways associate sounds with visual identity in auditory cortices.

    PubMed

    Ahveninen, Jyrki; Huang, Samantha; Ahlfors, Seppo P; Hämäläinen, Matti; Rossi, Stephanie; Sams, Mikko; Jääskeläinen, Iiro P

    2016-01-01

    Spatial and non-spatial information of sound events is presumably processed in parallel auditory cortex (AC) "what" and "where" streams, which are modulated by inputs from the respective visual-cortex subsystems. How these parallel processes are integrated to perceptual objects that remain stable across time and the source agent's movements is unknown. We recorded magneto- and electroencephalography (MEG/EEG) data while subjects viewed animated video clips featuring two audiovisual objects, a black cat and a gray cat. Adaptor-probe events were either linked to the same object (the black cat meowed twice in a row in the same location) or included a visually conveyed identity change (the black and then the gray cat meowed with identical voices in the same location). In addition to effects in visual (including fusiform, middle temporal or MT areas) and frontoparietal association areas, the visually conveyed object-identity change was associated with a release from adaptation of early (50-150 ms) activity in posterior ACs, spreading to left anterior ACs at 250-450 ms in our combined MEG/EEG source estimates. Repetition of events belonging to the same object resulted in increased theta-band (4-8 Hz) synchronization within the "what" and "where" pathways (e.g., between anterior AC and fusiform areas). In contrast, the visually conveyed identity changes resulted in distributed synchronization at higher frequencies (alpha and beta bands, 8-32 Hz) across different auditory, visual, and association areas. The results suggest that sound events become initially linked to perceptual objects in posterior AC, followed by modulations of representations in anterior AC. Hierarchical what and where pathways seem to operate in parallel after repeating audiovisual associations, whereas the resetting of such associations engages a distributed network across auditory, visual, and multisensory areas. PMID:26419388

  20. MPI parallelization of Vlasov codes for the simulation of nonlinear laser-plasma interactions

    NASA Astrophysics Data System (ADS)

    Savchenko, V.; Won, K.; Afeyan, B.; Decyk, V.; Albrecht-Marc, M.; Ghizzo, A.; Bertrand, P.

    2003-10-01

    The simulation of optical mixing driven KEEN waves [1] and electron plasma waves [1] in laser-produced plasmas requires nonlinear kinetic models and massive parallelization. We use Message Passing Interface (MPI) libraries and Appleseed [2] to solve the Vlasov-Poisson system of equations on an 8 node dual processor Mac G4 cluster. We use the semi-Lagrangian time splitting method [3]. It requires only row-column exchanges in the global data redistribution, minimizing the total number of communications between processors. Recurrent communication patterns for 2D FFTs involve global transposition. In the Vlasov-Maxwell case, we use splitting into two 1D spatial advections and a 2D momentum advection [4]. Discretized momentum advection equations have a double loop structure with the outer index being assigned to different processors. We adhere to a code structure with separate routines for calculations and data management for parallel computations. [1] B. Afeyan et al., IFSA 2003 Conference Proceedings, Monterey, CA [2] V. K. Decyk, Computers in Physics, 7, 418 (1993) [3] Sonnendrucker et al., JCP 149, 201 (1998) [4] Begue et al., JCP 151, 458 (1999)
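
    The building block of a semi-Lagrangian split scheme is a 1D advection step that traces each grid point back along its characteristic and interpolates the old solution there. The sketch below shows one such step under simplifying assumptions that are not taken from the record: a constant advection velocity, a periodic grid, and linear interpolation (chosen here for brevity; it is more diffusive than higher-order interpolation). The function name `semi_lagrangian_step` is hypothetical.

```python
import math


def semi_lagrangian_step(f, velocity, dx, dt):
    """One step of df/dt + velocity * df/dx = 0 on a periodic grid.

    f is a list of grid values; the new value at node i is the old solution
    evaluated at the departure point x_i - velocity * dt.
    """
    n = len(f)
    shift = velocity * dt / dx            # displacement in units of grid cells
    new_f = [0.0] * n
    for i in range(n):
        departure = i - shift             # foot of the characteristic, in cells
        left = math.floor(departure)      # grid node just left of the foot
        weight = departure - left         # fractional distance past that node
        new_f[i] = (1.0 - weight) * f[left % n] + weight * f[(left + 1) % n]
    return new_f
```

    In a time-splitting scheme of the kind the abstract cites, steps like this one are applied alternately along the spatial and momentum directions, and each 1D sweep can be distributed over processors by the index of the other dimension.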

  1. A Lightweight Remote Parallel Visualization Platform for Interactive Massive Time-varying Climate Data Analysis

    NASA Astrophysics Data System (ADS)

    Li, J.; Zhang, T.; Huang, Q.; Liu, Q.

    2014-12-01

    Today's climate datasets are characterized by large volume, a high degree of spatiotemporal complexity, and rapid evolution over time. As visualizing large-volume distributed climate datasets is computationally intensive, traditional desktop-based visualization applications fail to handle the computational load. Recently, scientists have developed remote visualization techniques to address this computational issue. Remote visualization techniques usually leverage server-side parallel computing capabilities to perform visualization tasks and deliver visualization results to clients over the network. In this research, we aim to build a remote parallel visualization platform for visualizing and analyzing massive climate data. Our visualization platform was built on ParaView, which is one of the most popular open source remote visualization and analysis applications. To further enhance the scalability and stability of the platform, we have employed cloud computing techniques to support the deployment of the platform. In this platform, all climate datasets are regular grid data stored in NetCDF format. Three types of data access methods are supported in the platform: accessing remote datasets provided by OPeNDAP servers, accessing datasets hosted on the web visualization server, and accessing local datasets. Regardless of the data access method, all visualization tasks are completed on the server side to reduce the workload of clients. As a proof of concept, we have implemented a set of scientific visualization methods to show the feasibility of the platform. Preliminary results indicate that the framework can address the computational limitations of desktop-based visualization applications.

  2. Simulation of the Quasi-Monoenergetic Protons Generation by Parallel Laser Pulses Interaction with Foils

    NASA Astrophysics Data System (ADS)

    Wang, Wei-Quan; Yin, Yan; Zou, De-Bin; Yu, Tong-Pu; Yang, Xiao-Hu; Xu, Han; Yu, Ming-Yang; Ma, Yan-Yun; Zhuo, Hong-Bin; Shao, Fu-Qiu

    2014-11-01

    A new scheme of radiation pressure acceleration for generating high-quality protons by using two overlapping-parallel laser pulses is proposed. Particle-in-cell simulation shows that the overlapping of two pulses with identical Gaussian profiles in space and trapezoidal profiles in the time domain can result in a composite light pulse with a spatial profile suitable for stable acceleration of protons to high energies. At ~2.46 × 10^21 W/cm^2 intensity of the combination light pulse, a quasi-monoenergetic proton beam with peak energy ~200 MeV/nucleon, energy spread <15%, and divergence angle <4° is obtained, which is appropriate for tumor therapy. The proton beam quality can be controlled by adjusting the incidence points of two laser pulses.

  3. Parallel adaptive fluid-structure interaction simulation of explosions impacting on building structures

    SciTech Connect

    Deiterding, Ralf; Wood, Stephen L

    2013-01-01

    We pursue a level set approach to couple an Eulerian shock-capturing fluid solver with space-time refinement to an explicit solid dynamics solver for large deformations and fracture. The coupling algorithms, which consider recursively finer fluid time steps as well as overlapping solver updates, are discussed in detail. Our ideas are implemented in the AMROC adaptive fluid solver framework and are used for effective fluid-structure coupling to the general purpose solid dynamics code DYNA3D. Besides simulations verifying the coupled fluid-structure solver and assessing its parallel scalability, the detailed structural analysis of a reinforced concrete column under blast loading and the simulation of a prototypical blast explosion in a realistic multistory building are presented.

  4. RKKY Interaction and the Nature of the Ground State of Double Dots in Parallel

    SciTech Connect

    Kulkarni, M.; Konik, R.

    2011-06-23

    We argue through a combination of slave-boson mean-field theory and the Bethe ansatz that the ground state of closely spaced double quantum dots in parallel coupled to a single effective channel is a Fermi liquid. We do so by studying the dots' conductance, impurity entropy, and spin correlation. In particular, we find that the zero-temperature conductance is characterized by the Friedel sum rule, a hallmark of Fermi-liquid physics, and that the impurity entropy vanishes in the limit of zero temperature, indicating that the ground state is a singlet. This conclusion is in opposition to a number of numerical renormalization-group studies. We suggest a possible reason for the discrepancy.

  5. Study of the parallel-plate EMP simulator and the simulator-obstacle interaction. Final technical report

    SciTech Connect

    Gedney, S.D.

    1990-12-01

    The Parallel-Plate Bounded-Wave EMP Simulator is typically used to test the vulnerability of electronic systems to the electromagnetic pulse (EMP) produced by a high altitude nuclear burst by subjecting the systems to a simulated EMP environment. However, when large test objects are placed within the simulator for investigation, the desired EMP environment may be affected by the interaction between the simulator and the test object. This simulator/obstacle interaction can be attributed to the following phenomena: (1) mutual coupling between the test object and the simulator, (2) fringing effects due to the finite width of the conducting plates of the simulator, and (3) multiple reflections between the object and the simulator's tapered end-sections. When the interaction is significant, the measurement of currents coupled into the system may not accurately represent those induced by an actual EMP. To better understand the problem of simulator/obstacle interaction, a dynamic analysis of the fields within the parallel-plate simulator is presented. The fields are computed using a moment method solution based on a wire mesh approximation of the conducting surfaces of the simulator. The fields within an empty simulator are found to be predominantly transverse electromagnetic (TEM) for frequencies within the simulator's bandwidth, properly simulating the properties of the EMP propagating in free space. However, when a large test object is placed within the simulator, it is found that the currents induced on the object can be quite different from those on an object situated in free space. A comprehensive study of the mechanisms contributing to this deviation is presented.

  6. Gamma ray bursts from comet neutron star magnetosphere interaction, field twisting and E sub parallel formation

    SciTech Connect

    Colgate, S.A.

    1990-01-01

    Consider the problem of a comet in a collision trajectory with a magnetized neutron star. The question addressed in this paper is whether the comet interacts with the magnetic field strongly enough to be captured at a large radius or whether, in general, the comet will escape a magnetized neutron star. 6 refs., 4 figs.

  7. Tn-seq; high-throughput parallel sequencing for fitness and genetic interaction studies in microorganisms

    PubMed Central

    van Opijnen, Tim; Bodi, Kip L.; Camilli, Andrew

    2009-01-01

    Biological pathways are structured in complex networks of interacting genes. Solving the architecture of such networks may provide valuable information, such as how microorganisms cause disease. Here we present a method (Tn-seq) for accurately determining quantitative genetic interactions on a genome-wide scale in microorganisms. Tn-seq is based on the assembly of a saturated Mariner transposon insertion library. After library selection, changes in frequency of each insertion mutant are determined by sequencing of the flanking regions en masse. These changes are used to calculate each mutant’s fitness. Fitness was determined for each gene of the gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis. A genome-wide screen for genetic interactions identified both alleviating and aggravating interactions that could be further divided into seven distinct categories. Due to the wide activity of the Mariner transposon, Tn-seq has the potential to contribute to the exploration of complex pathways across many different species. PMID:19767758
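
    As a rough illustration of how a change in insertion frequency might be turned into a fitness value, the sketch below compares the growth of one mutant with the growth of the rest of the population over the selection period. The formula, the function name `insertion_fitness`, and the `expansion` parameter are illustrative assumptions; the abstract does not spell out the exact calculation.

```python
import math


def insertion_fitness(freq_before: float, freq_after: float, expansion: float) -> float:
    """Relative fitness of one insertion mutant (illustrative assumption).

    freq_before, freq_after: fraction of sequencing reads mapped to this
        insertion before and after selection (strictly between 0 and 1).
    expansion: fold expansion of the whole population during selection (> 1).
    """
    if not (0.0 < freq_before < 1.0 and 0.0 < freq_after < 1.0):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    if expansion <= 1.0:
        raise ValueError("expansion must be greater than 1")

    # Log fold-growth of the mutant itself.
    mutant_growth = math.log(freq_after * expansion / freq_before)
    # Log fold-growth of everything else in the library.
    rest_growth = math.log((1.0 - freq_after) * expansion / (1.0 - freq_before))
    if rest_growth == 0.0:
        raise ValueError("rest of the population did not grow; ratio undefined")
    return mutant_growth / rest_growth
```

    Under this assumed definition, a mutant whose frequency is unchanged has a fitness of 1, and a mutant that becomes rarer during selection has a fitness below 1.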

  8. Interaction between a laminar starting immersed micro-jet and a parallel wall

    NASA Astrophysics Data System (ADS)

    Cabaleiro, Juan Martin; Laborde, Cecilia; Artana, Guillermo

    2015-01-01

    In the present work, we study the starting transient of an immersed micro-jet in close vicinity to a solid wall parallel to its axis. The experiments concern laminar jets (Re < 200) issuing from a 100 μm internal tip diameter glass micro-pipette. The effect of the confinement was studied by placing the micro-pipette at different distances from the wall. The characterization of the jet was carried out by visualizations on which the morphology of the vortex head and trajectories was analyzed. Numerical simulations were used as a complementary tool for the analysis. The jet remains stable for very long distances away from the tip, allowing for a similarity analysis. The self-similar behavior of the starting jet has been studied in terms of the frontline position with time. A symmetric and a wall-dominated regime could be identified. The starting jet in the wall-dominated regime, and in the symmetric regime as well, develops a self-similar behavior that has a relatively rapid loss of memory of the preceding condition of the flow. The scalings for both regimes are those that correspond to viscous-dominated flows.

  9. The grid-based fast multipole method--a massively parallel numerical scheme for calculating two-electron interaction energies.

    PubMed

    Toivanen, Elias A; Losilla, Sergio A; Sundholm, Dage

    2015-12-21

    Algorithms and working expressions for a grid-based fast multipole method (GB-FMM) have been developed and implemented. The computational domain is divided into cubic subdomains, organized in a hierarchical tree. The contribution to the electrostatic interaction energies from pairs of neighboring subdomains is computed using numerical integration, whereas the contributions from subdomains that are further apart are obtained using multipole expansions. The multipole moments of the subdomains are obtained by numerical integration. Linear scaling is achieved by translating and summing the multipoles according to the tree structure, such that each subdomain interacts with a number of subdomains that is almost independent of the size of the system. To compute electrostatic interaction energies of neighboring subdomains, we employ an algorithm which performs efficiently on general purpose graphics processing units (GPGPU). Calculations using one CPU for the FMM part and 20 GPGPUs consisting of tens of thousands of execution threads for the numerical integration algorithm show the scalability and parallel performance of the scheme. For calculations on systems consisting of Gaussian functions (α = 1) distributed as fullerenes from C20 to C720, the total computation time and relative accuracy (ppb) are independent of the system size. PMID:26006111
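
    The split between neighboring and distant subdomains described above can be sketched on a single level of cubic boxes. This is a simplified illustration only: the names `are_neighbors` and `partition_interactions` are hypothetical, boxes are taken to be neighbors when they touch at a face, edge, or corner (an assumption), and a real fast multipole method handles the far field hierarchically through the tree rather than pair by pair.

```python
from itertools import product


def are_neighbors(box_a, box_b):
    """True if two distinct boxes touch at a face, edge, or corner.

    Boxes are integer (i, j, k) indices on the same level of the tree.
    """
    return box_a != box_b and all(abs(a - b) <= 1 for a, b in zip(box_a, box_b))


def partition_interactions(boxes_per_side):
    """Split all unordered box pairs into near-field and far-field lists.

    Near-field pairs would be handled by direct numerical integration,
    far-field pairs by multipole expansions.
    """
    boxes = list(product(range(boxes_per_side), repeat=3))
    near, far = [], []
    for index, box_a in enumerate(boxes):
        for box_b in boxes[index + 1:]:
            if are_neighbors(box_a, box_b):
                near.append((box_a, box_b))
            else:
                far.append((box_a, box_b))
    return near, far


def max_neighbors(boxes_per_side):
    """Largest number of near-field partners of any single box."""
    boxes = list(product(range(boxes_per_side), repeat=3))
    return max(sum(are_neighbors(a, b) for b in boxes) for a in boxes)
```

    In this sketch no box has more than 26 near-field partners however many boxes there are, which is the property that keeps the cost of the direct-integration part proportional to the number of subdomains.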

  10. Parallel Three-Dimensional Computation of Fluid Dynamics and Fluid-Structure Interactions of Ram-Air Parachutes

    NASA Technical Reports Server (NTRS)

    Tezduyar, Tayfun E.

    1998-01-01

    This is a final report as far as our work at University of Minnesota is concerned. The report describes our research progress and accomplishments in development of high performance computing methods and tools for 3D finite element computation of aerodynamic characteristics and fluid-structure interactions (FSI) arising in airdrop systems, namely ram-air parachutes and round parachutes. This class of simulations involves complex geometries, flexible structural components, deforming fluid domains, and unsteady flow patterns. The key components of our simulation toolkit are a stabilized finite element flow solver, a nonlinear structural dynamics solver, an automatic mesh moving scheme, and an interface between the fluid and structural solvers; all of these have been developed within a parallel message-passing paradigm.

  11. DNS of hydrodynamically interacting droplets in turbulent clouds: Parallel implementation and scalability analysis using 2D domain decomposition

    NASA Astrophysics Data System (ADS)

    Ayala, Orlando; Parishani, Hossein; Chen, Liu; Rosa, Bogdan; Wang, Lian-Ping

    2014-12-01

    The study of turbulent collision of cloud droplets requires simultaneous consideration of the transport by background air turbulence (i.e., geometric collision rate) and the influence of droplet disturbance flows (i.e., collision efficiency). In recent years, this multiscale problem has been addressed through a hybrid direct numerical simulation (HDNS) approach (Ayala et al., 2007). This approach, while currently the only viable tool to quantify the effects of air turbulence on collision statistics, is computationally expensive. In order to extend the HDNS approach to higher flow Reynolds numbers, here we developed a highly scalable implementation of the approach using 2D domain decomposition. The scalability of the parallel implementation was studied using several parallel computers, at 512^3 and 1024^3 grid resolutions with O(10^6)-O(10^7) droplets. It was found that the execution time scaled with the number of processors almost linearly until it saturated and deteriorated due to communication latency issues. To better understand the scalability, we developed a complexity analysis by partitioning the execution tasks into computation, communication, and data copy. Using this complexity analysis, we were able to predict the scalability performance of our parallel code. Furthermore, the theory was used to estimate the maximum number of processors below which the approximately linear scalability is sustained. We theoretically showed that we could efficiently solve problems of up to 8192^3 with O(100,000) processors. The complexity analysis revealed that the pseudo-spectral simulation of background turbulent flow for a dilute droplet suspension typical of cloud conditions typically takes about 80% of the total execution time, except when the droplets are small (less than 5 μm in a flow with energy dissipation rate of 400 cm^2/s^3 and liquid water content of 1 g/m^3), in which case the particle-particle hydrodynamic interactions become the bottleneck. The complexity analysis was also used to explore some alternative methods to handle FFT calculations within the flow simulation and to advance droplets less than 5 μm in radius, for better computational efficiency. Finally, preliminary results are reported to shed light on the Reynolds number-dependence of the collision kernel of non-interacting droplets.
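
    The idea of splitting execution time into computation, communication, and data copy can be illustrated with a toy cost model. This is not the authors' analysis: every coefficient below is hypothetical, and the assumption that each process exchanges messages with about sqrt(p) partners in each of two transpose steps is a simplification of 2D domain decomposition chosen only to show how a latency term eventually outweighs the gains from adding processors.

```python
import math


def step_time(p, comp=1000.0, copy=100.0, volume=200.0, latency=0.01):
    """Toy wall-clock time per step on p processes (arbitrary time units).

    comp, copy, volume: single-process cost of computation, data copy, and
        bandwidth-bound data transfer; each is assumed to shrink as 1/p.
    latency: fixed cost per message (hypothetical value).
    """
    computation = comp / p
    data_copy = copy / p
    # Assumed message count: about sqrt(p) partners in each of two transposes.
    messages = 2.0 * math.sqrt(p)
    communication = volume / p + latency * messages
    return computation + data_copy + communication


def best_process_count(candidates, **costs):
    """Candidate process count with the smallest modeled step time."""
    return min(candidates, key=lambda p: step_time(p, **costs))


if __name__ == "__main__":
    for p in (1, 64, 1024, 2048, 4096, 16384):
        print(p, round(step_time(p), 4))
```

    With these made-up coefficients the modeled time falls almost in proportion to 1/p at small process counts and starts rising again once the latency term dominates, mirroring the saturation described in the abstract.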

  12. Interaction of parallel strike-slip faults and a characteristic distance in the spatial distribution of active faults

    NASA Astrophysics Data System (ADS)

    Kato, Naoyuki; Lei, Xinglin

    2001-01-01

    A numerical simulation of the activities of many parallel strike-slip faults is performed to explore the effect of the interaction of fault slip on the spatial distribution of active faults. In the model, a large number of faults with random strengths are embedded in an elastic layer (lithosphere) over a Maxwell-type viscoelastic half-space (asthenosphere) and shear loading of a constant strain rate is applied. When slip takes place on a model fault, shear stress is decreased around the fault and then recovered with time due to the viscoelastic response of the asthenosphere. The decrease in shear stress prohibits the occurrence of another earthquake around the slipped fault, resulting in the existence of a characteristic distance between active faults. This characteristic distance is found to be controlled by the thickness of the elastic layer, the strain rate and the viscoelastic relaxation time. The density of simulated active faults increases with the strain rate, consistent with observations of active faults in Japan. Furthermore, the present simulation result may explain the characteristic distance which breaks the fractal structure of the spatial distribution of active faults in Japan, which was discovered by Lei & Kusunose (1999).

  13. Quantitative analysis of RNA-protein interactions on a massively parallel array for mapping biophysical and evolutionary landscapes

    PubMed Central

    Buenrostro, Jason D.; Chircus, Lauren M.; Araya, Carlos L.; Layton, Curtis J.; Chang, Howard Y.; Snyder, Michael P.; Greenleaf, William J.

    2015-01-01

    RNA-protein interactions drive fundamental biological processes and are targets for molecular engineering, yet quantitative and comprehensive understanding of the sequence determinants of affinity remains limited. Here we repurpose a high-throughput sequencing instrument to quantitatively measure binding and dissociation of MS2 coat protein to >10^7 RNA targets generated on a flow-cell surface by in situ transcription and inter-molecular tethering of RNA to DNA. We decompose the binding energy contributions from primary and secondary RNA structure, finding that differences in affinity are often driven by sequence-specific changes in association rates. By analyzing the biophysical constraints and modeling mutational paths describing the molecular evolution of MS2 from low- to high-affinity hairpins, we quantify widespread molecular epistasis, and a long-hypothesized structure-dependent preference for G:U base pairs over C:A intermediates in evolutionary trajectories. Our results suggest that quantitative analysis of RNA on a massively parallel array (RNAMaP) relationships across molecular variants. PMID:24727714

  14. A fluid-structure interaction model to characterize bone cell stimulation in parallel-plate flow chamber systems

    PubMed Central

    Vaughan, T. J.; Haugh, M. G.; McNamara, L. M.

    2013-01-01

    Bone continuously adapts its internal structure to accommodate the functional demands of its mechanical environment, and strain-induced flow of interstitial fluid is believed to be the primary mediator of mechanical stimuli to bone cells in vivo. In vitro investigations have shown that bone cells produce important biochemical signals in response to fluid flow applied using parallel-plate flow chamber (PPFC) systems. However, the exact mechanical stimulus experienced by the cells within these systems remains unclear. Fully understanding this behaviour is a challenging multi-physics problem involving the interaction between deformable cellular structures and adjacent fluid flows. In this study, we use a fluid-structure interaction computational approach to investigate the nature of the mechanical stimulus being applied to a single osteoblast cell under fluid flow within a PPFC system. The analysis decouples the contributions of pressure and shear stress to cellular deformation and for the first time highlights that cell strain under flow is dominated by the pressure in the PPFC system rather than the applied shear stress. Furthermore, it was found that strains imparted on the cell membrane were relatively low, whereas significant strain amplification occurred at the cell-substrate interface. These results suggest that strain transfer through focal attachments at the base of the cell is the primary mediator of mechanical signals to the cell under flow in a PPFC system. Such information is vital in order to correctly interpret biological responses of bone cells under in vitro stimulation and elucidate the mechanisms associated with mechanotransduction in vivo. PMID:23365189

  15. Development of a Multi-Grids Approach into a Parallelized Hybrid Model to Describe Ganymede's Interaction with the Jovian Plasma

    NASA Astrophysics Data System (ADS)

    Leclercq, L.; Modolo, R.; Leblanc, F.; Hess, S. L.; Andre, N.

    2014-12-01

    Ganymede is the only satellite that has its own magnetosphere, which is embedded in the Jovian magnetosphere (Kivelson et al. 1996). This peculiar interaction has been investigated by means of a 3D parallel multi-species hybrid model based on a CAM-CL algorithm (Mathews et al. 1994). In this formalism, ions have a kinetic description whereas electrons are considered as an inertialess fluid which ensures the neutrality of the plasma and contributes to the total current and electronic pressure. Maxwell's equations are solved to compute the temporal evolution of the electromagnetic field. Hybrid simulations are performed on a uniform Cartesian grid with a spatial resolution of about 240 km. Our results are globally consistent with other models and Galileo measurements. Nevertheless, our description of the magnetopause and the ionosphere is not satisfactory because of the low spatial resolution. Indeed, we want to describe scale heights of 125 km in the ionosphere, whereas the best spatial resolution that we are able to use is about 240 km. Therefore, in order to obtain more efficient and relevant results, it is necessary to refine the grid. To this end, we are introducing a multi-grid approach in order to refine the spatial resolution by a factor of 2 (~120 km) near Ganymede. The creation of a finer mesh in the simulation grid requires special computations at the interfaces between the two grids, both for the calculation of moments, such as charge density or current, and for the computation of electromagnetic fields. Moreover, the parallelization of the code, based on domain decomposition methods, requires careful treatment of boundary conditions. In the hybrid model, macroparticles, which represent a cloud of physical particles, have a volume equal to that of a grid cell. Macroparticles entering the higher spatial resolution region are therefore split into smaller macroparticles whose volume corresponds to that of a cell of the finer mesh. The improvement of the spatial resolution in the hybrid model will also allow us to couple the results of this model with those of our 3D multi-species exospheric model (Turc et al. 2014) in a test-particle model that describes the ionosphere of Ganymede. Basic tests and validation results of the multi-grid approach are presented.
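
    The splitting of a macroparticle at the boundary of the refined region can be sketched as follows. The abstract gives only the refinement factor of 2 and the rule that a macroparticle's volume matches a grid cell; the choice of eight equal children placed at the centers of the parent cloud's octants, and the names `Macroparticle` and `split_macroparticle`, are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Macroparticle:
    position: tuple  # (x, y, z) of the cloud center, in km
    velocity: tuple  # (vx, vy, vz), shared by all physical particles in the cloud
    weight: float    # number of physical particles represented


def split_macroparticle(parent, coarse_dx):
    """Split one coarse-grid macroparticle into 8 fine-grid macroparticles.

    coarse_dx: coarse cell size. Each child cloud has half the parent's edge
    length, so its center sits a quarter cell from the parent center on every axis.
    """
    offset = coarse_dx / 4.0
    x, y, z = parent.position
    children = []
    for sx, sy, sz in product((-1.0, 1.0), repeat=3):
        children.append(
            Macroparticle(
                position=(x + sx * offset, y + sy * offset, z + sz * offset),
                velocity=parent.velocity,     # same velocity keeps total momentum unchanged
                weight=parent.weight / 8.0,   # equal shares keep total charge unchanged
            )
        )
    return children
```

    With this choice the children carry the same total weight, momentum, and center of mass as the parent, so moments such as charge density and current stay consistent across the grid interface.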

  16. Parallel programming

    SciTech Connect

    Perrott, R.H.

    1987-01-01

    This book examines the major hardware developments and programming concepts that have influenced the introduction of parallelism. It provides an overview of some of the features of specific machine architectures and their interaction with developments in software technology. The independent areas of multiprocessor and distributed programming, programming array and vector processors, and data flow programming are also examined in detail. Topics covered include: hardware technology developments; software technology developments; mutual exclusion; process synchronization; message passing primitives; Modula-2; Pascal Plus; Ada; Occam: a distributed computing language; Cray-1 FORTRAN translator: CFT; CDC Cyber FORTRAN; Illiac IV CFD FORTRAN; distributed array processor FORTRAN; Actus: a Pascal-based language; data flow programming.

  17. Computation of interactional aerodynamics for noise prediction of heavy lift rotorcraft

    NASA Astrophysics Data System (ADS)

    Hennes, Christopher C.

    Many computational tools are used when developing a modern helicopter. As the design space is narrowed, more accurate and time-intensive tools are brought to bear. These tools are used to determine the effect of a design decision on the performance, handling, stability and efficiency of the aircraft. One notable parameter left out of this process is acoustics. This is due in part to the difficulty in making useful acoustics calculations that reveal the differences between various design configurations. This thesis presents a new approach designed to bridge the gap in prediction capability between fast but low-fidelity Lagrangian particle methods, and slow but high-fidelity Eulerian computational fluid dynamics simulations. A multi-pronged approach is presented. First, a simple flow solver using well-understood and tested flow solution methodologies is developed specifically to handle bodies in arbitrary motion. To this basic flow solver two new technologies are added. The first is an Immersed Boundary technique designed to be tolerant of geometric degeneracies and low-resolution grids. This new technique allows easy inclusion of complex fuselage geometries at minimal computational cost, improving the ability of a solver to capture the complex interactional aerodynamic effects expected in modern rotorcraft design. The second new technique is an extension of a concept from flow visualization where the motion of tip vortices is tracked through the solution using massless particles convecting with the local flow. In this extension of that concept, the particles maintain knowledge of the expected and actual vortex strength. As a post-processing step, when the acoustic calculations are made, these particles are used to augment the loading noise calculation and reproduce the highly impulsive character of blade-vortex interaction noise. In combination, these new techniques yield a significant improvement to the state of the art in rotorcraft blade-vortex interaction noise prediction.

  18. Prediction of BVI noise patterns and correlation with wake interaction locations

    NASA Astrophysics Data System (ADS)

    Marcolini, Michael A.; Martin, Ruth M.; Lorber, Peter F.; Egolf, T. A.

    High resolution fluctuating airloads data were acquired during a test of a contemporary design United Technologies model rotor in the Duits-Nederlandse Windtunnel (DNW). The airloads are used as input to the noise prediction program WOPWOP, in order to predict the blade-vortex interaction (BVI) noise field on a large plane below the rotor. Trends of predicted advancing and retreating side BVI noise levels and directionality as functions of flight condition are presented. The measured airloads have been analyzed to determine the BVI locations on the blade surface, and are used to interpret the predicted BVI noise radiation patterns. Predicted BVI locations are obtained using the free wake model in CAMRAD/JA, the UTRC Generalized Forward Flight Distorted Wake Model, and the UTRC FREEWAKE analysis. These predicted BVI locations are compared with those obtained from the measured pressure data.

  19. Prediction of BVI noise patterns and correlation with wake interaction locations

    NASA Technical Reports Server (NTRS)

    Marcolini, Michael A.; Martin, Ruth M.; Lorber, Peter F.; Egolf, T. A.

    1992-01-01

    High resolution fluctuating airloads data were acquired during a test of a contemporary design United Technologies model rotor in the Duits-Nederlandse Windtunnel (DNW). The airloads are used as input to the noise prediction program WOPWOP, in order to predict the blade-vortex interaction (BVI) noise field on a large plane below the rotor. Trends of predicted advancing and retreating side BVI noise levels and directionality as functions of flight condition are presented. The measured airloads have been analyzed to determine the BVI locations on the blade surface, and are used to interpret the predicted BVI noise radiation patterns. Predicted BVI locations are obtained using the free wake model in CAMRAD/JA, the UTRC Generalized Forward Flight Distorted Wake Model, and the UTRC FREEWAKE analysis. These predicted BVI locations are compared with those obtained from the measured pressure data.

  20. Parallel computers

    SciTech Connect

    Treleaven, P.

    1989-01-01

    This book presents an introduction to object-oriented, functional, and logic parallel computing on which the fifth generation of computer systems will be based. Coverage includes concepts for parallel computing languages, a parallel object-oriented system (DOOM) and its language (POOL), an object-oriented multilevel VLSI simulator using POOL, and implementation of lazy functional languages on parallel architectures.

  1. Targeted Tuning of Interactive Forces by Engineering of Molecular Bonds in Series and Parallel Using Peptide-Based Adhesives.

    PubMed

    Utzig, Thomas; Stock, Philipp; Raman, Sangeetha; Valtiner, Markus

    2015-10-13

    Polymer-mediated adhesion plays a major role for both technical glues and biological processes like self-assembly or biorecognition. In contrast to engineering systems, adhesive strength in biological systems is precisely tuned via well-adjusted arrangement of individual bonds. How adhesion may be engineered by arrangement of individual bonds is, however, not yet well understood. Here we show how the number of bonds in series and parallel can significantly influence adhesion forces using specifically designed surface-bridging peptides. We directly measure how adhesion forces between -COOH and -NH2 functionalized surfaces across aqueous media vary as a function of the number of bonds in parallel. We also introduce surface-bridging peptide sequences that are similarly end-functionalized with amines and carboxylic acid. Compared to single molecular junctions, adhesive strength mediated by these surface-bridging peptides decreases by a factor of 2 for adhesive junctions that consist of two acid/base bonds in series. Furthermore, adhesive strength varies with the density of bonds in parallel. For dense systems, we observe that the formation of a bridging peptide monolayer is sterically hindered and therefore adhesion is further reduced significantly by 20%. Our results unravel how the arrangement of individual bonds in an adhesive junction allows for a wide tuning of adhesive strength on the basis of utilizing just one single specific bond. As such, for peptide adhesives it is essential to consider bonds in parallel in a wide range of applications where both high adhesion and triggered release of adhesive bonds are required. PMID:26382013

  2. A Parallel Code for Lifetime Simulations in Hadron Storage Rings in the Presence of Parasitic Beam-Beam Interactions

    SciTech Connect

    Kabel, A.C.; Cai, Y.; Erdelyi, B.; Sen, T.; Xiao, M.; /SLAC /Fermilab

    2008-03-17

    The usual approach to predicting particle loss in storage rings in the presence of nonlinearities consists of determining the dynamic aperture of the machine. This method, however, will not directly predict the lifetimes of beams. We have developed a code which can, by parallelization and careful speed optimization, predict lifetimes in the presence of 100 parasitic beam-beam crossings by tracking > 10^10 particle-turns. An application of this code to the anti-proton lifetime in the Tevatron at injection is discussed.

  3. Parallel rendering

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1995-01-01

    This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.

  4. Stochastic gyroresonant electron acceleration in a low-beta plasma. I - Interaction with parallel transverse cold plasma waves

    NASA Technical Reports Server (NTRS)

    Steinacker, Juergen; Miller, James A.

    1992-01-01

    The gyroresonance of electrons with parallel transverse cold plasma waves is considered, and the Fokker-Planck equation describing the evolution of the electron distribution function in the presence of a spectrum of turbulence is derived. A new resonance which produces a divergence in the Fokker-Planck coefficients is identified; it results when the electron is in gyroresonance with a wave that has a group velocity equal to the velocity of the electron along the magnetic field. Under the assumption of a power-law spectral density, the Fokker-Planck coefficients are calculated numerically, and their complicated momentum and pitch-angle dependence, as well as the influence of various approximations to the dispersion relation, gyroresonance condition, and spectral density, are discussed. It is found that there is no resonance gap at any pitch angle as long as the full gyroresonance condition is used and waves propagating in both directions are present.

  5. Three-dimensional reconnection in the outer heliosphere: interactions between parallel current sheets, and the effects of interstellar pick-up ions

    NASA Astrophysics Data System (ADS)

    Gingell, Peter; Burgess, David; Matteini, Lorenzo

    2015-04-01

    We examine the evolution of a three-dimensional system comprising a series of closely packed, parallel current sheets. Each individual current sheet may be subject to a tearing instability, and hence generate magnetic islands and hot populations of ions associated with magnetic reconnection. However, previous studies have shown that a drift-kink instability can significantly affect the three-dimensional evolution of each current sheet, leading to an effective widening that can reduce reconnection rates and limit magnetic island formation compared to the two-dimensional case. This system also introduces the possibility of interaction between adjacent current sheets, leading to a complex magnetic topology, perpendicular particle transport, and a turbulent end-state. The evolution of this system has important consequences for the structure of the outer heliosphere, where pile-up of parallel current sheets is expected to produce a sectored heliosheath. In order to better model this region, we also introduce a population of interstellar H+ pick-up ions, which may dominate the pressure in the region and significantly alter the spectra of the otherwise largely monochromatic drift-kink instability. We will discuss the evolution of this system with particular focus on particle heating and transport, and the turbulent spectrum of the fluctuations generated by current sheet interactions.

  6. Suppression of electron magnetotunneling between parallel two-dimensional GaAs/InAs electron systems by the correlation interaction

    SciTech Connect

    Khanin, Yu. N.; Vdovin, E. E.; Makarovsky, O.; Henini, M.

    2013-09-15

    Magnetotunneling between two-dimensional GaAs/InAs electron systems in vertical resonant tunneling GaAs/InAs/AlAs heterostructures is studied. A new type of singularity in the tunneling density of states, specifically a dip at the Fermi level, is found; this feature is drastically different from that observed previously for the case of tunneling between two-dimensional GaAs tunnel systems in terms of both the kind of functional dependence and the energy and temperature parameters. As before, this effect manifests itself in the suppression of resonant tunneling in a narrow range near zero bias voltage in a high magnetic field parallel to the current direction. Magnetic-field and temperature dependences of the effect's parameters are obtained; these dependences are compared with available theoretical and experimental data. The observed effect can be caused by a high degree of disorder in two-dimensional correlated electron systems as a result of the introduction of structurally imperfect strained InAs layers.

  7. 3-D Hybrid Kinetic Modeling of the Interaction Between the Solar Wind and Lunar-like Exospheric Pickup Ions in Case of Oblique/Quasi-Parallel/Parallel Upstream Magnetic Field

    NASA Technical Reports Server (NTRS)

    Lipatov, A. S.; Farrell, W. M.; Cooper, J. F.; Sittler, E. C., Jr.; Hartle, R. E.

    2015-01-01

    The interactions between the solar wind and Moon-sized objects are determined by a set of the solar wind parameters and plasma environment of the space objects. The orientation of upstream magnetic field is one of the key factors which determines the formation and structure of bow shock wave/Mach cone or Alfven wing near the obstacle. The study of effects of the direction of the upstream magnetic field on lunar-like plasma environment is the main subject of our investigation in this paper. Photoionization, electron-impact ionization and charge exchange are included in our hybrid model. The computational model includes the self-consistent dynamics of the light (H+, He+) and heavy (Na+) pickup ions. The lunar interior is considered as a weakly conducting body. Our previous 2013 lunar work, as reported in this journal, found formation of a triple structure of the Mach cone near the Moon in the case of perpendicular upstream magnetic field. Further advances in modeling now reveal the presence of strong wave activity in the upstream solar wind and plasma wake in the cases of quasiparallel and parallel upstream magnetic fields. However, little wave activity is found for the opposite case with a perpendicular upstream magnetic field. The modeling does not show a formation of the Mach cone in the case of θ_{B,U} ≈ 0°.

  8. Massively parallel visualization: Parallel rendering

    SciTech Connect

    Hansen, C.D.; Krogh, M.; White, W.

    1995-12-01

    This paper presents rendering algorithms, developed for massively parallel processors (MPPs), for polygons, spheres, and volumetric data. The polygon algorithm uses a data-parallel approach, whereas the sphere and volume renderers use a MIMD approach. Implementations of these algorithms are presented for the Thinking Machines Corporation CM-5 MPP.

  9. An experimental investigation of the chopping of helicopter main rotor tip vortices by the tail rotor

    NASA Technical Reports Server (NTRS)

    Ahmadi, A. R.

    1984-01-01

    The chopping of helicopter main rotor tip vortices by the tail rotor was experimentally investigated. This is a problem of blade vortex interaction (BVI) at normal incidence where the vortex is generally parallel to the rotor axis. The experiment used a model rotor and an isolated vortex and was designed to isolate BVI noise from other types of rotor noise. Tip Mach number, radial BVI station, and free stream velocity were varied. Fluctuating blade pressures, farfield sound pressure level and directivity, velocity field of the incident vortex, and blade vortex interaction angles were measured. Blade vortex interaction was found to produce impulsive noise which radiates primarily ahead of the blade. For interaction away from the blade tip, the results demonstrate the dipole character of BVI radiation. For BVI close to the tip, three dimensional relief effect reduces the intensity of the interaction, despite larger BVI angle and higher local Mach number. Furthermore, in this case, the radiation pattern is more complex due to diffraction at and pressure communication around the tip.

  10. Parallel computations in hydro acoustics

    NASA Astrophysics Data System (ADS)

    Pelz, Richard B.

    1994-10-01

    This research concerns the algorithmic development, computer implementation and direct numerical simulation of incompressible and compressible flow of naval relevance. Calculations were executed on a class of current generation multiprocessors. Pseudospectral methods were used exclusively. The lack of parallel algorithms critical to the effective implementation of spectral methods on parallel computers necessitated the development of parallel FFT algorithms for real, conjugate symmetric and real symmetric sequences. These algorithms are applicable not only to spectral methods but also to many areas of scientific computing. The last algorithm, the parallel fast discrete cosine transform, is used extensively in image and signal processing. The parallel Fourier pseudospectral method for the incompressible Navier-Stokes equations was developed and implemented on many multiprocessors. Reconnection of orthogonally interacting vortex tubes was then investigated using the algorithm on parallel computers as well as vector supercomputers. The parallel Fourier pseudospectral method for the compressible Navier-Stokes equations was also developed. Shock/vortex interactions in two dimensions were investigated.

  11. Wind tunnel tests of a two bladed model rotor to evaluate the TAMI system in descending forward flight

    NASA Technical Reports Server (NTRS)

    White, R. P., Jr.

    1977-01-01

    A research investigation was conducted to assess the potential of the Tip Air Mass Injection system in reducing the noise output during blade vortex interaction in descending low speed flight. In general it was concluded that the noise output due to blade vortex interaction can be reduced by 4 to 6 dB with an equivalent power expenditure of approximately 14 percent of installed power.
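
    As a rough illustration of what a 4 to 6 dB reduction means, the sketch below converts a decibel reduction into the remaining fraction of acoustic power and of sound pressure amplitude. It relies only on the standard decibel definitions; the function names are illustrative and are not taken from the report.

```python
def remaining_power_fraction(reduction_db: float) -> float:
    """Fraction of acoustic power left after a reduction of `reduction_db` dB."""
    return 10.0 ** (-reduction_db / 10.0)


def remaining_pressure_fraction(reduction_db: float) -> float:
    """Fraction of sound pressure amplitude left after the same reduction."""
    return 10.0 ** (-reduction_db / 20.0)


# The 4 and 6 dB bounds are the ones quoted in the abstract.
for reduction_db in (4.0, 6.0):
    power = remaining_power_fraction(reduction_db)
    pressure = remaining_pressure_fraction(reduction_db)
    print(f"{reduction_db:.0f} dB: power x{power:.3f}, pressure x{pressure:.3f}")
```

    Under these definitions a 6 dB reduction corresponds to roughly one quarter of the acoustic power and one half of the pressure amplitude.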

  12. Massively parallel multiple interacting continua formulation for modeling flow in fractured porous media using the subsurface reactive flow and transport code PFLOTRAN

    NASA Astrophysics Data System (ADS)

    Kumar, J.; Mills, R. T.; Lichtner, P. C.; Hammond, G. E.

    2010-12-01

    Fracture dominated flows occur in numerous subsurface geochemical processes and at many different scales in rock pore structures, micro-fractures, fracture networks and faults. Fractured porous media can be modeled as multiple interacting continua which are connected to each other through transfer terms that capture the flow of mass and energy in response to pressure, temperature and concentration gradients. However, the analysis of large-scale transient problems using the multiple interacting continuum approach presents an algorithmic and computational challenge for problems with very large numbers of degrees of freedom. A generalized dual porosity model based on the Dual Continuum Disconnected Matrix approach has been implemented within a massively parallel multiphysics-multicomponent-multiphase subsurface reactive flow and transport code PFLOTRAN. Developed as part of the Department of Energy's SciDAC-2 program, PFLOTRAN provides subsurface simulation capabilities that can scale from laptops to ultrascale supercomputers, and utilizes the PETSc framework to solve the large, sparse algebraic systems that arise in complex subsurface reactive flow and transport problems. It has been successfully applied to the solution of problems composed of more than two billion degrees of freedom, utilizing up to 131,072 processor cores on Jaguar, the Cray XT5 system at Oak Ridge National Laboratory that is the world's fastest supercomputer. Building upon the capabilities and computational efficiency of PFLOTRAN, we will present an implementation of the multiple interacting continua formulation for fractured porous media along with an application case study.

  13. Spin accumulation in the parallel-coupled double quantum dots with Rashba spin-orbit interaction connected with ferromagnetic and superconducting electrodes

    NASA Astrophysics Data System (ADS)

    Ye, Cheng-Zhi; Lu, Wei-Tao; Xu, Chang-Tan

    2015-09-01

    Using the standard nonequilibrium Green's function techniques, we investigate the effect of Rashba spin-orbit interaction (RSOI) and ferromagnetic electrode on the spin accumulation in the parallel-coupled double quantum dots coupled with a ferromagnetic and a superconducting electrode. It is demonstrated that the FM electrode cannot induce the spin polarization of Andreev reflection (AR) current, but can induce the spin accumulation in the QDs. However, RSOI can lead to the spin polarization of AR current as well as the spin accumulation in the QDs. In the presence of RSOI, a completely spin-polarized QD can be achieved with negative bias voltage V, which is the most significant advantage of our device. When energy levels ε1 = ε2 = 0 and the interdot coupling strength tc = 0.01, the maximum value of spin accumulation in this paper is obtained as 0.7. The results may be useful for the design of spintronic devices.

  14. Activity and interactions of methane seep microorganisms assessed by parallel transcription and FISH-NanoSIMS analyses.

    PubMed

    Dekas, Anne E; Connon, Stephanie A; Chadwick, Grayson L; Trembath-Reichert, Elizabeth; Orphan, Victoria J

    2016-03-01

    To characterize the activity and interactions of methanotrophic archaea (ANME) and Deltaproteobacteria at a methane-seeping mud volcano, we used two complementary measures of microbial activity: a community-level analysis of the transcription of four genes (16S rRNA, methyl coenzyme M reductase A (mcrA), adenosine-5'-phosphosulfate reductase α-subunit (aprA), dinitrogenase reductase (nifH)), and a single-cell-level analysis of anabolic activity using fluorescence in situ hybridization coupled to nanoscale secondary ion mass spectrometry (FISH-NanoSIMS). Transcript analysis revealed that members of the deltaproteobacterial groups Desulfosarcina/Desulfococcus (DSS) and Desulfobulbaceae (DSB) exhibit increased rRNA expression in incubations with methane, suggestive of ANME-coupled activity. Direct analysis of anabolic activity in DSS cells in consortia with ANME by FISH-NanoSIMS confirmed their dependence on methanotrophy, with no (15)NH4(+) assimilation detected without methane. In contrast, DSS and DSB cells found physically independent of ANME (i.e., single cells) were anabolically active in incubations both with and without methane. These single cells therefore comprise an active 'free-living' population, and are not dependent on methane or ANME activity. We investigated the possibility of N2 fixation by seep Deltaproteobacteria and detected nifH transcripts closely related to those of cultured diazotrophic Deltaproteobacteria. However, nifH expression was methane-dependent. (15)N2 incorporation was not observed in single DSS cells, but was detected in single DSB cells. Interestingly, (15)N2 incorporation in single DSB cells was methane-dependent, raising the possibility that DSB cells acquired reduced (15)N products from diazotrophic ANME while spatially coupled, and then subsequently dissociated. 
With this combined data set we address several outstanding questions in methane seep microbial ecosystems and highlight the benefit of measuring microbial activity in the context of spatial associations. PMID:26394007

  15. Parallel Dislocation Simulator

    Energy Science and Technology Software Center (ESTSC)

    2006-10-30

    ParaDiS is software capable of simulating the motion, evolution, and interaction of dislocation networks in single crystals using massively parallel computer architectures. The software is capable of outputting the stress-strain response of a single crystal whose plastic deformation is controlled by the dislocation processes.

  16. Flowfield-Dependent Variation (FDV) method for compressible, incompressible, viscous, and inviscid flow interactions with FDV adaptive mesh refinements and parallel processing

    NASA Astrophysics Data System (ADS)

    Heard, Gary Wayne

    A new approach to solution-adaptive grid refinement using the finite element method and Flowfield-Dependent Variation (FDV) theory applied to the Navier-Stokes system of equations is discussed. Flowfield-Dependent Variation (FDV) parameters are introduced into a modified Taylor series expansion of the conservation variables, with the Navier-Stokes system of equations substituted into the Taylor series. The FDV parameters are calculated from the current flowfield conditions, and automatically adjust the resulting equations from elliptic to parabolic to hyperbolic in type to assure solution accuracy in evolving fluid flowfields that may consist of interactions between regions of compressible and incompressible flow, viscous and inviscid flow, and turbulent and laminar flow. The system of equations is solved using an element-by-element iterative GMRES solver with the elements grouped together to allow the element operations to be performed in parallel. The FDV parameters play many roles in the numerical scheme. One of these roles is to control formations of shock wave discontinuities in high speeds and pressure oscillations in low speeds. To demonstrate these abilities, various example problems are shown, including supersonic flows over a flat plate and a compression corner, and flows involving triple shock waves generated on fin geometries for high speed compressible flows. Furthermore, analysis of low speed incompressible flows is presented in the form of flow in a lid-driven cavity at various Reynolds numbers. Another role of the FDV parameters is their use as error indicators for a solution-adaptive mesh. The finite element grid is refined as dictated by the magnitude of the FDV parameters. Examples of adaptive grids generated using the FDV parameters as error indicators are presented for supersonic flow over flat plate/compression ramp combinations in both two and three dimensions. 
Grids refined using the FDV parameters as error indicators are comparable to ones refined using primitive variable error indicators, and require less computational time to generate the grids. The use of parallel processing in performing some element operations is shown to reduce the wall clock time approximately forty-five percent in going from one to eight processors. Finally, the algorithm's ability to solve a flowfield containing interactions and transitions between regions of incompressible and compressible, viscous and inviscid, and laminar and turbulent flow is demonstrated by modeling the flowfield generated by supersonic flow over a compression ramp located between two fins. The structure of the resulting systems of shock waves is analyzed and compared with planar laser scattering images obtained experimentally for similar flow structures.
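
    The quoted wall clock reduction can be turned into a speedup and a parallel efficiency with a little arithmetic. The sketch below does that for the figures in the abstract (about 45 percent less wall clock time on eight processors). The last function additionally assumes Amdahl's law, which is an interpretive model chosen here and is not stated in the abstract; all function names are illustrative.

```python
def speedup_from_time_reduction(fraction_reduced: float) -> float:
    """Speedup T1/Tn when wall clock time drops by `fraction_reduced` (0..1)."""
    return 1.0 / (1.0 - fraction_reduced)


def parallel_efficiency(speedup: float, n_processors: int) -> float:
    """Speedup divided by the number of processors."""
    return speedup / n_processors


def amdahl_parallel_fraction(speedup: float, n_processors: int) -> float:
    """Parallelizable fraction p implied by Amdahl's law, S = 1 / ((1 - p) + p / n).

    Assumption: the run behaves like an ideal Amdahl workload.
    """
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_processors)


s = speedup_from_time_reduction(0.45)   # figure quoted in the abstract
print(f"speedup ~ {s:.2f}")
print(f"efficiency on 8 processors ~ {parallel_efficiency(s, 8):.2f}")
print(f"implied parallel fraction ~ {amdahl_parallel_fraction(s, 8):.2f}")
```

    A modest implied parallel fraction would be consistent with the abstract's statement that only some element operations were performed in parallel.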

  17. Non-equilibrium reaction and relaxation dynamics in a strongly interacting explicit solvent: F + CD3CN treated with a parallel multi-state EVB model

    NASA Astrophysics Data System (ADS)

    Glowacki, David R.; Orr-Ewing, Andrew J.; Harvey, Jeremy N.

    2015-07-01

    We describe a parallelized linear-scaling computational framework developed to implement arbitrarily large multi-state empirical valence bond (MS-EVB) calculations within CHARMM and TINKER. Forces are obtained using the Hellmann-Feynman relationship, giving continuous gradients, and good energy conservation. Utilizing multi-dimensional Gaussian coupling elements fit to explicitly correlated coupled cluster theory, we built a 64-state MS-EVB model designed to study the F + CD3CN → DF + CD2CN reaction in CD3CN solvent (recently reported in Dunning et al. [Science 347(6221), 530 (2015)]). This approach allows us to build a reactive potential energy surface whose balanced accuracy and efficiency considerably surpass what we could achieve otherwise. We ran molecular dynamics simulations to examine a range of observables which follow in the wake of the reactive event: energy deposition in the nascent reaction products, vibrational relaxation rates of excited DF in CD3CN solvent, equilibrium power spectra of DF in CD3CN, and time dependent spectral shifts associated with relaxation of the nascent DF. Many of our results are in good agreement with time-resolved experimental observations, providing evidence for the accuracy of our MS-EVB framework in treating both the solute and solute/solvent interactions. The simulations provide additional insight into the dynamics at sub-picosecond time scales that are difficult to resolve experimentally. In particular, the simulations show that (immediately following deuterium abstraction) the nascent DF finds itself in a non-equilibrium regime in two different respects: (1) it is highly vibrationally excited, with ~23 kcal mol⁻¹ localized in the stretch and (2) its post-reaction solvation environment, in which it is not yet hydrogen-bonded to CD3CN solvent molecules, is intermediate between the non-interacting gas-phase limit and the solution-phase equilibrium limit. 
Vibrational relaxation of the nascent DF results in a spectral blue shift, while relaxation of the post-reaction solvation environment results in a red shift. These two competing effects mean that the post-reaction relaxation profile is distinct from what is observed when Franck-Condon vibrational excitation of DF occurs within a microsolvation environment initially at equilibrium. Our conclusions, along with the theoretical and parallel software framework presented in this paper, should be more broadly applicable to a range of complex reactive systems.

  18. Aeroacoustic theory for noncompact wing-gust interaction

    NASA Technical Reports Server (NTRS)

    Martinez, R.; Widnall, S. E.

    1981-01-01

    Three aeroacoustic models for noncompact wing-gust interaction were developed for subsonic flow. The first is that for a two dimensional (infinite span) wing passing through an oblique gust. The unsteady pressure field was obtained by the Wiener-Hopf technique; the airfoil loading and the associated acoustic field were calculated, respectively, by letting the field point approach the airfoil surface, or by letting it go to infinity. The second model is a simple spanwise superposition of two dimensional solutions to account for three dimensional acoustic effects of wing rotation (for a helicopter blade, or some other rotating planform) and of finiteness of wing span. A three dimensional theory for a single gust was applied to calculate the acoustic signature in closed form due to blade vortex interaction in helicopters. The third model is that of a quarter infinite plate with side edge passing through a gust at high subsonic speed. An approximate solution for the three dimensional loading and the associated three dimensional acoustic field in closed form was obtained. The results reflected the acoustic effect of satisfying the correct loading condition at the side edge.

  19. Open-Label, Single-Dose, Parallel-Group Study in Healthy Volunteers To Determine the Drug-Drug Interaction Potential between KAE609 (Cipargamin) and Piperaquine

    PubMed Central

    Jain, Jay Prakash; Kangas, Michael; Lefèvre, Gilbert; Machineni, Surendra; Griffin, Paul; Lickliter, Jason

    2015-01-01

    KAE609 represents a new class of potent, fast-acting, schizonticidal antimalarials. This study investigated the safety and pharmacokinetics of KAE609 in combination with the long-acting antimalarial piperaquine (PPQ) in healthy volunteers. A two-way pharmacokinetic interaction was hypothesized for KAE609 and PPQ, as both drugs are CYP3A4 substrates and inhibitors. The potential for both agents to affect the QT interval was also assessed. This was an open-label, parallel-group, single-dose study with healthy volunteers. Subjects were randomized to four parallel dosing arms with five cohorts (2:2:2:2:1), receiving 75 mg KAE609 plus 320 mg PPQ, 25 mg KAE609 plus 1,280 mg PPQ, 25 mg KAE609 alone, 320 mg PPQ alone, or 1,280 mg PPQ alone. Triplicate electrocardiograms were performed over the first 24 h after dosing, with single electrocardiograms at other time points. Routine safety (up to 89 days) and pharmacokinetic (up to 61 days) assessments were performed. Of the 110 subjects recruited, 99 completed the study. Coadministration of PPQ had no overall effect on exposure to KAE609, although 1,280 mg PPQ decreased the KAE609 maximum concentration (Cmax) by 17%. The group that received 25 mg KAE609 plus 1,280 mg PPQ showed a 32% increase in the PPQ area under the concentration-time curve from 0 to infinity (AUCinf), while the group that received 75 mg KAE609 plus 320 mg PPQ showed a 14% reduction. Mean changes from baseline in the QT interval corrected by Fridericia's method (QTcF) and the QT interval corrected by Bazett's method (QTcB) with PPQ were consistent with its known effects. PPQ but not KAE609 exposure correlated with corrected QT interval (QTc) increases, and KAE609 did not affect the PPQ exposure-QTc relationship. The QTcF effect for PPQ (least-squares estimate of the difference in mean maximal changes from baseline of 7.47 ms [90% confidence interval, 3.55 to 11.4 ms]) was consistent with the criteria for a positive thorough QT study. 
No subject had QTcF or QTcB values of >500 ms. Both drugs given alone or in combination were well tolerated, with no deaths, serious adverse events (AEs), or severe AEs reported. Most AEs were mild; upper respiratory tract infections, headache, diarrhea, and oropharyngeal pain were most common. PPQ and KAE609 coadministration had no relevant effect on exposure to either agent, and KAE609 did not affect or potentiate the known effects of PPQ on cardiac conduction. PMID:25845867
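
    The two heart-rate corrections named in the abstract follow standard formulas: Bazett divides the QT interval by the square root of the RR interval, and Fridericia divides it by the cube root, with RR in seconds. The sketch below shows both; the input values are made-up illustrations and are not study data, and the function names are illustrative.

```python
def qtc_bazett(qt_ms: float, rr_s: float) -> float:
    """QTcB = QT / RR^(1/2), with QT in milliseconds and RR in seconds."""
    return qt_ms / rr_s ** 0.5


def qtc_fridericia(qt_ms: float, rr_s: float) -> float:
    """QTcF = QT / RR^(1/3), with QT in milliseconds and RR in seconds."""
    return qt_ms / rr_s ** (1.0 / 3.0)


# Hypothetical reading: QT = 400 ms at a heart rate of 75 beats per minute.
heart_rate_bpm = 75.0
rr_s = 60.0 / heart_rate_bpm
print(f"QTcB = {qtc_bazett(400.0, rr_s):.1f} ms")
print(f"QTcF = {qtc_fridericia(400.0, rr_s):.1f} ms")
```

    At RR = 1 s (60 beats per minute) both corrections leave QT unchanged; at faster heart rates Bazett's correction yields the larger value.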

  20. Getting a feel for parameters: using interactive parallel plots as a tool for parameter identification in the new rainfall-runoff model WALRUS

    NASA Astrophysics Data System (ADS)

    Brauer, Claudia; Torfs, Paul; Teuling, Ryan; Uijlenhoet, Remko

    2015-04-01

    Recently, we developed the Wageningen Lowland Runoff Simulator (WALRUS) to fill the gap between complex, spatially distributed models often used in lowland catchments and simple, parametric models which have mostly been developed for mountainous catchments (Brauer et al., 2014ab). This parametric rainfall-runoff model can be used all over the world in both freely draining lowland catchments and polders with controlled water levels. The open source model code is implemented in R and can be downloaded from www.github.com/ClaudiaBrauer/WALRUS. The structure and code of WALRUS are simple, which facilitates detailed investigation of the effect of parameters on all model variables. WALRUS contains only four parameters requiring calibration; they are intended to have a strong, qualitative relation with catchment characteristics. Parameter estimation remains a challenge, however. The model structure contains three main feedbacks: (1) between groundwater and surface water; (2) between saturated and unsaturated zone; (3) between catchment wetness and (quick/slow) flowroute division. These feedbacks represent essential rainfall-runoff processes in lowland catchments, but increase the risk of parameter dependence and equifinality. Therefore, model performance should not only be judged based on a comparison between modelled and observed discharges, but also based on the plausibility of the internal modelled variables. Here, we present a method to analyse the effect of parameter values on internal model states and fluxes in a qualitative and intuitive way using interactive parallel plotting. We applied WALRUS to ten Dutch catchments with different sizes, slopes and soil types and both freely draining and polder areas. The model was run with a large number of parameter sets, which were created using Latin Hypercube Sampling. 
The model output was characterised in terms of several signatures, both measures of goodness of fit and statistics of internal model variables (such as the percentage of rain water travelling through the quickflow reservoir). End users can then eliminate parameter combinations with unrealistic outcomes based on expert knowledge using interactive parallel plots. In these plots, for instance, ranges can be selected for each signature and only model runs which yield signature values in these ranges are highlighted. The resulting selection of realistic parameter sets can be used for ensemble simulations.

    C.C. Brauer, A.J. Teuling, P.J.J.F. Torfs, R. Uijlenhoet (2014a): The Wageningen Lowland Runoff Simulator (WALRUS): a lumped rainfall-runoff model for catchments with shallow groundwater, Geoscientific Model Development, 7, 2313-2332, www.geosci-model-dev.net/7/2313/2014/gmd-7-2313-2014.pdf

    C.C. Brauer, P.J.J.F. Torfs, A.J. Teuling, R. Uijlenhoet (2014b): The Wageningen Lowland Runoff Simulator (WALRUS): application to the Hupsel Brook catchment and Cabauw polder, Hydrology and Earth System Sciences, 18, 4007-4028, www.hydrol-earth-syst-sci.net/18/4007/2014/hess-18-4007-2014.pdf
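
    The workflow described in this abstract (sample parameter sets with Latin Hypercube Sampling, compute signatures per run, keep runs whose signatures fall inside chosen ranges) can be sketched in a few lines. The abstract's model is implemented in R; Python is used here only for illustration. The parameter bounds, the signature names and the `toy_signatures` stand-in are hypothetical and do not come from WALRUS.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points so that each dimension has one point per stratum."""
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)                      # random pairing across dimensions
        width = (high - low) / n_samples
        columns.append([low + (s + rng.random()) * width for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def select_runs(signatures, ranges):
    """Indices of runs whose signature values lie inside every selected range."""
    return [i for i, sig in enumerate(signatures)
            if all(lo <= sig[name] <= hi for name, (lo, hi) in ranges.items())]


def toy_signatures(params):
    """Hypothetical stand-in for a model run; NOT the WALRUS model."""
    return {"fit": 1.0 - abs(params[0] - 0.5), "quick_pct": 100.0 * params[1]}


rng = random.Random(42)
parameter_sets = latin_hypercube(20, [(0.0, 1.0)] * 4, rng)   # four parameters
signatures = [toy_signatures(p) for p in parameter_sets]
kept = select_runs(signatures, {"fit": (0.8, 1.0), "quick_pct": (10.0, 60.0)})
print(f"{len(kept)} of {len(parameter_sets)} runs retained")
```

    The range filter plays the role of brushing axes in an interactive parallel plot: each selected range removes runs, and the surviving parameter sets could feed an ensemble simulation.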

  1. Development of a prototype PET scanner with depth-of-interaction measurement using solid-state photomultiplier arrays and parallel readout electronics

    PubMed Central

    Shao, Yiping; Sun, Xishan; Lan, Kejian A.; Bircher, Chad; Lou, Kai; Deng, Zhi

    2014-01-01

    In this study, we developed a prototype animal PET by applying several novel technologies to use the solid-state photomultiplier (SSPM) arrays for measuring the depth-of-interaction (DOI) and improving imaging performance. Each PET detector has an 8 × 8 array of about 1.9 × 1.9 × 30.0 mm³ lutetium-yttrium-oxyorthosilicate (LYSO) scintillators, with each end optically connected to a SSPM array (16-channel in a 4 × 4 matrix) through a light guide to enable continuous DOI measurement. Each SSPM has an active area of about 3 × 3 mm², and its output is read by a custom-developed application-specific-integrated-circuit (ASIC) to directly convert analog signals to digital timing pulses that encode the interaction information. These pulses are transferred to and are decoded by a field-programmable-gate-array (FPGA) based time-to-digital convertor for coincident event selection and data acquisition. The independent readout of each SSPM and the parallel signal process can significantly improve the signal-to-noise ratio and enable using flexible algorithms for different data processes. The prototype PET consists of two rotating detector panels on a portable gantry with four detectors in each panel to provide 16 mm axial and variable transaxial field-of-view (FOV) sizes. List-mode ordered-subset-expectation-maximization image reconstruction was implemented. The measured mean energy, coincidence timing, and DOI resolution for a crystal were about 17.6%, 2.8 ns, and 5.6 mm, respectively. The measured transaxial resolutions at the center of the FOV were 2.0 mm and 2.3 mm for images reconstructed with and without DOI, respectively. In addition, the resolutions across the FOV with DOI were substantially better than those without DOI. The quality of PET images of both a hot-rod phantom and mouse acquired with DOI was much higher than that of images obtained without DOI. 
This study demonstrates that SSPM arrays and advanced readout/processing electronics can be used to develop a practical DOI-measurable PET scanner. PMID:24556629

  2. Development of a prototype PET scanner with depth-of-interaction measurement using solid-state photomultiplier arrays and parallel readout electronics

    NASA Astrophysics Data System (ADS)

    Shao, Yiping; Sun, Xishan; Lan, Kejian A.; Bircher, Chad; Lou, Kai; Deng, Zhi

    2014-03-01

    In this study, we developed a prototype animal PET by applying several novel technologies to use solid-state photomultiplier (SSPM) arrays to measure the depth of interaction (DOI) and improve imaging performance. Each PET detector has an 8 × 8 array of about 1.9 × 1.9 × 30.0 mm³ lutetium-yttrium-oxyorthosilicate scintillators, with each end optically connected to an SSPM array (16 channels in a 4 × 4 matrix) through a light guide to enable continuous DOI measurement. Each SSPM has an active area of about 3 × 3 mm², and its output is read by a custom-developed application-specific integrated circuit to directly convert analogue signals to digital timing pulses that encode the interaction information. These pulses are transferred to and are decoded by a field-programmable gate array-based time-to-digital convertor for coincident event selection and data acquisition. The independent readout of each SSPM and the parallel signal process can significantly improve the signal-to-noise ratio and enable the use of flexible algorithms for different data processes. The prototype PET consists of two rotating detector panels on a portable gantry with four detectors in each panel to provide 16 mm axial and variable transaxial field-of-view (FOV) sizes. List-mode ordered subset expectation maximization image reconstruction was implemented. The measured mean energy, coincidence timing and DOI resolution for a crystal were about 17.6%, 2.8 ns and 5.6 mm, respectively. The measured transaxial resolutions at the center of the FOV were 2.0 mm and 2.3 mm for images reconstructed with and without DOI, respectively. In addition, the resolutions across the FOV with DOI were substantially better than those without DOI. The quality of PET images of both a hot-rod phantom and mouse acquired with DOI was much higher than that of images obtained without DOI. 
This study demonstrates that SSPM arrays and advanced readout/processing electronics can be used to develop a practical DOI-measurable PET scanner.

  3. A Fast Parallel Simulation Code for Interaction between Proto-Planetary Disk and Embedded Proto-Planets: Implementation for 3D Code

    SciTech Connect

    Li, Shengtai; Li, Hui

    2012-06-14

    We develop a 3D simulation code for interaction between the proto-planetary disk and embedded proto-planets. The protoplanetary disk is treated as a three-dimensional (3D), self-gravitating gas whose motion is described by the locally isothermal Navier-Stokes equations in a spherical coordinate centered on the star. The differential equations for the disk are similar to those given in Kley et al. (2009) with a different gravitational potential that is defined in Nelson et al. (2000). The equations are solved by directional split Godunov method for the inviscid Euler equations plus operator-split method for the viscous source terms. We use a sub-cycling technique for the azimuthal sweep to alleviate the time step restriction. We also extend the FARGO scheme of Masset (2000) and modified in Li et al. (2001) to our 3D code to accelerate the transport in the azimuthal direction. Furthermore, we have implemented a reduced 2D (r, {theta}) and a fully 3D self-gravity solver on our uniform disk grid, which extends our 2D method (Li, Buoni, & Li 2008) to 3D. This solver uses a mode cut-off strategy and combines FFT in the azimuthal direction and direct summation in the radial and meridional direction. An initial axis-symmetric equilibrium disk is generated via iteration between the disk density profile and the 2D disk-self-gravity. We do not need any softening in the disk self-gravity calculation as we have used a shifted grid method (Li et al. 2008) to calculate the potential. The motion of the planet is limited on the mid-plane and the equations are the same as given in D'Angelo et al. (2005), which we adapted to the polar coordinates with a fourth-order Runge-Kutta solver. The disk gravitational force on the planet is assumed to evolve linearly with time between two hydrodynamics time steps. The Planetary potential acting on the disk is calculated accurately with a small softening given by a cubic-spline form (Kley et al. 2009). 
Since the torque is extremely sensitive to the position of the planet, we adopt the corotating frame, which allows the planet to move only in the radial direction if only one planet is present. This code has been extensively tested on a number of problems. For an Earth-mass planet with constant aspect ratio h = 0.05, the torque calculated using our code matches quite well with the 3D linear theory results by Tanaka et al. (2002). The code is fully parallelized via the message-passing interface (MPI) and has very high parallel efficiency. Several numerical examples for both fixed and moving planets are provided to demonstrate the efficacy of the numerical method and code.

  4. Research investigation of helicopter main rotor/tail rotor interaction noise

    NASA Astrophysics Data System (ADS)

    Fitzgerald, J.; Kohlhepp, F.

    1988-05-01

    Acoustic measurements were obtained in a Langley 14 x 22 foot Subsonic Wind Tunnel to study the aeroacoustic interaction of 1/5th scale main rotor, tail rotor, and fuselage models. An extensive aeroacoustic data base was acquired for main rotor, tail rotor, fuselage aerodynamic interaction for moderate forward speed flight conditions. The details of the rotor models, experimental design and procedure, aerodynamic and acoustic data acquisition and reduction are presented. The model was initially operated in trim for selected fuselage angle of attack, main rotor tip-path-plane angle, and main rotor thrust combinations. The effects of repositioning the tail rotor in the main rotor wake and the corresponding tail rotor countertorque requirements were determined. Each rotor was subsequently tested in isolation at the thrust and angle of attack combinations for trim. The acoustic data indicated that the noise was primarily dominated by the main rotor, especially for moderate speed main rotor blade-vortex interaction conditions. The tail rotor noise increased when the main rotor was removed indicating that tail rotor inflow was improved with the main rotor present.

  5. Detached-eddy simulation of flow non-linearity of fluid-structural interactions using high order schemes and parallel computation

    NASA Astrophysics Data System (ADS)

    Wang, Baoyuan

    The objective of this research is to develop an efficient and accurate methodology to resolve flow non-linearity of fluid-structural interaction. To achieve this purpose, a numerical strategy to apply the detached-eddy simulation (DES) with a fully coupled fluid-structural interaction model is established for the first time. The following novel numerical algorithms are also created: a general sub-domain boundary mapping procedure for parallel computation to reduce wall clock simulation time, an efficient and low diffusion E-CUSP (LDE) scheme used as a Riemann solver to resolve discontinuities with minimal numerical dissipation, and an implicit high order accuracy weighted essentially non-oscillatory (WENO) scheme to capture shock waves. The Detached-Eddy Simulation is based on the model proposed by Spalart in 1997. Near solid walls within wall boundary layers, the Reynolds averaged Navier-Stokes (RANS) equations are solved. Outside of the wall boundary layers, the 3D filtered compressible Navier-Stokes equations are solved based on large eddy simulation (LES). The Spalart-Allmaras one-equation turbulence model is solved to provide the Reynolds stresses in the RANS region and the subgrid scale stresses in the LES region. An improved 5th order finite differencing weighted essentially non-oscillatory (WENO) scheme with an optimized epsilon value is employed for the inviscid fluxes. The new LDE scheme used with the WENO scheme is able to capture crisp shock profiles and exact contact surfaces. A set of fully conservative 4th order finite central differencing schemes is used for the viscous terms. The 3D Navier-Stokes equations are discretized based on a conservative finite differencing scheme. The unfactored line Gauss-Seidel relaxation iteration is employed for time marching. A general sub-domain boundary mapping procedure is developed for arbitrary topology multi-block structured grids with grid points matched on sub-domain boundaries. 
Extensive numerical experiments are conducted to test the performance of the numerical algorithms. The RANS simulation with the Spalart-Allmaras one-equation turbulence model is the foundation for DES and is hence validated with other transonic flows. The predicted results agree very well with the experiments. The RANS code is then further used to study the slot size effect of a co-flow jet (CFJ) airfoil. The DES solver with fully coupled fluid-structural interaction methodology is validated with vortex induced vibration of a cylinder and a transonic forced pitching airfoil. For the cylinder, the laminar Navier-Stokes equations are solved due to the low Reynolds number. The 3D effects are observed in both stationary and oscillating cylinder simulations because of the flow separations behind the cylinder. For the transonic forced pitching airfoil DES computation, there is no flow separation in the flow field. The DES results agree well with the RANS results. These two cases indicate that the DES is more effective at predicting flow separation. The DES code is used to simulate the limit cycle oscillation (LCO) of the NLR7301 airfoil. For the cases computed in this research, the predicted LCO frequency, amplitudes, averaged lift and moment all agree excellently with the experiment. The solutions appear to have bifurcation and are dependent on the initial perturbation. The developed methodology is able to capture the LCO with very small amplitudes measured in the experiment. This is attributed to the high order low diffusion schemes, fully coupled FSI model, and the turbulence model used. This research appears to be the first time that a numerical simulation of LCO matches the experiment. The DES code is also used to simulate the CFJ airfoil jet mixing at high angle of attack. 
In conclusion, the numerical strategy of the high order DES with fully coupled FSI model and parallel computing developed in this research is demonstrated to have high accuracy, robustness, and efficiency. Future work to further mature the methodology is suggested. (Abstract shortened by UMI.)

  6. Introduction to the POKER parallel programming environment

    SciTech Connect

    Snyder, L.

    1983-01-01

    The POKER parallel programming environment is a graphics-based, interactive system for programming the configurable, highly parallel (CHIP) computer. Designed to support nearly all aspects of parallel programming in one integrated system, POKER has been implemented as an approximately 35,000-line C program on the VAX 11/780 under UNIX. It provides a number of novel features including graphics programming of parallel processor communication. 4 references.

  7. Parallel pivoting combined with parallel reduction

    NASA Technical Reports Server (NTRS)

    Alaghband, Gita

    1987-01-01

    Parallel algorithms for triangularization of large, sparse, and unsymmetric matrices are presented. The method combines the parallel reduction with a new parallel pivoting technique, control over generations of fill-ins and a check for numerical stability, all done in parallel with the work being distributed over the active processes. The parallel technique uses the compatibility relation between pivots to identify parallel pivot candidates and uses the Markowitz number of pivots to minimize fill-in. This technique is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds.
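
    The pivot-selection idea in this abstract can be illustrated with a minimal Python sketch. It is not the author's algorithm: the compatibility test used here (two pivots at (i, j) and (k, l) are treated as compatible when they share no row or column and the cross entries A[i][l] and A[k][j] are both zero) is an assumption, the greedy selection order is illustrative, and the numerical stability check mentioned in the abstract is omitted. The function names are hypothetical. The Markowitz number of a nonzero entry is taken as (r - 1)(c - 1), where r and c are the nonzero counts of its row and column.

    ```python
    def markowitz_numbers(A):
        """Map each nonzero position (i, j) to its Markowitz number (r_i - 1) * (c_j - 1)."""
        n_rows, n_cols = len(A), len(A[0])
        row_counts = [sum(1 for v in row if v != 0) for row in A]
        col_counts = [sum(1 for i in range(n_rows) if A[i][j] != 0) for j in range(n_cols)]
        return {
            (i, j): (row_counts[i] - 1) * (col_counts[j] - 1)
            for i in range(n_rows)
            for j in range(n_cols)
            if A[i][j] != 0
        }


    def compatible(A, p, q):
        """Assumed compatibility test: distinct rows and columns, zero cross entries."""
        (i, j), (k, l) = p, q
        return i != k and j != l and A[i][l] == 0 and A[k][j] == 0


    def select_parallel_pivots(A):
        """Greedily collect mutually compatible pivots, lowest Markowitz number first."""
        costs = markowitz_numbers(A)
        chosen = []
        # Ties are broken by position so the result is deterministic.
        for cand in sorted(costs, key=lambda ij: (costs[ij], ij)):
            if all(compatible(A, cand, p) for p in chosen):
                chosen.append(cand)
        return chosen


    if __name__ == "__main__":
        A = [
            [4, 0, 0, 1],
            [0, 3, 0, 0],
            [0, 0, 2, 0],
            [1, 0, 0, 5],
        ]
        print(select_parallel_pivots(A))
    ```

    Because the selected pivots do not touch each other's rows or columns, their elimination steps could in principle be carried out concurrently; the sketch only performs the selection and would be re-run on the reduced matrix at each stage, in keeping with the abstract's remark that the technique is applied dynamically rather than as a preordering.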

  8. Serial Order: A Parallel Distributed Processing Approach.

    ERIC Educational Resources Information Center

    Jordan, Michael I.

    Human behavior shows a variety of serially ordered action sequences. This paper presents a theory of serial order which describes how sequences of actions might be learned and performed. In this theory, parallel interactions across time (coarticulation) and parallel interactions across space (dual-task interference) are viewed as two aspects of a

  9. Prediction of rotating-blade vortex noise from noise of nonrotating blades

    NASA Technical Reports Server (NTRS)

    Fink, M. R.; Schlinker, R. H.; Amiet, R. K.

    1976-01-01

    Measurements were conducted in an acoustic wind tunnel to determine vortex noise of nonrotating circular cylinders and NACA 0012 airfoils. Both constant-width and spanwise tapered models were tested at a low turbulence level. The constant-diameter cylinder and constant-chord airfoil also were tested in the turbulent wake generated by an upstream cylinder or airfoil. Vortex noise radiation from nonrotating circular cylinders at Reynolds numbers matching those of the rotating-blade tests was found to be strongly dependent on surface conditions and Reynolds number. Vortex noise of rotating circular cylinder blades, operating with and without the shed wake blown downstream, could be predicted using data for nonrotating circular cylinders as functions of Reynolds number. Vortex noise of nonrotating airfoils was found to be trailing-edge noise at a frequency equal to that predicted for maximum-amplitude Tollmien-Schlichting instability waves at the trailing edge.

  10. Parallel rendering techniques for massively parallel visualization

    SciTech Connect

    Hansen, C.; Krogh, M.; Painter, J.

    1995-07-01

    As the resolution of simulation models increases, scientific visualization algorithms which take advantage of the large memory and parallelism of Massively Parallel Processors (MPPs) are becoming increasingly important. For large applications, rendering on the MPP tends to be preferable to rendering on a graphics workstation due to the MPP's abundant resources: memory, disk, and numerous processors. The challenge becomes developing algorithms that can exploit these resources while minimizing overhead, typically communication costs. This paper will describe recent efforts in parallel rendering for polygonal primitives as well as parallel volumetric techniques. This paper presents rendering algorithms, developed for massively parallel processors (MPPs), for polygonal, sphere, and volumetric data. The polygon algorithm uses a data parallel approach whereas the sphere and volume renderers use a MIMD approach. Implementations for these algorithms are presented for the Thinking Machines Corporation CM-5 MPP.

  11. Applied Parallel Metadata Indexing

    SciTech Connect

    Jacobi, Michael R

    2012-08-01

    The GPFS Archive is a parallel archive used by hundreds of users in the Turquoise collaboration network. It houses 4+ petabytes of data in more than 170 million files. Currently, users must navigate the file system to retrieve their data, requiring them to remember file paths and names. A better solution might allow users to tag data with meaningful labels and search the archive using standard and user-defined metadata, while maintaining security. Last summer, the author developed the backend to a tool that adheres to these design goals. The backend works by importing GPFS metadata into a MongoDB cluster, which is then indexed on each attribute. This summer, the author implemented security and developed the user interface for the search tool. To meet security requirements, each database table is associated with a single user, stores only records that the user may read, and requires a set of credentials to access. The interface to the search tool is implemented using FUSE (Filesystem in USErspace). FUSE is an intermediate layer that intercepts file system calls and allows the developer to redefine how those calls behave. In the case of this tool, FUSE interfaces with MongoDB to issue queries and populate output. A FUSE implementation is desirable because it allows users to interact with the search tool using commands they are already familiar with. These security and interface additions are essential for a usable product.

  12. Massively Parallel QCD

    SciTech Connect

    Soltz, R; Vranas, P; Blumrich, M; Chen, D; Gara, A; Giampap, M; Heidelberger, P; Salapura, V; Sexton, J; Bhanot, G

    2007-04-11

    The theory of the strong nuclear force, Quantum Chromodynamics (QCD), can be numerically simulated from first principles on massively-parallel supercomputers using the method of Lattice Gauge Theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures that it suggests. We demonstrate these methods on the BlueGene massively-parallel supercomputer and argue that LQCD and the BlueGene architecture are a natural match. This can be traced to the simple fact that LQCD is a regular lattice discretization of space into lattice sites while the BlueGene supercomputer is a discretization of space into compute nodes, and that both are constrained by requirements of locality. This simple relation is both technologically important and theoretically intriguing. The main result of this paper is the speedup of LQCD using up to 131,072 CPUs on the largest BlueGene/L supercomputer. The speedup is perfect with sustained performance of about 20% of peak. This corresponds to a maximum of 70.5 sustained TFlop/s. At these speeds LQCD and BlueGene are poised to produce the next generation of strong interaction physics theoretical results.

  13. A comparison with theory of peak to peak sound level for a model helicopter rotor generating blade slap at low tip speeds

    NASA Technical Reports Server (NTRS)

    Fontana, R. R.; Hubbard, J. E., Jr.

    1983-01-01

    Mini-tuft and smoke flow visualization techniques have been developed for the investigation of model helicopter rotor blade-vortex interaction noise at low tip speeds. These techniques allow the parameters required for calculation of the blade-vortex interaction noise using the Widnall/Wolf model to be determined. The measured acoustics are compared with the predicted acoustics for each test condition. Under the conditions tested, it is determined that the dominating acoustic pulse results from the interaction of the blade with a vortex 1-1/4 revolutions old at an interaction angle of less than 8 deg. The Widnall/Wolf model predicts the peak sound pressure level (S.P.L.) within 3 dB for blade-vortex separation distances greater than 1 semichord, but it generally overpredicts the peak S.P.L. by over 10 dB for blade-vortex separation distances of less than 1/4 semichord.

  14. Computer-Aided Parallelizer and Optimizer

    NASA Technical Reports Server (NTRS)

    Jin, Haoqiang

    2011-01-01

    The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.

  15. Programming parallel processors

    SciTech Connect

    Babb, R.G. II

    1987-01-01

    This book surveys the major commercially available, scientific parallel computers with emphasis on how they are programmed. For each machine, the way in which parallel performance can be assessed is shown for the same, small example program. A wide range of parallel machines is covered, from superminis to parallel vector supercomputers, including both shared memory and message-passing machines. Topics covered include: exploiting multiprocessors: issues and options; Alliant FX/8; BBN Butterfly Parallel Processor; CRAY X-MP; FPS T Series Parallel Processor; IBM 3090; Intel iPSC Concurrent Computer; Loral Dataflo LDF 100; and Sequent Balance Series.

  16. Modeling the Backscatter and Transmitted Light of High Power Smoothed Beams with pF3D, a Massively Parallel Laser Plasma Interaction Code

    SciTech Connect

    Berger, R.L.; Divol, L.; Glenzer, S.; Hinkel, D.E.; Kirkwood, R.K.; Langdon, A.B.; Moody, J.D.; Still, C.H.; Suter, L.; Williams, E.A.; Young, P.E.

    2000-06-01

    Using the three-dimensional wave propagation code F3D [Berger et al., Phys. Fluids B 5, 2243 (1993); Berger et al., Phys. Plasmas 5, 4337 (1998)] and the massively parallel version pF3D [Still et al., Phys. Plasmas 7 (2000)], we have computed the transmitted and reflected light for laser and plasma conditions in experiments that simulated ignition hohlraum conditions. The frequency spectrum and the wavenumber spectrum of the transmitted light are calculated and used to identify the relative contributions of stimulated forward Brillouin scattering and self-focusing in hydrocarbon-filled balloons, commonly called gasbags. The effect of beam smoothing, smoothing by spectral dispersion (SSD) and polarization smoothing (PS), on the stimulated Brillouin backscatter (SBS) from Scale-1 NOVA hohlraums was simulated with the use of nonlinear saturation models that limit the amplitude of the driven acoustic waves. Other experiments on CO2 gasbags simultaneously measure, at a range of intensities, the SBS reflectivity and the Thomson scatter from the SBS-driven acoustic waves, which provide a more detailed test of the modeling. These calculations also predict that the backscattered light will be very nonuniform in the nearfield (the focusing system optics), which is important for specifying the backscatter intensities that can be tolerated by the National Ignition Facility laser system.

  17. Parallel flow diffusion battery

    DOEpatents

    Yeh, H.C.; Cheng, Y.S.

    1984-01-01

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  18. Parallel flow diffusion battery

    DOEpatents

    Yeh, Hsu-Chi (Albuquerque, NM); Cheng, Yung-Sung (Albuquerque, NM)

    1984-08-07

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  19. Parallel computations and control of adaptive structures

    NASA Technical Reports Server (NTRS)

    Park, K. C.; Alvin, Kenneth F.; Belvin, W. Keith; Chong, K. P. (Editor); Liu, S. C. (Editor); Li, J. C. (Editor)

    1991-01-01

    The equations of motion for structures with adaptive elements for vibration control are presented for parallel computations to be used as a software package for real-time control of flexible space structures. A brief introduction to state-of-the-art parallel computational capability is also presented. Time marching strategies are developed for an effective use of massive parallel mapping, partitioning, and the necessary arithmetic operations. An example is offered for the simulation of control-structure interaction on a parallel computer, and the impact of the approach presented for applications in disciplines other than the aerospace industry is assessed.

  20. Parallel processing ITS

    SciTech Connect

    Fan, W.C.; Halbleib, J.A. Sr.

    1996-09-01

    This report provides a users' guide for parallel processing ITS on a UNIX workstation network, a shared-memory multiprocessor or a massively-parallel processor. The parallelized version of ITS is based on a master/slave model with message passing. Parallel issues such as random number generation, load balancing, and communication software are briefly discussed. Timing results for example problems are presented for demonstration purposes.

  1. Wave-particle interactions with parallel whistler waves: Nonlinear and time-dependent effects revealed by particle-in-cell simulations

    NASA Astrophysics Data System (ADS)

    Camporeale, Enrico; Zimbardo, Gaetano

    2015-09-01

    We present a self-consistent Particle-in-Cell (PIC) simulation of the resonant interactions between anisotropic energetic electrons and a population of whistler waves, with parameters relevant to the Earth's radiation belt. By tracking PIC particles and comparing with test-particle simulations, we emphasize the importance of including nonlinear effects and time evolution in the modeling of wave-particle interactions, which are excluded in the resonant limit of quasi-linear theory routinely used in radiation belt studies. In particular, we show that pitch angle diffusion is enhanced during the linear growth phase, and it rapidly saturates well before a single bounce period. This calls into question the widely used bounce average performed in most radiation belt diffusion calculations. Furthermore, we discuss how the saturation is related to the fact that the domain in which the particles' pitch angle diffuses is bounded, and to the well-known problem of the 90° diffusion barrier.

  2. Parallel program design

    SciTech Connect

    Chandy, K.M.; Misra, J.

    1989-01-01

    The main theme of this book is that to program parallel computers, you need to understand how to program any computer well, that is, independently of any specific architecture. It considers a wide spectrum of computer architectures, and develops parallel programs for a variety of problems. This book is a statement of unique and important ideas necessary for understanding parallel programs.

  3. Parallel simulation today

    NASA Technical Reports Server (NTRS)

    Nicol, David; Fujimoto, Richard

    1992-01-01

    This paper surveys topics that presently define the state of the art in parallel simulation. Included in the tutorial are discussions on new protocols, mathematical performance analysis, time parallelism, hardware support for parallel simulation, load balancing algorithms, and dynamic memory management for optimistic synchronization.

  4. Parallel algorithm development

    SciTech Connect

    Adams, T.F.

    1996-06-01

    Rapid changes in parallel computing technology are causing significant changes in the strategies being used for parallel algorithm development. One approach is simply to write computer code in a standard language like FORTRAN 77 or with the expectation that the compiler will produce executable code that will run in parallel. The alternatives are: (1) to build explicit message passing directly into the source code; or (2) to write source code without explicit reference to message passing or parallelism, but use a general communications library to provide efficient parallel execution. Application of these strategies is illustrated with examples of codes currently under development.

  5. Visualization and Tracking of Parallel CFD Simulations

    NASA Technical Reports Server (NTRS)

    Vaziri, Arsi; Kremenetsky, Mark

    1995-01-01

    We describe a system for interactive visualization and tracking of a 3-D unsteady computational fluid dynamics (CFD) simulation on a parallel computer. CM/AVS, a distributed, parallel implementation of a visualization environment (AVS), runs on the CM-5 parallel supercomputer. A CFD solver is run as a CM/AVS module on the CM-5. Data communication between the solver, other parallel visualization modules, and a graphics workstation, which is running AVS, is handled by CM/AVS. Partitioning of the visualization task between the CM-5 and the workstation can be done interactively in the visual programming environment provided by AVS. Flow solver parameters can also be altered by programmable interactive widgets. This system partially removes the requirement of storing large solution files at frequent time steps, a characteristic of the traditional 'simulate → store → visualize' post-processing approach.

  6. Genome-Wide Fitness and Genetic Interactions Determined by Tn-seq, a High-Throughput Massively Parallel Sequencing Method for Microorganisms

    PubMed Central

    van Opijnen, Tim; Lazinski, David W.; Camilli, Andrew

    2015-01-01

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobials and vaccines. Here we present the method Tn-seq, with which it has become possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies in the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated transposon insertion library. After library selection, changes in frequency of each insertion mutant are determined by sequencing of the flanking regions en masse. These changes are used to calculate each mutant's fitness. The method was originally developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis, but has now been applied to several different microbial species. PMID:24733243

  7. Genome-Wide Fitness and Genetic Interactions Determined by Tn-seq, a High-Throughput Massively Parallel Sequencing Method for Microorganisms

    PubMed Central

    van Opijnen, Tim; Camilli, Andrew

    2013-01-01

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobials and vaccines. Here we present the method Tn-seq, with which it has become possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies in the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated Mariner transposon insertion library. After library selection, changes in frequency of each insertion mutant are determined by sequencing of the flanking regions en masse. These changes are used to calculate each mutant’s fitness. The method has been developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis; however, due to the wide activity of the Mariner transposon, Tn-seq can be applied to many different microbial species. PMID:21053251

  8. Genome-Wide Fitness and Genetic Interactions Determined by Tn-seq, a High-Throughput Massively Parallel Sequencing Method for Microorganisms.

    PubMed

    van Opijnen, Tim; Lazinski, David W; Camilli, Andrew

    2015-01-01

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobial agents and vaccines. This unit presents Tn-seq, a method that has made it possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies on the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated transposon insertion library. After library selection, changes in the frequency of each insertion mutant are determined by sequencing flanking regions en masse. These changes are used to calculate each mutant's fitness. The method was originally developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis, but has now been applied to several different microbial species. © 2015 by John Wiley & Sons, Inc. PMID:25641100

  9. Genome-Wide Fitness and Genetic Interactions Determined by Tn-seq, a High-Throughput Massively Parallel Sequencing Method for Microorganisms

    PubMed Central

    van Opijnen, Tim; Lazinski, David W.; Camilli, Andrew

    2015-01-01

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobial agents and vaccines. This unit presents Tn-seq, a method that has made it possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies on the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated transposon insertion library. After library selection, changes in the frequency of each insertion mutant are determined by sequencing flanking regions en masse. These changes are used to calculate each mutant’s fitness. The method was originally developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis, but has now been applied to several different microbial species. PMID:25641100

  10. Parallel digital forensics infrastructure.

    SciTech Connect

    Liebrock, Lorie M.; Duggan, David Patrick

    2009-10-01

    This report documents the architecture and implementation of a Parallel Digital Forensics infrastructure. This infrastructure is necessary for supporting the design, implementation, and testing of new classes of parallel digital forensics tools. Digital Forensics has become extremely difficult with data sets of one terabyte and larger. The only way to overcome the processing time of these large sets is to identify and develop new parallel algorithms for performing the analysis. To support algorithm research, a flexible base infrastructure is required. A candidate architecture for this base infrastructure was designed, instantiated, and tested by this project, in collaboration with New Mexico Tech. Previous infrastructures were not designed and built specifically for the development and testing of parallel algorithms. With the size of forensics data sets only expected to increase significantly, this type of infrastructure support is necessary for continued research in parallel digital forensics. This report documents the architecture and implementation of the parallel digital forensics (PDF) infrastructure.

  11. Application Portable Parallel Library

    NASA Technical Reports Server (NTRS)

    Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott

    1995-01-01

    Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also includes heterogeneous collection of networked computers.) Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.

  12. PCLIPS: Parallel CLIPS

    NASA Technical Reports Server (NTRS)

    Hall, Lawrence O.; Bennett, Bonnie H.; Tello, Ivan

    1994-01-01

    A parallel version of CLIPS 5.1 has been developed to run on Intel Hypercubes. The user interface is the same as that for CLIPS with some added commands to allow for parallel calls. A complete version of CLIPS runs on each node of the hypercube. The system has been instrumented to display the time spent in the match, recognize, and act cycles on each node. Only rule-level parallelism is supported. Parallel commands enable the assertion and retraction of facts to/from remote nodes' working memory. Parallel CLIPS was used to implement a knowledge-based command, control, communications, and intelligence (C³I) system to demonstrate the fusion of high-level, disparate sources. We discuss the nature of the information fusion problem, our approach, and implementation. Parallel CLIPS has also been used to run several benchmark parallel knowledge bases such as one to set up a cafeteria. Results from running Parallel CLIPS with parallel knowledge base partitions indicate that significant speed increases, including superlinear in some cases, are possible.

  13. Linked-View Parallel Coordinate Plot Renderer

    Energy Science and Technology Software Center (ESTSC)

    2011-06-28

    This software allows multiple linked views for interactive querying via map-based data selection, bar chart analytic overlays, and high dynamic range (HDR) line renderings. The major component of the visualization package is a parallel coordinate renderer with binning, curved layouts, shader-based rendering, and other techniques to allow interactive visualization of multidimensional data.

  14. Eclipse Parallel Tools Platform

    Energy Science and Technology Software Center (ESTSC)

    2005-02-18

    Designing and developing parallel programs is an inherently complex task. Developers must choose from the many parallel architectures and programming paradigms that are available, and face a plethora of tools that are required to execute, debug, and analyze parallel programs in these environments. Few, if any, of these tools provide any degree of integration, or indeed any commonality in their user interfaces at all. This further complicates the parallel developer's task, hampering software engineering practices, and ultimately reducing productivity. One consequence of this complexity is that best practice in parallel application development has not advanced to the same degree as more traditional programming methodologies. The result is that there is currently no open-source, industry-strength platform that provides a highly integrated environment specifically designed for parallel application development. Eclipse is a universal tool-hosting platform that is designed to provide a robust, full-featured, commercial-quality, industry platform for the development of highly integrated tools. It provides a wide range of core services for tool integration that allow tool producers to concentrate on their tool technology rather than on platform specific issues. The Eclipse Integrated Development Environment is an open-source project that is supported by over 70 organizations, including IBM, Intel and HP. The Eclipse Parallel Tools Platform (PTP) plug-in extends the Eclipse framework by providing support for a rich set of parallel programming languages and paradigms, and a core infrastructure for the integration of a wide variety of parallel tools.
The first version of the PTP is a prototype that only provides minimal functionality for parallel tool integration, support for a small number of parallel architectures, and basic Fortran integration. Future versions will extend the functionality substantially, provide a number of core parallel tools, and provide support across a wide range of parallel architectures and languages.

  15. Advanced parallel processing with supercomputer architectures

    SciTech Connect

    Hwang, K.

    1987-10-01

    This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping parallel algorithms, operating system functions, application library, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potentials of optical and neural technologies for developing future supercomputers.

  16. Parallel Lisp simulator

    SciTech Connect

    Weening, J.S.

    1988-05-01

    CSIM is a simulator for parallel Lisp, based on a continuation passing interpreter. It models a shared-memory multiprocessor executing programs written in Common Lisp, extended with several primitives for creating and controlling processes. This paper describes the structure of the simulator, measures its performance, and gives an example of its use with a parallel Lisp program.

  17. Mirror versus parallel bimanual reaching

    PubMed Central

    2013-01-01

    Background In spite of their importance to everyday function, tasks that require both hands to work together such as lifting and carrying large objects have not been well studied and the full potential of how new technology might facilitate recovery remains unknown. Methods To help identify the best modes for self-teleoperated bimanual training, we used an advanced haptic/graphic environment to compare several modes of practice. In a 2-by-2 study, we compared mirror vs. parallel reaching movements, and also compared veridical display to one that transforms the right hand’s cursor to the opposite side, reducing the area that the visual system has to monitor. Twenty healthy, right-handed subjects (5 in each group) practiced 200 movements. We hypothesized that parallel reaching movements would be the best performing, and attending to one visual area would reduce the task difficulty. Results The two-way comparison revealed that mirror movement times took an average 1.24 s longer to complete than parallel. Surprisingly, subjects’ movement times moving to one target (attending to one visual area) also took an average of 1.66 s longer than subjects moving to two targets. For both hands, there was also a significant interaction effect, revealing the lowest errors for parallel movements moving to two targets (p < 0.001). This was the only group that began and maintained low errors throughout training. Conclusion Combined with other evidence, these results suggest that the most intuitive reaching performance can be observed with parallel movements with a veridical display (moving to two separate targets). These results point to the expected levels of challenge for these bimanual training modes, which could be used to advise therapy choices in self-neurorehabilitation. PMID:23837908

  18. Totally parallel multilevel algorithms

    NASA Technical Reports Server (NTRS)

    Frederickson, Paul O.

    1988-01-01

    Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which is referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.

  19. Massively parallel mathematical sieves

    SciTech Connect

    Montry, G.R.

    1989-01-01

    The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
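
    The decomposition idea can be illustrated with a small sketch. This is not the scheme from the report (which uses scattered decomposition on a hypercube); it is a plain segmented sieve with contiguous blocks, written under the assumption that each block would be handed to a separate processor. The function names are hypothetical.

```python
from math import isqrt


def base_primes(limit):
    """Serial Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            # Strike out the multiples of p, starting at p*p.
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytearray(count)
    return [i for i, flag in enumerate(is_prime) if flag]


def sieve_segment(lo, hi, primes):
    """Primes in [lo, hi), given every prime p with p*p < hi.

    The segment needs no data from any other segment, so segments
    can be processed independently.
    """
    flags = bytearray([1]) * (hi - lo)
    for p in primes:
        if p * p >= hi:
            break
        # First multiple of p inside the segment, but never below p*p.
        start = max(p * p, ((lo + p - 1) // p) * p)
        for m in range(start, hi, p):
            flags[m - lo] = 0
    return [lo + i for i, f in enumerate(flags) if f and lo + i >= 2]


def segmented_primes(n, workers=4):
    """All primes < n, with [0, n) split into `workers` contiguous blocks."""
    primes = base_primes(isqrt(max(n - 1, 0)))
    bounds = [n * k // workers for k in range(workers + 1)]
    result = []
    for k in range(workers):
        # Each iteration is independent work for one (hypothetical) processor.
        result.extend(sieve_segment(bounds[k], bounds[k + 1], primes))
    return result
```

    For scale, the speedups quoted in the abstract on 1,024 processing elements correspond to parallel efficiencies of roughly 800/1024 ≈ 0.78 for the fixed-size problem and 980/1024 ≈ 0.96 when the problem size per processor is fixed.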

  20. Parallel computing works

    SciTech Connect

    Not Available

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C³P), a five-year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  1. Introduction to the Poker Parallel Programming Environment. Interim technical report

    SciTech Connect

    Snyder, L.

    1983-08-01

    The Poker Parallel Programming Environment is a graphics-based, interactive system for programming the Configurable, High Parallel (CHiP) Computer. Designed to support nearly all aspects of parallel programming in one integrated system, Poker has been implemented as a (35,000 line) C program on the VAX 11/780 under UNIX. It provides a number of novel features including graphics programming of parallel processor communication.

  2. Synchronization Of Parallel Discrete Event Simulations

    NASA Technical Reports Server (NTRS)

    Steinman, Jeffrey S.

    1992-01-01

    Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.

  3. Parallel Adaptive Mesh Refinement

    SciTech Connect

    Diachin, L; Hornung, R; Plassmann, P; WIssink, A

    2005-03-04

    As large-scale, parallel computers have become more widely available and numerical models and algorithms have advanced, the range of physical phenomena that can be simulated has expanded dramatically. Many important science and engineering problems exhibit solutions with localized behavior where highly-detailed salient features or large gradients appear in certain regions which are separated by much larger regions where the solution is smooth. Examples include chemically-reacting flows with radiative heat transfer, high Reynolds number flows interacting with solid objects, and combustion problems where the flame front is essentially a two-dimensional sheet occupying a small part of a three-dimensional domain. Modeling such problems numerically requires approximating the governing partial differential equations on a discrete domain, or grid. Grid spacing is an important factor in determining the accuracy and cost of a computation. A fine grid may be needed to resolve key local features while a much coarser grid may suffice elsewhere. Employing a fine grid everywhere may be inefficient at best and, at worst, may make an adequately resolved simulation impractical. Moreover, the location and resolution of the fine grid required for an accurate solution is a dynamic property of a problem's transient features and may not be known a priori. Adaptive mesh refinement (AMR) is a technique that can be used with both structured and unstructured meshes to adjust local grid spacing dynamically to capture solution features with an appropriate degree of resolution. Thus, computational resources can be focused where and when they are needed most to efficiently achieve an accurate solution without incurring the cost of a globally-fine grid. Figure 1.1 shows two example computations using AMR; on the left is a structured mesh calculation of an impulsively-sheared contact surface and on the right is the fuselage and volume discretization of an RAH-66 Comanche helicopter [35]. 
Note the ability of both meshing methods to resolve simulation details by varying the local grid spacing.
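
    A toy one-dimensional sketch may make the refinement idea concrete. It is only an illustration under simple assumptions: a cell is flagged when the jump of a known function across it exceeds a tolerance, and flagged cells are bisected. Real AMR codes estimate errors from the evolving solution and manage patch hierarchies in parallel; the names `refine_1d`, `adapt`, and `steep` are hypothetical.

```python
import math


def refine_1d(edges, f, tol):
    """Bisect every cell whose jump |f(right) - f(left)| exceeds tol."""
    new_edges = [edges[0]]
    for left, right in zip(edges, edges[1:]):
        if abs(f(right) - f(left)) > tol:
            new_edges.append(0.5 * (left + right))  # insert the midpoint
        new_edges.append(right)
    return new_edges


def adapt(f, a, b, n_coarse, tol, max_levels):
    """Start from a uniform coarse grid and refine up to max_levels times."""
    edges = [a + (b - a) * i / n_coarse for i in range(n_coarse + 1)]
    for _ in range(max_levels):
        refined = refine_1d(edges, f, tol)
        if len(refined) == len(edges):  # nothing was flagged; stop early
            break
        edges = refined
    return edges


def steep(x):
    """Smooth almost everywhere, with a sharp front at x = 0.5."""
    return math.tanh(50.0 * (x - 0.5))


grid = adapt(steep, 0.0, 1.0, n_coarse=10, tol=0.1, max_levels=6)
widths = [right - left for left, right in zip(grid, grid[1:])]
```

    In this sketch the cells far from the front keep the coarse spacing of 0.1, while the cells near x = 0.5 are repeatedly halved, so the fine spacing is paid for only where the function changes rapidly.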

  4. Parallel nearest neighbor calculations

    NASA Astrophysics Data System (ADS)

    Trease, Harold

    We are just starting to parallelize the nearest neighbor portion of our free-Lagrange code. Our implementation of the nearest neighbor reconnection algorithm has not been parallelizable (i.e., we just flip one connection at a time). In this paper we consider what sort of nearest neighbor algorithms lend themselves to being parallelized. For example, the construction of the Voronoi mesh can be parallelized, but the construction of the Delaunay mesh (dual to the Voronoi mesh) cannot because of degenerate connections. We will show our most recent attempt to tessellate space with triangles or tetrahedrons with a new nearest neighbor construction algorithm called DAM (Dial-A-Mesh). This method has the characteristics of a parallel algorithm and produces a better tessellation of space than the Delaunay mesh. Parallel processing is becoming an everyday reality for us at Los Alamos. Our current production machines are Cray YMPs with 8 processors that can run independently or combined to work on one job. We are also exploring massive parallelism through the use of two 64K processor Connection Machines (CM2), where all the processors run in lock step mode. The effective application of 3-D computer models requires the use of parallel processing to achieve reasonable "turn around" times for our calculations.

  5. Bilingual parallel programming

    SciTech Connect

    Foster, I.; Overbeek, R.

    1990-01-01

    Numerous experiments have demonstrated that computationally intensive algorithms support adequate parallelism to exploit the potential of large parallel machines. Yet successful parallel implementations of serious applications are rare. The limiting factor is clearly programming technology. None of the approaches to parallel programming that have been proposed to date -- whether parallelizing compilers, language extensions, or new concurrent languages -- seem to adequately address the central problems of portability, expressiveness, efficiency, and compatibility with existing software. In this paper, we advocate an alternative approach to parallel programming based on what we call bilingual programming. We present evidence that this approach provides an effective solution to parallel programming problems. The key idea in bilingual programming is to construct the upper levels of applications in a high-level language while coding selected low-level components in low-level languages. This approach permits the advantages of a high-level notation (expressiveness, elegance, conciseness) to be obtained without the cost in performance normally associated with high-level approaches. In addition, it provides a natural framework for reusing existing code.

  6. The NAS parallel benchmarks

    NASA Technical Reports Server (NTRS)

    Bailey, David (Editor); Barton, John (Editor); Lasinski, Thomas (Editor); Simon, Horst (Editor)

    1993-01-01

    A new set of benchmarks was developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of a set of kernels, the 'Parallel Kernels,' and a simulated application benchmark. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification - all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.

  7. Radiative Heat Transfer in Combustion Applications: Parallel Efficiencies of Two Gas Models, Turbulent Radiation Interactions in Particulate Laden Flows, and Coarse Mesh Finite Difference Acceleration for Improved Temporal Accuracy

    NASA Astrophysics Data System (ADS)

    Cleveland, Mathew A.

    We investigate several aspects of the numerical solution of the radiative transfer equation in the context of coal combustion: the parallel efficiency of two commonly-used opacity models, the sensitivity of turbulent radiation interaction (TRI) effects to the presence of coal particulate, and an improvement of the order of temporal convergence using the coarse mesh finite difference (CMFD) method. There are four opacity models commonly employed to evaluate the radiative transfer equation in combustion applications: line-by-line (LBL), multigroup, band, and global. Most of these models have been rigorously evaluated for serial computations of a spectrum of problem types [1]. Studies of these models for parallel computations [2] are limited. We assessed the performance of the Spectral-Line-Based weighted sum of gray gases (SLW) model, a global method related to K-distribution methods [1], and the LBL model. The LBL model directly interpolates opacity information from large data tables. The LBL model outperforms the SLW model in almost all cases, as suggested by Wang et al. [3]. The SLW model, however, shows superior parallel scaling performance and a decreased sensitivity to load imbalancing, suggesting that for some problems, global methods such as the SLW model could outperform the LBL model. Turbulent radiation interaction (TRI) effects are associated with the differences in the time scales of the fluid dynamic equations and the radiative transfer equations. Solving on the fluid dynamic time step size produces large changes in the radiation field over the time step. We have modified the statistically homogeneous, non-premixed flame problem of Deshmukh et al. [4] to include coal-type particulate. The addition of low mass loadings of particulate minimally impacts the TRI effects. 
Observed differences in the TRI effects from variations in the packing fractions and Stokes numbers are difficult to analyze because of the significant effect of variations in problem initialization. The TRI effects are very sensitive to the initialization of the turbulence in the system. The TRI parameters are somewhat sensitive to the treatment of particulate temperature and the particulate optical thickness, and this effect is amplified by increased particulate loading. Monte Carlo radiative heat transfer simulations of time-dependent combustion processes generally involve an explicit evaluation of emission source because of the expense of the transport solver. Recently, Park et al. [5] have applied quasi-diffusion with Monte Carlo in high energy density radiative transfer applications. We employ a Crank-Nicolson temporal integration scheme in conjunction with the coarse mesh finite difference (CMFD) method, in an effort to improve the temporal accuracy of the Monte Carlo solver. Our results show that this CMFD-CN method is an improvement over Monte Carlo with CMFD time-differenced via Backward Euler, and Implicit Monte Carlo [6] (IMC). The increase in accuracy involves very little increase in computational cost, and the figure of merit for the CMFD-CN scheme is greater than IMC.
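
    The difference in temporal order between Backward Euler and Crank-Nicolson can be seen on a scalar model problem. The sketch below is not the CMFD or Monte Carlo solver described in the abstract; it assumes the decay equation du/dt = -λu, whose exact solution is known, and measures how the error shrinks when the time step is halved. All function names are hypothetical.

```python
import math


def integrate(lam, u0, t_end, steps, scheme):
    """Advance du/dt = -lam*u from t=0 to t_end with a fixed time step."""
    dt = t_end / steps
    if scheme == "backward_euler":
        # (u_new - u_old)/dt = -lam*u_new
        factor = 1.0 / (1.0 + lam * dt)
    elif scheme == "crank_nicolson":
        # (u_new - u_old)/dt = -lam*(u_new + u_old)/2
        factor = (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    u = u0
    for _ in range(steps):
        u *= factor
    return u


def error(scheme, steps, lam=1.0, u0=1.0, t_end=1.0):
    """Absolute error at t_end against the exact solution u0*exp(-lam*t)."""
    exact = u0 * math.exp(-lam * t_end)
    return abs(integrate(lam, u0, t_end, steps, scheme) - exact)


def observed_order(scheme, steps=50):
    """log2 of the error ratio when the step count is doubled."""
    return math.log2(error(scheme, steps) / error(scheme, 2 * steps))
```

    On this model problem `observed_order("backward_euler")` comes out close to 1 and `observed_order("crank_nicolson")` close to 2, which is the sense in which a Crank-Nicolson discretization improves the order of temporal convergence.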

  8. Simplified Parallel Domain Traversal

    SciTech Connect

    Erickson III, David J

    2011-01-01

    Many data-intensive scientific analysis techniques require global domain traversal, which over the years has been a bottleneck for efficient parallelization across distributed-memory architectures. Inspired by MapReduce and other simplified parallel programming approaches, we have designed DStep, a flexible system that greatly simplifies efficient parallelization of domain traversal techniques at scale. In order to deliver both simplicity to users as well as scalability on HPC platforms, we introduce a novel two-tiered communication architecture for managing and exploiting asynchronous communication loads. We also integrate our design with advanced parallel I/O techniques that operate directly on native simulation output. We demonstrate DStep by performing teleconnection analysis across ensemble runs of terascale atmospheric CO₂ and climate data, and we show scalability results on up to 65,536 IBM BlueGene/P cores.

  9. The Parallel Axiom

    ERIC Educational Resources Information Center

    Rogers, Pat

    1972-01-01

    Criteria for a reasonable axiomatic system are discussed. A discussion of the historical attempts to prove the independence of Euclid's parallel postulate introduces non-Euclidean geometries. Poincare's model for a non-Euclidean geometry is defined and analyzed. (LS)

  10. Parallel programming with PCN

    SciTech Connect

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A).

  11. UCLA Parallel PIC Framework

    NASA Astrophysics Data System (ADS)

    Decyk, Viktor K.; Norton, Charles D.

    2004-12-01

    The UCLA Parallel PIC Framework (UPIC) has been developed to provide trusted components for the rapid construction of new, parallel Particle-in-Cell (PIC) codes. The Framework uses object-based ideas in Fortran95, and is designed to provide support for various kinds of PIC codes on various kinds of hardware. The focus is on student programmers. The Framework supports multiple numerical methods, different physics approximations, different numerical optimizations and implementations for different hardware. It is designed with "defensive" programming in mind, meaning that it contains many error checks and debugging helps. Above all, it is designed to hide the complexity of parallel processing. It is currently being used in a number of new Parallel PIC codes.

  12. Parallels with nature

    NASA Astrophysics Data System (ADS)

    2014-10-01

    Adam Nelson and Stuart Warriner, from the University of Leeds, talk with Nature Chemistry about their work to develop viable synthetic strategies for preparing new chemical structures in parallel with the identification of desirable biological activity.

  13. Scalable parallel communications

    NASA Technical Reports Server (NTRS)

    Maly, K.; Khanna, S.; Overstreet, C. M.; Mukkamala, R.; Zubair, M.; Sekhar, Y. S.; Foudriat, E. C.

    1992-01-01

    Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., effect of timeouts), we have performed detailed simulation studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near linear in n, the number of protocol processors, and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low cost solution to providing multiple 100 Mbps on current machines. 
In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services and the use of space division multiplexing for the physical channels can provide better reliability than monolithic approaches (it also provides graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users, other TCPs running in parallel provide high bandwidth service to a single application); and (3) coarse-grain parallelism will be able to incorporate many future improvements from related work (e.g., reduced data movement, fast TCP, fine-grain parallelism) also with near linear speed-ups.

  14. Parallel DC notch filter

    NASA Astrophysics Data System (ADS)

    Kwok, Kam-Cheung; Chan, Ming-Kam

    1991-12-01

    In the process of image acquisition, the object of interest may not be evenly illuminated, so an image with shading irregularities would be produced. This type of image is very difficult to analyze. Consequently, a lot of research work concentrates on this problem. One method of removing the illumination problem is to filter the image. The dc notch filter is one of the spatial domain filters used for reducing the effect of uneven light illumination on the image. Although the dc notch filter is a spatial domain filter, it is still rather time consuming to apply, especially when it is implemented on a microcomputer. To overcome the speed problem, a parallel dc notch filter is proposed. Based on the separability of the dc notch filter algorithm, image parallelism (parallel image processing model) is used. To improve the performance of the microcomputer, an INMOS IMS B008 Module Mother Board with four IMS T800-17 is installed in the microcomputer. In fact, the dc notch filter is implemented on the transputer network. This parallel dc notch filter creates a great improvement in the computation time of the filter in comparison with the sequential one. Furthermore, the speed-up is used to analyze the performance of the parallel algorithm. As a result, parallel implementation of the dc notch filter on a transputer network gives real-time performance of this filter.
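
    A minimal sketch of the separability argument follows. The abstract does not define the filter, so the sketch assumes one common formulation: each output pixel is the input pixel minus the mean of a (2r+1)-by-(2r+1) neighbourhood, with the window truncated at the image border. Under that assumption the local mean splits into a row pass followed by a column pass, and every row (or column) can be processed independently, which is what makes the work easy to divide among processors. All names are hypothetical.

```python
def box_mean_1d(values, radius):
    """Mean over a window of up to 2*radius+1 samples, truncated at the ends."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def transpose(image):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*image)]


def local_mean(image, radius):
    """Separable local mean: 1-D pass along rows, then along columns."""
    # Each row is independent of the others (parallel across rows).
    rows = [box_mean_1d(row, radius) for row in image]
    # Each column is independent of the others (parallel across columns).
    cols = [box_mean_1d(col, radius) for col in transpose(rows)]
    return transpose(cols)


def dc_notch(image, radius):
    """Subtract the local mean (the local dc level) from every pixel."""
    mean = local_mean(image, radius)
    return [
        [pixel - m for pixel, m in zip(image_row, mean_row)]
        for image_row, mean_row in zip(image, mean)
    ]
```

    With this definition a uniformly lit image maps to zero everywhere, and a slowly varying illumination ramp is removed in the image interior while isolated detail is kept.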

  15. Code Parallelization with CAPO: A User Manual

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry; Biegel, Bryan (Technical Monitor)

    2001-01-01

    A software tool has been developed to assist the parallelization of scientific codes. This tool, CAPO, extends an existing parallelization toolkit, CAPTools developed at the University of Greenwich, to generate OpenMP parallel codes for shared memory architectures. This is an interactive toolkit to transform a serial Fortran application code to an equivalent parallel version of the software - in a small fraction of the time normally required for a manual parallelization. We first discuss the way in which loop types are categorized and how efficient OpenMP directives can be defined and inserted into the existing code using the in-depth interprocedural analysis. The use of the toolkit on a number of application codes ranging from benchmark to real-world application codes is presented. This will demonstrate the great potential of using the toolkit to quickly parallelize serial programs as well as the good performance achievable on a large number of processors. The second part of the document gives references to the parameters and the graphic user interface implemented in the toolkit. Finally a set of tutorials is included for hands-on experiences with this toolkit.

  16. Parallel time integration software

    Energy Science and Technology Software Center (ESTSC)

    2014-07-01

    This package implements an optimal-scaling multigrid solver for the (non)linear systems that arise from the discretization of problems with evolutionary behavior. Typically, solution algorithms for evolution equations are based on a time-marching approach, solving sequentially for one time step after the other. Parallelism in these traditional time-integration techniques is limited to spatial parallelism. However, current trends in computer architectures are leading towards systems with more, but not faster, processors. Therefore, faster compute speeds must come from greater parallelism. One approach to achieve parallelism in time is with multigrid, but extending classical multigrid methods for elliptic operators to this setting is a significant achievement. In this software, we implement a non-intrusive, optimal-scaling time-parallel method based on multigrid reduction techniques. The examples in the package demonstrate optimality of our multigrid-reduction-in-time algorithm (MGRIT) for solving a variety of parabolic equations in two and three spatial dimensions. These examples can also be used to show that MGRIT can achieve significant speedup in comparison to sequential time marching on modern architectures.

  17. Parallel time integration software

    SciTech Connect

    2014-07-01

    This package implements an optimal-scaling multigrid solver for the (non)linear systems that arise from the discretization of problems with evolutionary behavior. Typically, solution algorithms for evolution equations are based on a time-marching approach, solving sequentially for one time step after the other. Parallelism in these traditional time-integration techniques is limited to spatial parallelism. However, current trends in computer architectures are leading towards systems with more, but not faster, processors. Therefore, faster compute speeds must come from greater parallelism. One approach to achieve parallelism in time is with multigrid, but extending classical multigrid methods for elliptic operators to this setting is a significant achievement. In this software, we implement a non-intrusive, optimal-scaling time-parallel method based on multigrid reduction techniques. The examples in the package demonstrate optimality of our multigrid-reduction-in-time algorithm (MGRIT) for solving a variety of parabolic equations in two and three spatial dimensions. These examples can also be used to show that MGRIT can achieve significant speedup in comparison to sequential time marching on modern architectures.

  18. Parallel shear and turbulence

    NASA Astrophysics Data System (ADS)

    Hayes, Tiffany; Gilmore, Mark; Watts, Christopher; Xie, Shuangwei; Yan, Lincan

    2009-11-01

    Instabilities may be caused in plasma due to (shear) flow. These flows can be transverse or parallel to the magnetic field. Past work has generally focussed on controlling and understanding the processes that occur from (shear) flow transverse to the magnetic field. At UNM, experimental work is being performed in the HelCat device (Helicon Cathode) to control the parallel flow in order to study and understand the processes that arise from this situation. It is also our aim to be able to control the transverse flow simultaneously with, but independently of, the parallel flow. By inserting a system of biased rings and grids into the plasma we are able to modify the flows, and hence the turbulence. Flows are measured using a seven-tip Mach probe. Results of our ability to control the flows independently are presented.

  19. Parallel optical sampler

    SciTech Connect

    Tauke-Pedretti, Anna; Skogen, Erik J; Vawter, Gregory A

    2014-05-20

    An optical sampler includes first and second 1×n optical beam splitters splitting an input optical sampling signal and an optical analog input signal into n parallel channels, respectively, a plurality of optical delay elements providing n parallel delayed input optical sampling signals, n photodiodes converting the n parallel optical analog input signals into n respective electrical output signals, and n optical modulators modulating the input optical sampling signal or the optical analog input signal by the respective electrical output signals, and providing n successive optical samples of the optical analog input signal. A plurality of output photodiodes and eADCs convert the n successive optical samples to n successive digital samples. The optical modulator may be a photodiode interconnected Mach-Zehnder Modulator. A method of sampling the optical analog input signal is disclosed.

  20. Parallel channel flow excursions

    SciTech Connect

    Johnston, B.S.

    1990-01-01

    Among the many known types of vapor-liquid flow instability is the excursion which may occur in heated parallel channels. Under certain conditions, the pressure drop requirement in a heated channel may increase with decreases in flow rate. This leads to an excursive reduction in flow. For channels heated by electricity or nuclear fission, this can result in overheating and damage to the channel. In the design of any parallel channel device, flow excursion limits should be established. After a review of parallel channel behavior and analysis, a conservative criterion will be proposed for avoiding excursions. In support of this criterion, recent experimental work on boiling in downward flow will be described. 5 figs.

  1. Parallelism in System Tools

    SciTech Connect

    Matney, Sr., Kenneth D; Shipman, Galen M

    2010-01-01

    The Cray XT, when employed in conjunction with the Lustre filesystem, has provided the ability to generate huge amounts of data in the form of many files. Typically, this is accommodated by satisfying the requests of large numbers of Lustre clients in parallel. In contrast, a single service node (Lustre client) cannot adequately service such datasets. This means that the use of traditional UNIX tools like cp, tar, et alia (which have no parallel capability) can result in substantial impact to user productivity. For example, to copy a 10 TB dataset from the service node using cp would take about 24 hours, under more or less ideal conditions. During production operation, this could easily extend to 36 hours. In this paper, we introduce the Lustre User Toolkit for Cray XT, developed at the Oak Ridge Leadership Computing Facility (OLCF). We will show that Linux commands, implementing highly parallel I/O algorithms, provide orders of magnitude greater performance, greatly reducing impact to productivity.

  2. Adaptive parallel logic networks

    NASA Technical Reports Server (NTRS)

    Martinez, Tony R.; Vidal, Jacques J.

    1988-01-01

    Adaptive, self-organizing concurrent systems (ASOCS) that combine self-organization with massive parallelism for such applications as adaptive logic devices, robotics, process control, and system malfunction management, are presently discussed. In ASOCS, an adaptive network composed of many simple computing elements operating in combinational and asynchronous fashion is used and problems are specified by presenting if-then rules to the system in the form of Boolean conjunctions. During data processing, which is a different operational phase from adaptation, the network acts as a parallel hardware circuit.

  3. Coarrays for Parallel Processing

    NASA Technical Reports Server (NTRS)

    Snyder, W. Van

    2011-01-01

    The design of the Coarray feature of Fortran 2008 was guided by answering the question "What is the smallest change required to convert Fortran to a robust and efficient parallel language?" Two fundamental issues that any parallel programming model must address are work distribution and data distribution. In order to coordinate work distribution and data distribution, methods for communication and synchronization must be provided. Although originally designed for Fortran, the Coarray paradigm has stimulated development in other languages. X10, Chapel, UPC, Titanium, and class libraries being developed for C++ have the same conceptual framework.

  4. The NAS Parallel Benchmarks

    SciTech Connect

    Bailey, David H.

    2009-11-15

    The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, although the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, Leo Dagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was in computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems. As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage over vector supercomputers, and, if so, which of the parallel offerings would be most useful in real-world scientific computation. In part to draw attention to some of the performance reporting abuses prevalent at the time, the present author wrote a humorous essay 'Twelve Ways to Fool the Masses,' which described in a light-hearted way a number of the questionable ways in which both vendor marketing people and scientists were inflating and distorting their performance results. All of this underscored the need for an objective and scientifically defensible measure to compare performance on these systems.

  5. Speeding up parallel processing

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.

    1988-01-01

    In 1967 Amdahl expressed doubts about the ultimate utility of multiprocessors. The formulation, now called Amdahl's law, became part of the computing folklore and has inspired much skepticism about the ability of the current generation of massively parallel processors to efficiently deliver all their computing power to programs. The widely publicized recent results of a group at Sandia National Laboratory, which showed speedup on a 1024 node hypercube of over 500 for three fixed size problems and over 1000 for three scalable problems, have convincingly challenged this bit of folklore and have given new impetus to parallel scientific computing.
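
    For readers unfamiliar with the two viewpoints contrasted above, a minimal sketch follows. It computes the fixed-size (Amdahl) speedup 1/(s + (1 - s)/N) and the scaled (Gustafson-style) speedup s + (1 - s)N for a serial fraction s on N processors. The serial fractions used here are illustrative assumptions, not values reported by the Sandia group.

```python
def amdahl_speedup(serial_fraction: float, n_procs: int) -> float:
    """Fixed-size speedup: the serial part does not shrink as processors are added."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


def scaled_speedup(serial_fraction: float, n_procs: int) -> float:
    """Scaled speedup: the parallel part of the work grows with the machine size."""
    return serial_fraction + (1.0 - serial_fraction) * n_procs


if __name__ == "__main__":
    n = 1024  # node count mentioned in the abstract
    for s in (0.01, 0.001):  # assumed serial fractions, for illustration only
        print(f"s={s}: fixed-size {amdahl_speedup(s, n):.1f}, scaled {scaled_speedup(s, n):.1f}")
```

    Under these assumptions, a fixed-size speedup above 500 on 1024 nodes requires a serial fraction of roughly 0.1% or less, which gives a sense of how demanding such a result is.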

  6. VLSI and parallel computation

    SciTech Connect

    Suaya, R.; Birtwistle, G.

    1988-01-01

    This volume presents a cross-section of the most current research in parallel computation encompassing theoretical models, VLSI design, routing, and machine implementations. The book comprises a series of invited tutorial chapters on advanced topics in VLSI and concurrency. The chapters have been revised and updated to form a coherent volume exploring issues of fundamental importance in parallel computation, as well as significant research results in the contributors' specialties. Topics include load sharing models, PRAM models of computation, neural networks, cochlea models, the design of algorithms for explicit concurrency, and VLSI CAD.

  7. Deoxyribo Nanonucleic Acid: Antiparallel, Parallel and Unparalleled

    SciTech Connect

    Egli, M.

    2010-03-05

    The crystal structure of a single-stranded DNA oligonucleotide has revealed formation of a unique three-dimensional array by continuous antiparallel and parallel pairing between monomers. The array is based on tertiary interactions and represents a second-generation nanotechnological system.

  8. Parallel Molecular Dynamics Program for Molecules

    Energy Science and Technology Software Center (ESTSC)

    1995-03-07

    ParBond is a parallel classical molecular dynamics code that models bonded molecular systems, typically of an organic nature. It uses classical force fields for both non-bonded Coulombic and Van der Waals interactions and for 2-, 3-, and 4-body bonded (bond, angle, dihedral, and improper) interactions. It integrates Newton's equations of motion for the molecular system and evaluates various thermodynamical properties of the system as it progresses.

  9. Parallel hierarchical global illumination

    SciTech Connect

    Snell, Q.O.

    1997-10-08

    Solving the global illumination problem is equivalent to determining the intensity of every wavelength of light in all directions at every point in a given scene. The complexity of the problem has led researchers to use approximation methods for solving the problem on serial computers. Rather than using an approximation method, such as backward ray tracing or radiosity, the authors have chosen to solve the Rendering Equation by direct simulation of light transport from the light sources. This paper presents an algorithm that solves the Rendering Equation to any desired accuracy, and can be run in parallel on distributed memory or shared memory computer systems with excellent scaling properties. It appears superior in both speed and physical correctness to recent published methods involving bidirectional ray tracing or hybrid treatments of diffuse and specular surfaces. Like progressive radiosity methods, it dynamically refines the geometry decomposition where required, but does so without the excessive storage requirements for ray histories. The algorithm, called Photon, produces a scene which converges to the global illumination solution. This amounts to a huge task for a 1997-vintage serial computer, but using the power of a parallel supercomputer significantly reduces the time required to generate a solution. Currently, Photon can be run on most parallel environments from a shared memory multiprocessor to a parallel supercomputer, as well as on clusters of heterogeneous workstations.

  10. NAS Parallel Benchmarks Results

    NASA Technical Reports Server (NTRS)

    Subhash, Saini; Bailey, David H.; Lasinski, T. A. (Technical Monitor)

    1995-01-01

    The NAS Parallel Benchmarks (NPB) were developed in 1991 at NASA Ames Research Center to study the performance of parallel supercomputers. The eight benchmark problems are specified in a pencil-and-paper fashion, i.e., the complete details of the problem to be solved are given in a technical document and, except for a few restrictions, benchmarkers are free to select the language constructs and implementation techniques best suited for a particular system. In this paper, we present new NPB performance results for the following systems: (a) Parallel-Vector Processors: Cray C90, Cray T90 and Fujitsu VPP500; (b) Highly Parallel Processors: Cray T3D, IBM SP2 and IBM SP-TN2 (Thin Nodes 2); (c) Symmetric Multiprocessing Processors: Convex Exemplar SPP1000, Cray J90, DEC Alpha Server 8400 5/300, and SGI Power Challenge XL. We also present sustained performance per dollar for Class B LU, SP and BT benchmarks. We also mention NAS's future plans for NPB.

  11. Optimizing parallel reduction operations

    SciTech Connect

    Denton, S.M.

    1995-06-01

    A parallel program consists of sets of concurrent and sequential tasks. Often, a reduction (such as array sum) sequentially combines values produced by a parallel computation. Because reductions occur so frequently in otherwise parallel programs, they are good candidates for optimization. Since reductions may introduce dependencies, most languages separate computation and reduction. The Sisal functional language is unique in that reduction is a natural consequence of loop expressions; the parallelism is implicit in the language. Unfortunately, the original language supports only seven reduction operations. To generalize these expressions, the Sisal 90 definition adds user-defined reductions at the language level. Applicable optimizations depend upon the mathematical properties of the reduction. Compilation and execution speed, synchronization overhead, memory use and maximum size influence the final implementation. This paper (1) Defines reduction syntax and compares with traditional concurrent methods; (2) Defines classes of reduction operations; (3) Develops analysis of classes for optimized concurrency; (4) Incorporates reductions into Sisal 1.2 and Sisal 90; (5) Evaluates performance and size of the implementations.
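
    The dependence of the applicable optimizations on the mathematical properties of the reduction can be illustrated with a short sketch. It is written in Python rather than Sisal, and the function names are hypothetical. It assumes only that the operator is associative: each worker reduces one contiguous chunk, and the partial results are then combined in order, so commutativity is not required.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def chunked(values, n_chunks):
    """Split `values` into at most `n_chunks` contiguous slices."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_reduce(op, values, identity, n_workers=4):
    """Reduce `values` with an associative `op` by combining per-chunk partial results."""
    if not values:
        return identity
    chunks = chunked(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map returns results in input order, so chunk order is preserved.
        partials = list(pool.map(lambda c: reduce(op, c, identity), chunks))
    return reduce(op, partials, identity)


if __name__ == "__main__":
    print(parallel_reduce(operator.add, list(range(101)), 0))
    print(parallel_reduce(operator.add, ["a", "b", "c", "d", "e"], ""))
```

    The sketch shows the structure of the computation only; CPython threads do not speed up CPU-bound work, and a non-associative operator (subtraction, for instance) would give chunk-dependent results.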

  12. Parallel Total Energy

    Energy Science and Technology Software Center (ESTSC)

    2004-10-21

    This is a total energy electronic structure code using the Local Density Approximation (LDA) of density functional theory. It uses plane waves as the wave function basis set. It can use both norm-conserving pseudopotentials and ultrasoft pseudopotentials. It can relax the atomic positions according to the total energy. It is a parallel code using MPI.

  13. Parallel Multigrid Equation Solver

    Energy Science and Technology Software Center (ESTSC)

    2001-09-07

    Prometheus is a fully parallel multigrid equation solver for matrices that arise in unstructured grid finite element applications. It includes a geometric and an algebraic multigrid method and has solved linear elasticity problems of up to 76 million degrees of freedom on the ASCI Blue Pacific and ASCI Red machines.

  14. Parallel programming with PCN

    SciTech Connect

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  15. Parallel hierarchical radiosity rendering

    SciTech Connect

    Carter, M.

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  16. Parallel fast gauss transform

    SciTech Connect

    Sampath, Rahul S; Sundar, Hari; Veerapaneni, Shravan

    2010-01-01

    We present fast adaptive parallel algorithms to compute the sum of N Gaussians at N points. Direct sequential computation of this sum would take O(N^2) time. The parallel time complexity estimates for our algorithms are O(N/n_p) for uniform point distributions and O((N/n_p) log(N/n_p) + n_p log n_p) for non-uniform distributions using n_p CPUs. We incorporate a plane-wave representation of the Gaussian kernel which permits 'diagonal translation'. We use parallel octrees and a new scheme for translating the plane-waves to efficiently handle non-uniform distributions. Computing the transform to six-digit accuracy at 120 billion points took approximately 140 seconds using 4096 cores on the Jaguar supercomputer. Our implementation is 'kernel-independent' and can handle other 'Gaussian-type' kernels even when explicit analytic expression for the kernel is not known. These algorithms form a new class of core computational machinery for solving parabolic PDEs on massively parallel architectures.
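
    As a point of comparison for the complexity estimates above, the direct O(N^2) evaluation that the fast algorithms replace can be sketched in a few lines. The kernel form exp(-|t - s|^2 / delta), the weights, and the bandwidth parameter delta are assumptions made for illustration; the abstract does not specify them.

```python
import math


def direct_gauss_transform(sources, weights, targets, delta):
    """Evaluate G(t) = sum_i w_i * exp(-|t - s_i|^2 / delta) at every target t.

    Cost is O(len(sources) * len(targets)): every target visits every source.
    """
    result = []
    for t in targets:
        total = 0.0
        for s, w in zip(sources, weights):
            dist2 = sum((a - b) ** 2 for a, b in zip(t, s))  # squared distance
            total += w * math.exp(-dist2 / delta)
        result.append(total)
    return result


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    print(direct_gauss_transform(pts, [1.0, 1.0, 1.0], pts, delta=1.0))
```

    The doubly nested loop makes the quadratic cost explicit; this is the baseline against which the parallel estimates are stated.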

  17. High performance parallel architectures

    SciTech Connect

    Anderson, R.E.

    1989-09-01

    In this paper the author describes current high performance parallel computer architectures. A taxonomy is presented to show computer architecture from the user programmer's point-of-view. The effects of the taxonomy upon the programming model are described. Some current architectures are described with respect to the taxonomy. Finally, some predictions about future systems are presented. 5 refs., 1 fig.

  18. A massively asynchronous, parallel brain

    PubMed Central

    Zeki, Semir

    2015-01-01

    Whether the visual brain uses a parallel or a serial, hierarchical, strategy to process visual signals, the end result appears to be that different attributes of the visual scene are perceived asynchronously, with colour leading form (orientation) by 40 ms and direction of motion by about 80 ms. Whatever the neural root of this asynchrony, it creates a problem that has not been properly addressed, namely how visual attributes that are perceived asynchronously over brief time windows after stimulus onset are bound together in the longer term to give us a unified experience of the visual world, in which all attributes are apparently seen in perfect registration. In this review, I suggest that there is no central neural clock in the (visual) brain that synchronizes the activity of different processing systems. More likely, activity in each of the parallel processing-perceptual systems of the visual brain is reset independently, making of the brain a massively asynchronous organ, just like the new generation of more efficient computers promises to be. Given the asynchronous operations of the brain, it is likely that the results of activities in the different processing-perceptual systems are not bound by physiological interactions between cells in the specialized visual areas, but post-perceptually, outside the visual brain. PMID:25823871

  19. Programming parallel architectures - The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1989-01-01

    This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.

  20. Method of moment solutions to scattering problems in a parallel processing environment

    NASA Technical Reports Server (NTRS)

    Cwik, Tom; Partee, Jonathan; Patterson, Jean

    1991-01-01

    This paper describes the implementation of a parallelized method of moments (MOM) code into an interactive workstation environment. The workstation allows interactive solid body modeling and mesh generation, MOM analysis, and the graphical display of results. After describing the parallel computing environment, the implementation and results of parallelizing a general MOM code are presented in detail.

  1. Parallel computers and parallel algorithms for CFD: An introduction

    NASA Astrophysics Data System (ADS)

    Roose, Dirk; Vandriessche, Rafael

    1995-10-01

    This text presents a tutorial on those aspects of parallel computing that are important for the development of efficient parallel algorithms and software for computational fluid dynamics. We first review the main architectural features of parallel computers and we briefly describe some parallel systems on the market today. We introduce some important concepts concerning the development and the performance evaluation of parallel algorithms. We discuss how work load imbalance and communication costs on distributed memory parallel computers can be minimized. We present performance results for some CFD test cases. We focus on applications using structured and block structured grids, but the concepts and techniques are also valid for unstructured grids.

  2. Parallel Subconvolution Filtering Architectures

    NASA Technical Reports Server (NTRS)

    Gray, Andrew A.

    2003-01-01

    These architectures are based on methods of vector processing and the discrete-Fourier-transform/inverse-discrete-Fourier-transform (DFT-IDFT) overlap-and-save method, combined with time-block separation of digital filters into frequency-domain subfilters implemented by use of sub-convolutions. The parallel-processing method implemented in these architectures enables the use of relatively small DFT-IDFT pairs, while filter tap lengths are theoretically unlimited. The size of a DFT-IDFT pair is determined by the desired reduction in processing rate, rather than by the order of the filter that one seeks to implement. The emphasis in this report is on those aspects of the underlying theory and design rules that promote computational efficiency, parallel processing at reduced data rates, and simplification of the designs of very-large-scale integrated (VLSI) circuits needed to implement high-order filters and correlators.
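
    The overlap-and-save building block named above can be sketched as follows. This shows only the plain textbook method, not the sub-convolution architectures of the report, and it uses a naive O(n^2) DFT to stay self-contained. The function names and the block size n_fft are illustrative assumptions.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out


def overlap_save(x, h, n_fft):
    """Causal FIR filtering y[n] = sum_i h[i] * x[n - i] via overlap-and-save."""
    m = len(h)
    step = n_fft - m + 1  # new output samples produced per block
    if step <= 0:
        raise ValueError("n_fft must be at least len(h)")
    h_freq = dft(list(h) + [0.0] * (n_fft - m))
    padded = [0.0] * (m - 1) + list(x)  # history of m - 1 zeros before the input
    y, pos = [], 0
    while len(y) < len(x):
        block = padded[pos:pos + n_fft]
        block += [0.0] * (n_fft - len(block))  # zero-fill the final short block
        y_block = dft([a * b for a, b in zip(dft(block), h_freq)], inverse=True)
        # The first m - 1 outputs are corrupted by circular wrap-around: discard them.
        y.extend(v.real for v in y_block[m - 1:])
        pos += step
    return y[:len(x)]


def direct_fir(x, h):
    """Reference time-domain filter used to check the block method."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]
```

    Each block overlaps the previous one by len(h) - 1 samples, which is why the transform size in this basic form is tied to the filter length; the report describes how sub-convolutions decouple the two.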

  3. Homology, convergence and parallelism.

    PubMed

    Ghiselin, Michael T

    2016-01-01

    Homology is a relation of correspondence between parts of parts of larger wholes. It is used when tracking objects of interest through space and time and in the context of explanatory historical narratives. Homologues can be traced through a genealogical nexus back to a common ancestral precursor. Homology being a transitive relation, homologues remain homologous however much they may come to differ. Analogy is a relationship of correspondence between parts of members of classes having no relationship of common ancestry. Although homology is often treated as an alternative to convergence, the latter is not a kind of correspondence: rather, it is one of a class of processes that also includes divergence and parallelism. These often give rise to misleading appearances (homoplasies). Parallelism can be particularly hard to detect, especially when not accompanied by divergences in some parts of the body. PMID:26598721

  4. Parallel grid population

    DOEpatents

    Wald, Ingo; Ize, Santiago

    2015-07-28

    Parallel population of a grid with a plurality of objects using a plurality of processors. One example embodiment is a method for parallel population of a grid with a plurality of objects using a plurality of processors. The method includes a first act of dividing a grid into n distinct grid portions, where n is the number of processors available for populating the grid. The method also includes acts of dividing a plurality of objects into n distinct sets of objects, assigning a distinct set of objects to each processor such that each processor determines by which distinct grid portion(s) each object in its distinct set of objects is at least partially bounded, and assigning a distinct grid portion to each processor such that each processor populates its distinct grid portion with any objects that were previously determined to be at least partially bounded by its distinct grid portion.
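
    The two phases described in this abstract can be sketched as below. The sketch simulates the n processors sequentially and, as a simplifying assumption, uses a 1D grid cut into equal slabs with objects given as (lo, hi) intervals; all names are hypothetical and nothing here is taken from the patent's claims.

```python
def overlapping_portions(obj, portions):
    """Indices of grid portions whose [lo, hi) range overlaps the object's extent."""
    lo, hi = obj
    return [k for k, (p_lo, p_hi) in enumerate(portions) if lo < p_hi and hi > p_lo]


def populate_grid(objects, grid_lo, grid_hi, n):
    """Return, for each of n grid portions, the objects at least partly inside it."""
    # Divide the grid into n distinct portions (here: equal 1D slabs).
    width = (grid_hi - grid_lo) / n
    portions = [(grid_lo + k * width, grid_lo + (k + 1) * width) for k in range(n)]

    # Divide the objects into n distinct sets, one per simulated processor.
    object_sets = [objects[p::n] for p in range(n)]

    # Phase 1: each processor bins its own objects by the portions they touch.
    bins = [[[] for _ in range(n)] for _ in range(n)]  # bins[processor][portion]
    for p, objs in enumerate(object_sets):
        for obj in objs:
            for k in overlapping_portions(obj, portions):
                bins[p][k].append(obj)

    # Phase 2: each processor fills its own portion from every processor's bins.
    return [[obj for p in range(n) for obj in bins[p][k]] for k in range(n)]


if __name__ == "__main__":
    objs = [(0.0, 1.0), (0.5, 2.5), (3.0, 3.9)]
    print(populate_grid(objs, 0.0, 4.0, 2))
```

    In this arrangement each processor writes only to its own bins in the first phase and only to its own grid portion in the second, so the sketch involves no write conflicts between processors.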

  5. PCLIPS: Parallel CLIPS

    NASA Technical Reports Server (NTRS)

    Gryphon, Coranth D.; Miller, Mark D.

    1991-01-01

    PCLIPS (Parallel CLIPS) is a set of extensions to the C Language Integrated Production System (CLIPS) expert system language. PCLIPS is intended to provide an environment for the development of more complex, extensive expert systems. Multiple CLIPS expert systems are now capable of running simultaneously on separate processors, or separate machines, thus dramatically increasing the scope of solvable tasks within the expert systems. As a tool for parallel processing, PCLIPS allows for an expert system to add to its fact-base information generated by other expert systems, thus allowing systems to assist each other in solving a complex problem. This allows individual expert systems to be more compact and efficient, and thus run faster or on smaller machines.

  6. Ultrascalable petaflop parallel supercomputer

    DOEpatents

    Blumrich, Matthias A. (Ridgefield, CT); Chen, Dong (Croton On Hudson, NY); Chiu, George (Cross River, NY); Cipolla, Thomas M. (Katonah, NY); Coteus, Paul W. (Yorktown Heights, NY); Gara, Alan G. (Mount Kisco, NY); Giampapa, Mark E. (Irvington, NY); Hall, Shawn (Pleasantville, NY); Haring, Rudolf A. (Cortlandt Manor, NY); Heidelberger, Philip (Cortlandt Manor, NY); Kopcsay, Gerard V. (Yorktown Heights, NY); Ohmacht, Martin (Yorktown Heights, NY); Salapura, Valentina (Chappaqua, NY); Sugavanam, Krishnan (Mahopac, NY); Takken, Todd (Brewster, NY)

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  7. Parallel multilevel preconditioners

    SciTech Connect

    Bramble, J.H.; Pasciak, J.E.; Xu, Jinchao.

    1989-01-01

    In this paper, we shall report on some techniques for the development of preconditioners for the discrete systems which arise in the approximation of solutions to elliptic boundary value problems. Here we shall only state the resulting theorems. It has been demonstrated that preconditioned iteration techniques often lead to the most computationally effective algorithms for the solution of the large algebraic systems corresponding to boundary value problems in two and three dimensional Euclidean space. The use of preconditioned iteration will become even more important on computers with parallel architecture. This paper discusses an approach for developing completely parallel multilevel preconditioners. In order to illustrate the resulting algorithms, we shall describe the simplest application of the technique to a model elliptic problem.

  8. Parallel Anisotropic Tetrahedral Adaptation

    NASA Technical Reports Server (NTRS)

    Park, Michael A.; Darmofal, David L.

    2008-01-01

    An adaptive method that robustly produces high aspect ratio tetrahedra to a general 3D metric specification without introducing hybrid semi-structured regions is presented. The elemental operators and higher-level logic are described with their respective domain-decomposed parallelizations. An anisotropic tetrahedral grid adaptation scheme is demonstrated for 1000-to-1 stretching for a simple cube geometry. This form of adaptation is applicable to more complex domain boundaries via a cut-cell approach as demonstrated by a parallel 3D supersonic simulation of a complex fighter aircraft. To avoid the assumptions and approximations required to form a metric to specify adaptation, an approach is introduced that directly evaluates interpolation error. The grid is adapted to reduce and equidistribute this interpolation error calculation without the use of an intervening anisotropic metric. Direct interpolation error adaptation is illustrated for 1D and 3D domains.

  9. ASSEMBLY OF PARALLEL PLATES

    DOEpatents

    Groh, E.F.; Lennox, D.H.

    1963-04-23

    This invention is concerned with a rigid assembly of parallel plates in which keyways are stamped out along the edges of the plates and a self-retaining key is inserted into aligned keyways. Spacers having similar keyways are included between adjacent plates. The entire assembly is locked into a rigid structure by fastening only the outermost plates to the ends of the keys. (AEC)

  10. Xyce parallel electronic simulator.

    SciTech Connect

    Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.

    2010-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list exhaustively (to the extent possible) the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.

  11. Parallel sphere rendering

    SciTech Connect

    Krogh, M.; Painter, J.; Hansen, C.

    1996-10-01

    Sphere rendering is an important method for visualizing molecular dynamics data. This paper presents a parallel algorithm that is almost 90 times faster than current graphics workstations. To render extremely large data sets and large images, the algorithm uses the MIMD features of the supercomputers to divide up the data, render independent partial images, and then finally composite the multiple partial images using an optimal method. The algorithm and performance results are presented for the CM-5 and the M.

  12. Aerodynamic, aeroacoustic, and aeroelastic investigations of airfoil-vortex interaction using large-eddy simulation

    NASA Astrophysics Data System (ADS)

    Ilie, Marcel

    In helicopters, vortices (generated at the tip of the rotor blades) interact with the next advancing blades during certain flight and manoeuvring conditions, generating undesirable levels of acoustic noise and vibration. These Blade-Vortex Interactions (BVIs), which may cause the most disturbing acoustic noise, normally occur in descent or high-speed forward flight. Acoustic noise characterization (and potential reduction) is one of the areas generating intensive research interest in the rotorcraft industry. Since experimental investigations of BVI are extremely costly, some insights into the BVI or AVI (2-D Airfoil-Vortex Interaction) can be gained using Computational Fluid Dynamics (CFD) numerical simulations. Numerical simulation of BVI or AVI has been of interest to CFD for many years. There are still difficulties concerning an accurate numerical prediction of BVI. One of the main issues is the inherent dissipation of CFD turbulence models, which severely affects the preservation of the vortex characteristics. Moreover, this is an issue not only for aerodynamic and aeroacoustic analysis but also for aeroelastic investigations, especially when strong (two-way) aeroelastic coupling is of interest. The present investigation concentrates mainly on AVI simulations. The simulations are performed for Mach number, Ma = 0.3, resulting in a Reynolds number, Re = 1.3 × 10^6, which is based on the chord, c, of the airfoil (NACA0012). An extensive literature search has indicated that the present work represents the first comprehensive investigation of AVI using the LES numerical approach in the rotorcraft research community. The major factor affecting the aerodynamic coefficients and aeroacoustic field as a result of airfoil-vortex interaction is observed to be the unsteady pressure generated at the location of the interaction.
The present numerical results show that the aerodynamic coefficients (lift, moment, and drag) and aeroacoustic field are strongly dependent on the airfoil-vortex vertical miss-distance, airfoil angle of attack, vortex characteristics, and aeroelastic response of airfoil to airfoil-vortex interaction. A decay of airfoil-vortex interactions with the increase of vertical miss-distance and angle of attack was observed. Also, a decay of airfoil-vortex interactions is observed for the case of a flexible structure when compared with the case of a rigid structure. The decay of vortex core size produces a decrease in the aerodynamic coefficients.

  13. Resistor Combinations for Parallel Circuits.

    ERIC Educational Resources Information Center

    McTernan, James P.

    1978-01-01

    To help simplify both teaching and learning of parallel circuits, a high school electricity/electronics teacher presents and illustrates the use of tables of values for parallel resistive circuits in which total resistances are whole numbers. (MF)
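
    As an illustration of the kind of table the abstract describes, the following minimal sketch enumerates pairs of resistors whose parallel combination is a whole number of ohms. The function names and the restriction to two resistors are assumptions made for this example; the article's own tables are not reproduced here.

```python
from fractions import Fraction


def parallel_resistance(*resistors):
    """Total resistance of integer-valued resistors in parallel: 1/R = sum(1/R_i)."""
    # Exact rational arithmetic avoids floating-point rounding in the integer check.
    return 1 / sum(Fraction(1, r) for r in resistors)


def whole_number_pairs(max_ohms):
    """Return (R1, R2, R_total) for R1 <= R2 <= max_ohms where R_total is an integer."""
    rows = []
    for r1 in range(1, max_ohms + 1):
        for r2 in range(r1, max_ohms + 1):
            total = parallel_resistance(r1, r2)
            if total.denominator == 1:
                rows.append((r1, r2, int(total)))
    return rows


if __name__ == "__main__":
    for r1, r2, total in whole_number_pairs(12):
        print(f"{r1:>3} ohm || {r2:>3} ohm = {total} ohm")
```

    For example, 3 ohms in parallel with 6 ohms gives exactly 2 ohms, so that pair would appear in such a table.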

  14. The Galley Parallel File System

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. The interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. We discuss Galley's file structure and application interface, as well as an application that has been implemented using that interface.

  15. Parallel Eclipse Project Checkout

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas M.; Joswig, Joseph C.; Shams, Khawaja S.; Powell, Mark W.; Bachmann, Andrew G.

    2011-01-01

    Parallel Eclipse Project Checkout (PEPC) is a program written to leverage parallelism and to automate the checkout process of plug-ins created in Eclipse RCP (Rich Client Platform). Eclipse plug-ins can be aggregated in a feature project. This innovation digests a feature description (xml file) and automatically checks out all of the plug-ins listed in the feature. This resolves the issue of manually checking out each plug-in required to work on the project. To minimize the amount of time necessary to check out the plug-ins, this program makes the plug-in checkouts parallel. After the feature is parsed, a checkout request is inserted for each plug-in in the feature. These requests are handled by a thread pool with a configurable number of threads. By checking out the plug-ins in parallel, the checkout process is streamlined before getting started on the project. For instance, projects that took 30 minutes to check out now take less than 5 minutes. The effect is especially clear on a Mac, which has a network monitor displaying the bandwidth use. When running the client from a developer's home, the checkout process now saturates the bandwidth in order to get all the plug-ins checked out as fast as possible. For comparison, a checkout process that ranged from 8-200 Kbps from a developer's home is now able to saturate a pipe of 1.3 Mbps, resulting in significantly faster checkouts. Eclipse IDE (integrated development environment) tries to build a project as soon as it is downloaded. As part of another optimization, this innovation programmatically tells Eclipse to stop building while checkouts are happening, which dramatically reduces lock contention and enables plug-ins to continue downloading until all of them finish. Furthermore, the software re-enables automatic building, and forces Eclipse to do a clean build once it finishes checking out all of the plug-ins. This software is fully generic and does not contain any NASA-specific code.
It can be applied to any Eclipse-based repository with a similar structure. It also can apply build parameters and preferences automatically at the end of the checkout.
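
    The parse-then-dispatch pattern described above can be sketched as follows. This is not PEPC itself (which targets Eclipse); it is a minimal Python illustration that assumes the feature description lists plug-ins as `plugin` elements with an `id` attribute, and that takes the actual checkout step as a caller-supplied function. The names `plugin_ids`, `checkout_all`, and `checkout` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import xml.etree.ElementTree as ET


def plugin_ids(feature_xml):
    """Extract plug-in ids from a feature description (assumed <plugin id="..."/> layout)."""
    root = ET.fromstring(feature_xml)
    return [p.get("id") for p in root.iter("plugin")]


def checkout_all(feature_xml, checkout, num_threads=4):
    """Run checkout(plugin_id) for every plug-in using a pool of num_threads threads."""
    ids = plugin_ids(feature_xml)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() preserves input order, so results line up with ids.
        results = list(pool.map(checkout, ids))
    return dict(zip(ids, results))


if __name__ == "__main__":
    demo = '<feature id="demo"><plugin id="org.example.a"/><plugin id="org.example.b"/></feature>'
    # A stand-in for a real version-control checkout call.
    print(checkout_all(demo, lambda pid: f"checked out {pid}", num_threads=2))
```

    Because checkouts are dominated by network waiting rather than computation, a thread pool is a reasonable fit for this kind of workload; the pool size plays the role of the configurable thread count mentioned in the abstract.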

  16. Fastpath Speculative Parallelization

    NASA Astrophysics Data System (ADS)

    Spear, Michael F.; Kelsey, Kirk; Bai, Tongxin; Dalessandro, Luke; Scott, Michael L.; Ding, Chen; Wu, Peng

    We describe Fastpath, a system for speculative parallelization of sequential programs on conventional multicore processors. Our system distinguishes between the lead thread, which executes at almost-native speed, and speculative threads, which execute somewhat slower. This allows us to achieve nontrivial speedup, even on two-core machines. We present a mathematical model of potential speedup, parameterized by application characteristics and implementation constants. We also present preliminary results gleaned from two different Fastpath implementations, each derived from an implementation of software transactional memory.

  17. Highly parallel computation

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.; Tichy, Walter F.

    1990-01-01

    Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines and current research focuses on which architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.

  18. Parallel sphere rendering

    SciTech Connect

    Krogh, M.; Hansen, C.; Painter, J.; de Verdiere, G.C.

    1995-05-01

    Sphere rendering is an important method for visualizing molecular dynamics data. This paper presents a parallel divide-and-conquer algorithm that is almost 90 times faster than current graphics workstations. To render extremely large data sets and large images, the algorithm uses the MIMD features of the supercomputers to divide up the data, render independent partial images, and then finally composite the multiple partial images using an optimal method. The algorithm and performance results are presented for the CM-5 and the T3D.

  19. Equalizer: a scalable parallel rendering framework.

    PubMed

    Eilemann, Stefan; Makhinya, Maxim; Pajarola, Renato

    2009-01-01

    Continuing improvements in CPU and GPU performances as well as increasing multi-core processor and cluster-based parallelism demand for flexible and scalable parallel rendering solutions that can exploit multipipe hardware accelerated graphics. In fact, to achieve interactive visualization, scalable rendering systems are essential to cope with the rapid growth of data sets. However, parallel rendering systems are non-trivial to develop and often only application specific implementations have been proposed. The task of developing a scalable parallel rendering framework is even more difficult if it should be generic to support various types of data and visualization applications, and at the same time work efficiently on a cluster with distributed graphics cards. In this paper we introduce a novel system called Equalizer, a toolkit for scalable parallel rendering based on OpenGL which provides an application programming interface (API) to develop scalable graphics applications for a wide range of systems ranging from large distributed visualization clusters and multi-processor multipipe graphics systems to single-processor single-pipe desktop machines. We describe the system architecture, the basic API, discuss its advantages over previous approaches, present example configurations and usage scenarios as well as scalability results. PMID:19282550

  20. Parallel Pascal - An extended Pascal for parallel computers

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.

    1984-01-01

    Parallel Pascal is an extended version of the conventional serial Pascal programming language which includes a convenient syntax for specifying array operations. It is upward compatible with standard Pascal and involves only a small number of carefully chosen new features. Parallel Pascal was developed to reduce the semantic gap between standard Pascal and a large range of highly parallel computers. Two important design goals of Parallel Pascal were efficiency and portability. Portability is particularly difficult to achieve since different parallel computers frequently have very different capabilities.

  1. New Computational Methods for the Prediction and Analysis of Helicopter Noise

    NASA Technical Reports Server (NTRS)

    Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak

    1996-01-01

    This paper describes several new methods to predict and analyze rotorcraft noise. These methods are: 1) a combined computational fluid dynamics and Kirchhoff scheme for far-field noise predictions, 2) parallel computer implementation of the Kirchhoff integrations, 3) audio and visual rendering of the computed acoustic predictions over large far-field regions, and 4) acoustic tracebacks to the Kirchhoff surface to pinpoint the sources of the rotor noise. The paper describes each method and presents sample results for three test cases. The first case consists of in-plane high-speed impulsive noise and the other two cases show idealized parallel and oblique blade-vortex interactions. The computed results show good agreement with available experimental data but convey much more information about the far-field noise propagation. When taken together, these new analysis methods exploit the power of new computer technologies and offer the potential to significantly improve our prediction and understanding of rotorcraft noise.

  2. CSM parallel structural methods research

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.

    1989-01-01

    Parallel structural methods, research team activities, advanced architecture computers for parallel computational structural mechanics (CSM) research, the FLEX/32 multicomputer, a parallel structural analyses testbed, blade-stiffened aluminum panel with a circular cutout and the dynamic characteristics of a 60 meter, 54-bay, 3-longeron deployable truss beam are among the topics discussed.

  3. Synchronous Parallel Kinetic Monte Carlo

    SciTech Connect

    Martínez, E; Marian, J; Kalos, M H

    2006-12-14

    A novel parallel kinetic Monte Carlo (kMC) algorithm formulated on the basis of perfect time synchronicity is presented. The algorithm provides an exact generalization of any standard serial kMC model and is trivially implemented in parallel architectures. We demonstrate the mathematical validity and parallel performance of the method by solving several well-understood problems in diffusion.

  4. Roo: A parallel theorem prover

    SciTech Connect

    Lusk, E.L.; McCune, W.W.; Slaney, J.K.

    1991-11-01

    We describe a parallel theorem prover based on the Argonne theorem-proving system OTTER. The parallel system, called Roo, runs on shared-memory multiprocessors such as the Sequent Symmetry. We explain the parallel algorithm used and give performance results that demonstrate near-linear speedups on large problems.

  5. Parallelized direct execution simulation of message-passing parallel programs

    NASA Technical Reports Server (NTRS)

    Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

    1994-01-01

    As massively parallel computers proliferate, there is growing interest in finding ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing computers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.

  6. Parallel ptychographic reconstruction

    PubMed Central

    Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris

    2014-01-01

    Ptychography is an imaging method whereby a coherent beam is scanned across an object, and an image is obtained by iterative phasing of the set of diffraction patterns. It is able to be used to image extended objects at a resolution limited by scattering strength of the object and detector geometry, rather than at an optics-imposed limit. As technical advances allow larger fields to be imaged, computational challenges arise for reconstructing the correspondingly larger data volumes, yet at the same time there is also a need to deliver reconstructed images immediately so that one can evaluate the next steps to take in an experiment. Here we present a parallel method for real-time ptychographic phase retrieval. It uses a hybrid parallel strategy to divide the computation between multiple graphics processing units (GPUs) and then employs novel techniques to merge sub-datasets into a single complex phase and amplitude image. Results are shown on a simulated specimen and a real dataset from an X-ray experiment conducted at a synchrotron light source. PMID:25607174

  7. Benchmarking massively parallel architectures

    SciTech Connect

    Lubeck, O.; Moore, J.; Simmons, M.; Wasserman, H.

    1993-01-01

    The purpose of this paper is to summarize some initial experiences related to measuring the performance of massively parallel processors (MPPs) at Los Alamos National Laboratory (LANL). Actually, the range of MPP architectures the authors have used is rather limited, being confined mostly to the Thinking Machines Corporation (TMC) Connection Machine CM-2 and CM-5. Some very preliminary work has been carried out on the Kendall Square KSR-1, and efforts related to other machines, such as the Intel Paragon and the soon-to-be-released CRAY T3D are planned. This paper will concentrate more on methodology rather than discuss specific architectural strengths and weaknesses; the latter is expected to be the subject of future reports. MPP benchmarking is a field in critical need of structure and definition. As the authors have stated previously, such machines have enormous potential, and there is certainly a dire need for orders of magnitude computational power over current supercomputers. However, performance reports for MPPs must emphasize actual sustainable performance from real applications in a careful, responsible manner. Such has not always been the case. A recent paper has described in some detail, the problem of potentially misleading performance reporting in the parallel scientific computing field. Thus, in this paper, the authors briefly offer a few general ideas on MPP performance analysis.

  8. Tolerant (parallel) Programming

    NASA Technical Reports Server (NTRS)

    DiNucci, David C.; Bailey, David H. (Technical Monitor)

    1997-01-01

    In order to be truly portable, a program must be tolerant of a wide range of development and execution environments, and a parallel program is just one which must be tolerant of a very wide range. This paper first defines the term "tolerant programming", then describes many layers of tools to accomplish it. The primary focus is on F-Nets, a formal model for expressing computation as a folded partial-ordering of operations, thereby providing an architecture-independent expression of tolerant parallel algorithms. For implementing F-Nets, Cooperative Data Sharing (CDS) is a subroutine package for implementing communication efficiently in a large number of environments (e.g. shared memory and message passing). Software Cabling (SC), a very-high-level graphical programming language for building large F-Nets, possesses many of the features normally expected from today's computer languages (e.g. data abstraction, array operations). Finally, L2(sup 3) is a CASE tool which facilitates the construction, compilation, execution, and debugging of SC programs.

  9. A parallel world in the dark

    NASA Astrophysics Data System (ADS)

    Higaki, Tetsutaro; Jeong, Kwang Sik; Takahashi, Fuminobu

    2013-08-01

    The baryon-dark matter coincidence is a long-standing issue. Interestingly, the recent observations suggest the presence of dark radiation, which, if confirmed, would pose another coincidence problem of why the density of dark radiation is comparable to that of photons. These striking coincidences may be traced back to the dark sector with particle contents and interactions that are quite similar, if not identical, to the standard model: a dark parallel world. It naturally solves the coincidence problems of dark matter and dark radiation, and predicts a sterile neutrino(s) with mass of O(0.1−1) eV, as well as self-interacting dark matter made of the counterpart of ordinary baryons. We find a robust prediction for the relation between the abundance of dark radiation and the sterile neutrino, which can serve as the smoking-gun evidence of the dark parallel world.

  10. A parallel world in the dark

    SciTech Connect

    Higaki, Tetsutaro; Jeong, Kwang Sik; Takahashi, Fuminobu

    2013-08-01

    The baryon-dark matter coincidence is a long-standing issue. Interestingly, the recent observations suggest the presence of dark radiation, which, if confirmed, would pose another coincidence problem of why the density of dark radiation is comparable to that of photons. These striking coincidences may be traced back to the dark sector with particle contents and interactions that are quite similar, if not identical, to the standard model: a dark parallel world. It naturally solves the coincidence problems of dark matter and dark radiation, and predicts a sterile neutrino(s) with mass of O(0.1−1) eV, as well as self-interacting dark matter made of the counterpart of ordinary baryons. We find a robust prediction for the relation between the abundance of dark radiation and the sterile neutrino, which can serve as the smoking-gun evidence of the dark parallel world.

  11. Time sharing massively parallel machines. Draft

    SciTech Connect

    Gorda, B.; Wolski, R.

    1995-03-01

    As part of the Massively Parallel Computing Initiative (MPCI) at the Lawrence Livermore National Laboratory, the authors have developed a simple, effective and portable time sharing mechanism by scheduling gangs of processes on tightly coupled parallel machines. By time-sharing the resources, the system interleaves production and interactive jobs. Immediate priority is given to interactive use, maintaining good response time. Production jobs are scheduled during idle periods, making use of the otherwise unused resources. In this paper the authors discuss their experience with gang scheduling over the 3 year life-time of the project. In section 2, they motivate the project and discuss some of its details. Section 3.0 describes the general scheduling problem and how gang scheduling addresses it. In section 4.0, they describe the implementation. Section 8.0 presents results culled over the lifetime of the project. They conclude this paper with some observations and possible future directions.

  12. Parallel Computing in SCALE

    SciTech Connect

    DeHart, Mark D; Williams, Mark L; Bowman, Stephen M

    2010-01-01

    The SCALE computational architecture has remained basically the same since its inception 30 years ago, although constituent modules and capabilities have changed significantly. This SCALE concept was intended to provide a framework whereby independent codes can be linked to provide a more comprehensive capability than possible with the individual programs - allowing flexibility to address a wide variety of applications. However, the current system was designed originally for mainframe computers with a single CPU and with significantly less memory than today's personal computers. It has been recognized that the present SCALE computation system could be restructured to take advantage of modern hardware and software capabilities, while retaining many of the modular features of the present system. Preliminary work is being done to define specifications and capabilities for a more advanced computational architecture. This paper describes the state of current SCALE development activities and plans for future development. With the release of SCALE 6.1 in 2010, a new phase of evolutionary development will be available to SCALE users within the TRITON and NEWT modules. The SCALE (Standardized Computer Analyses for Licensing Evaluation) code system developed by Oak Ridge National Laboratory (ORNL) provides a comprehensive and integrated package of codes and nuclear data for a wide range of applications in criticality safety, reactor physics, shielding, isotopic depletion and decay, and sensitivity/uncertainty (S/U) analysis. Over the last three years, since the release of version 5.1 in 2006, several important new codes have been introduced within SCALE, and significant advances applied to existing codes. Many of these new features became available with the release of SCALE 6.0 in early 2009. However, beginning with SCALE 6.1, a first generation of parallel computing is being introduced. 
In addition to near-term improvements, a plan for longer term SCALE enhancement activities has been developed to provide an integrated framework for future methods development. Some of the major components of the SCALE parallel computing development plan are parallelization and multithreading of computationally intensive modules and redesign of the fundamental SCALE computational architecture.

  13. Extended Parallelism Models for Optimization on Massively Parallel Computers

    SciTech Connect

    Eldred, M.S.; Schimel, B.D.

    1999-05-24

    Single-level parallel optimization approaches, those in which either the simulation code executes in parallel or the optimization algorithm invokes multiple simultaneous single-processor analyses, have been investigated previously and been shown to be effective in reducing the time required to compute optimal solutions. However, these approaches have clear performance limitations that prevent effective scaling with the thousands of processors available in massively parallel supercomputers. In more recent work, a capability has been developed for multilevel parallelism in which multiple instances of multiprocessor simulations are coordinated simultaneously. This implementation employs a master-slave approach using the Message Passing Interface (MPI) within the DAKOTA software toolkit. Mathematical analysis on achieving peak efficiency in multilevel parallelism has shown that the most effective processor partitioning scheme is the one that limits the size of multiprocessor simulations in favor of concurrent execution of multiple simulations. That is, if both coarse-grained and fine-grained parallelism can be exploited, then preference should be given to the coarse-grained parallelism. This analysis was verified in multilevel parallel computational experiments on networks of workstations (NOWs) and on the Intel TeraFLOPS massively parallel supercomputer. In current work, methods for exploiting additional coarse-grained parallelism in optimization are being investigated so that fine-grained efficiency losses can be further minimized. These activities are focusing on both algorithmic coarse-grained parallelism (multiple independent function evaluations) through the development of speculative gradient methods and concurrent iterator strategies and on function evaluation coarse-grained parallelism (multiple separable simulations within a function evaluation) through the development of general partitioning and nested synchronization facilities. The net result is a total of four separate levels of parallelism which can minimize efficiency losses and achieve near linear scaling on massively parallel computers.

  14. Toward Parallel Document Clustering

    SciTech Connect

    Mogill, Jace A.; Haglin, David J.

    2011-09-01

    A key challenge to automated clustering of documents in large text corpora is the high cost of comparing documents in a multimillion dimensional document space. The Anchors Hierarchy is a fast data structure and algorithm for localizing data based on a distance metric that obeys the triangle inequality; the algorithm strives to minimize the number of distance calculations needed to cluster the documents into anchors around reference documents called pivots. We extend the original algorithm to increase the amount of available parallelism and consider two implementations: a complex data structure which affords efficient searching, and a simple data structure which requires repeated sorting. The sorting implementation is integrated with a text corpora Bag of Words program, and initial performance results of an end-to-end document processing workflow are reported.
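
    To illustrate how a triangle-inequality-obeying metric can save distance calculations, the sketch below assigns points to their nearest pivot and skips any pivot that provably cannot be closer. It is a generic pruning example, not the Anchors Hierarchy or the authors' parallel extension; the function name `assign_to_pivots` and the flat pivot list are assumptions made for the example.

```python
def assign_to_pivots(points, pivots, dist):
    """Assign each point to its nearest pivot, pruning with the triangle inequality.

    Returns (owner, calcs): owner[i] is the pivot index for points[i], and calcs
    counts point-to-pivot distance evaluations actually performed.
    """
    # Pivot-to-pivot distances are computed once and reused for every point.
    pivot_dist = [[dist(p, q) for q in pivots] for p in pivots]
    owner, calcs = [], 0
    for x in points:
        best, best_d = 0, dist(x, pivots[0])
        calcs += 1
        for j in range(1, len(pivots)):
            # d(x, p_j) >= d(p_best, p_j) - d(x, p_best), so if
            # d(p_best, p_j) >= 2 * d(x, p_best) then p_j cannot be closer.
            if pivot_dist[best][j] >= 2 * best_d:
                continue
            d_j = dist(x, pivots[j])
            calcs += 1
            if d_j < best_d:
                best, best_d = j, d_j
        owner.append(best)
    return owner, calcs


if __name__ == "__main__":
    owner, calcs = assign_to_pivots([0, 1, 10, 11], [0, 10, 100], lambda a, b: abs(a - b))
    print(owner, calcs)
```

    In this toy run only 6 of the 12 possible point-to-pivot distances are evaluated; in a high-dimensional document space each avoided evaluation is correspondingly more valuable.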

  15. Parallel tridiagonal equation solvers

    NASA Technical Reports Server (NTRS)

    Stone, H. S.

    1974-01-01

    Three parallel algorithms were compared for the direct solution of tridiagonal linear systems of equations. The algorithms are suitable for computers such as ILLIAC 4 and CDC STAR. For array computers similar to ILLIAC 4, cyclic odd-even reduction has the least operation count for highly structured sets of equations, and recursive doubling has the least count for relatively unstructured sets of equations. Since the difference in operation counts for these two algorithms is not substantial, their relative running times may be more related to overhead operations, which are not measured in this paper. The third algorithm, based on Buneman's Poisson solver, has more arithmetic operations than the others, and appears to be the least favorable. For pipeline computers similar to CDC STAR, cyclic odd-even reduction appears to be the most preferable algorithm for all cases.
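
    For readers unfamiliar with cyclic odd-even reduction, the following is a minimal serial sketch of the idea: unknowns at even positions are eliminated, the half-size system in the odd-position unknowns is solved recursively, and the eliminated unknowns are recovered by back-substitution. It assumes a system for which no pivoting is needed (for example, a diagonally dominant one) and uses the convention a[0] = c[n-1] = 0; it is not the paper's implementation and makes no claim about its operation counts.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] by cyclic odd-even reduction.

    a, b, c, d are equal-length lists with a[0] == 0 and c[-1] == 0.
    No pivoting is performed, so the diagonal entries must stay nonzero.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Build the reduced system in the odd-indexed unknowns.
    # Each iteration is independent of the others, which is the source of parallelism.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        new_a = -alpha * a[i - 1]
        new_b = b[i] - alpha * c[i - 1]
        new_c = 0.0
        new_d = d[i] - alpha * d[i - 1]
        if i + 1 < n:
            gamma = c[i] / b[i + 1]
            new_b -= gamma * a[i + 1]
            new_c = -gamma * c[i + 1]
            new_d -= gamma * d[i + 1]
        ra.append(new_a)
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitute for the even-indexed unknowns (also mutually independent).
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


if __name__ == "__main__":
    # 1D Poisson-like system whose exact solution is [1, 2, 3, 4, 5].
    n = 5
    a = [0.0] + [-1.0] * (n - 1)
    b = [2.0] * n
    c = [-1.0] * (n - 1) + [0.0]
    d = [0.0, 0.0, 0.0, 0.0, 6.0]
    print(cyclic_reduction(a, b, c, d))
```

    Each level halves the number of unknowns, so about log2(n) reduction levels are needed; on an array or pipeline machine the loop bodies at each level would be executed concurrently rather than one after another as here.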

  16. Unified Parallel Software

    Energy Science and Technology Software Center (ESTSC)

    2003-12-01

    UPS (Unified Parallel Software) is a collection of software tools (libraries, scripts, executables) that assist in parallel programming. It consists of: libups.a, C/Fortran callable routines for message passing (utilities written on top of MPI) and file IO (utilities written on top of HDF); libuserd-HDF.so, an EnSight user-defined reader for visualizing data files written with UPS File IO; ups_libuserd_query, ups_libuserd_prep.pl, and ups_libuserd_script.pl, executables/scripts to get information from data files and to simplify the use of EnSight on those data files; and ups_io_rm/ups_io_cp, which manipulate data files written with UPS File IO. These tools are portable to a wide variety of Unix platforms.

  17. Parallel TreeSPH

    NASA Astrophysics Data System (ADS)

    Davé, Romeel; Dubinski, John; Hernquist, Lars

    1997-08-01

    We describe PTreeSPH, a gravity treecode combined with an SPH hydrodynamics code designed for parallel supercomputers having distributed memory. Our computational algorithm is based on the popular TreeSPH code of Hernquist & Katz (1989) [ApJS, 70, 419]. PTreeSPH utilizes a domain decomposition procedure and a synchronous hypercube communication paradigm to build self-contained subvolumes of the simulation on each processor at every timestep. Computations then proceed in a manner analogous to a serial code. We use the Message Passing Interface (MPI) communications package, making our code easily portable to a variety of parallel systems. PTreeSPH uses individual smoothing lengths and timesteps, with a communication algorithm designed to minimize exchange of information while still providing all information required to accurately perform SPH computations. We have incorporated periodic boundary conditions with forces calculated using a quadrupole Ewald summation method, and comoving integration under a variety of cosmologies. Following algorithms presented in Katz et al. (1996) [ApJS, 105, 19], we have also included radiative cooling, heating from a parameterized ionizing background, and star formation. A cosmological simulation from z = 49 to z = 2 with 64^3 gas particles and 64^3 dark matter particles requires 1800 node-hours on a Cray T3D, with a communications overhead of 8%, load balanced to a >95% level. When used on the new Cray T3E, this code will be capable of performing cosmological hydrodynamical simulations down to z = 0 with 2 × 10^6 particles, or to z = 2 with 10^7 particles, in a reasonable amount of time. Even larger simulations will be practical in situations where the matter is not highly clustered or when periodic boundaries are not required.

  18. Parallel Imaging Microfluidic Cytometer

    PubMed Central

    Ehrlich, Daniel J.; McKenna, Brian K.; Evans, James G.; Belkina, Anna C.; Denis, Gerald V.; Sherr, David; Cheung, Man Ching

    2011-01-01

    By adding an additional degree of freedom from multichannel flow, the parallel microfluidic cytometer (PMC) combines some of the best features of flow cytometry (FACS) and microscope-based high-content screening (HCS). The PMC (i) lends itself to fast processing of large numbers of samples, (ii) adds a 1-D imaging capability for intracellular localization assays (HCS), (iii) has a high rare-cell sensitivity and, (iv) has an unusual capability for time-synchronized sampling. An inability to practically handle large sample numbers has restricted applications of conventional flow cytometers and microscopes in combinatorial cell assays, network biology, and drug discovery. The PMC promises to relieve a bottleneck in these previously constrained applications. The PMC may also be a powerful tool for finding rare primary cells in the clinic. The multichannel architecture of current PMC prototypes allows 384 unique samples for a cell-based screen to be read out in approximately 6–10 minutes, about 30-times the speed of most current FACS systems. In 1-D intracellular imaging, the PMC can obtain protein localization using HCS marker strategies at many times the sample throughput of CCD-based microscopes or CCD-based single-channel flow cytometers. The PMC also permits the signal integration time to be varied over a larger range than is practical in conventional flow cytometers. The signal-to-noise advantages are useful, for example, in counting rare positive cells in the most difficult early stages of genome-wide screening. We review the status of parallel microfluidic cytometry and discuss some of the directions the new technology may take. PMID:21704835

  19. Combinatorial parallel and scientific computing.

    SciTech Connect

    Pinar, Ali; Hendrickson, Bruce Alan

    2005-04-01

    Combinatorial algorithms have long played a pivotal enabling role in many applications of parallel computing. Graph algorithms in particular arise in load balancing, scheduling, mapping and many other aspects of the parallelization of irregular applications. These are still active research areas, mostly due to evolving computational techniques and rapidly changing computational platforms. But the relationship between parallel computing and discrete algorithms is much richer than the mere use of graph algorithms to support the parallelization of traditional scientific computations. Important, emerging areas of science are fundamentally discrete, and they are increasingly reliant on the power of parallel computing. Examples include computational biology, scientific data mining, and network analysis. These applications are changing the relationship between discrete algorithms and parallel computing. In addition to their traditional role as enablers of high performance, combinatorial algorithms are now customers for parallel computing. New parallelization techniques for combinatorial algorithms need to be developed to support these nontraditional scientific approaches. This chapter will describe some of the many areas of intersection between discrete algorithms and parallel scientific computing. Due to space limitations, this chapter is not a comprehensive survey, but rather an introduction to a diverse set of techniques and applications with a particular emphasis on work presented at the Eleventh SIAM Conference on Parallel Processing for Scientific Computing. Some topics highly relevant to this chapter (e.g. load balancing) are addressed elsewhere in this book, and so we will not discuss them here.

  20. Parallel structural optimization with different parallel analysis interfaces

    NASA Technical Reports Server (NTRS)

    El-Sayed, Mohamed E. M.; Hsiung, Ching-Kuo

    1990-01-01

    The real benefit of structural optimization techniques is in the application of these techniques to large structures such as full vehicles or full aircraft. For these structures, however, the sequential computer's time and memory requirements prohibit the solutions. With the rapid development of parallel computers, parallel processing of large scale structural optimization problems is achievable. In this paper we discuss the parallel processing of structural optimization problems with parallel structural analysis. Two different types of interface between the optimization and analysis routines are developed and tested.

  1. High Performance Parallel Architectures

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek; Kaewpijit, Sinthop

    1998-01-01

    Traditional remote sensing instruments are multispectral, where observations are collected at a few different spectral bands. Recently, many hyperspectral instruments that can collect observations at hundreds of bands have become operational. Furthermore, there have been ongoing research efforts on ultraspectral instruments that can produce observations at thousands of spectral bands. While these remote sensing technology developments hold great promise for new findings in the area of Earth and space science, they present many challenges. These include the need for faster processing of such increased data volumes, and methods for data reduction. Dimension reduction is a spectral transformation aimed at concentrating the vital information and discarding redundant data. One such transformation, which is widely used in remote sensing, is Principal Components Analysis (PCA). This report summarizes our progress on the development of a parallel PCA and its implementation on two Beowulf cluster configurations: one with a fast Ethernet switch and the other with a Myrinet interconnection. Details of the implementation and performance results, for typical sets of multispectral and hyperspectral NASA remote sensing data, are presented and analyzed based on the algorithm requirements and the underlying machine configuration. It will be shown that the PCA application is quite challenging and hard to scale on Ethernet-based clusters. However, the measurements also show that a high-performance interconnection network, such as Myrinet, better matches the high communication demand of PCA and can lead to a more efficient PCA execution.

  2. Parallelization of adaptive MC integrators

    NASA Astrophysics Data System (ADS)

    Kreckel, Richard

    1997-11-01

    Monte Carlo (MC) methods for numerical integration seem to be embarrassingly parallel at first sight. When adaptive schemes are applied in order to enhance convergence, however, the seemingly most natural way of replicating the whole job on each processor can potentially ruin the adaptive behaviour. Using the popular VEGAS algorithm as an example, an economic method of semi-micro parallelization with variable grain size is presented and contrasted with another straightforward approach of macro-parallelization. A portable implementation of this semi-micro parallelization is used in the xloops project and is made publicly available.

  3. Parallel processing and expert systems

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Lau, Sonie

    1991-01-01

    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 90's cannot enjoy an increased level of autonomy without the efficient use of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real time demands are met for large expert systems. Speed-up via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial labs in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems was surveyed. The survey is divided into three major sections: (1) multiprocessors for parallel expert systems; (2) parallel languages for symbolic computations; and (3) measurements of parallelism of expert system. Results to date indicate that the parallelism achieved for these systems is small. In order to obtain greater speed-ups, data parallelism and application parallelism must be exploited.

  4. Parallel processor engine model program

    NASA Technical Reports Server (NTRS)

    Mclaughlin, P.

    1984-01-01

    The Parallel Processor Engine Model Program is a generalized engineering tool intended to aid in the design of parallel processing real-time simulations of turbofan engines. It is written in the FORTRAN programming language and executes as a subset of the SOAPP simulation system. Input/output and execution control are provided by SOAPP; however, the analysis, emulation and simulation functions are completely self-contained. A framework in which a wide variety of parallel processing architectures could be evaluated and tools with which the parallel implementation of a real-time simulation technique could be assessed are provided.

  5. Parallel Smoothed Aggregation Multigrid: Aggregation Strategies on Massively Parallel Machines

    SciTech Connect

    Ray S. Tuminaro

    2000-11-09

    Algebraic multigrid methods offer the hope that multigrid convergence can be achieved (for at least some important applications) without a great deal of effort from engineers and scientists wishing to solve linear systems. In this paper the authors consider parallelization of the smoothed aggregation multi-grid method. Smoothed aggregation is one of the most promising algebraic multigrid methods. Therefore, developing parallel variants with both good convergence and efficiency properties is of great importance. However, parallelization is nontrivial due to the somewhat sequential aggregation (or grid coarsening) phase. In this paper, they discuss three different parallel aggregation algorithms and illustrate the advantages and disadvantages of each variant in terms of parallelism and convergence. Numerical results will be shown on the Intel Teraflop computer for some large problems coming from nontrivial codes: quasi-static electric potential simulation and a fluid flow calculation.

  6. Using Motivational Interviewing Techniques to Address Parallel Process in Supervision

    ERIC Educational Resources Information Center

    Giordano, Amanda; Clarke, Philip; Borders, L. DiAnne

    2013-01-01

    Supervision offers a distinct opportunity to experience the interconnection of counselor-client and counselor-supervisor interactions. One product of this network of interactions is parallel process, a phenomenon by which counselors unconsciously identify with their clients and subsequently present to their supervisors in a similar fashion…

  7. Parallelization of irregularly coupled regular meshes

    NASA Technical Reports Server (NTRS)

    Chase, Craig; Crowley, Kay; Saltz, Joel; Reeves, Anthony

    1992-01-01

    Regular meshes are frequently used for modeling physical phenomena on both serial and parallel computers. One advantage of regular meshes is that efficient discretization schemes can be implemented in a straightforward manner. However, geometrically complex objects, such as aircraft, cannot be easily described using a single regular mesh. Multiple interacting regular meshes are frequently used to describe complex geometries. Each mesh models a subregion of the physical domain. The meshes, or subdomains, can be processed in parallel, with periodic updates carried out to move information between the coupled meshes. In many cases, there is a relatively small number (one to a few dozen) of subdomains, so that each subdomain may also be partitioned among several processors. We outline a composite run-time/compile-time approach for supporting these problems efficiently on distributed-memory machines. These methods are described in the context of a multiblock fluid dynamics problem developed at LaRC.

  8. Parallel contingency statistics with Titan.

    SciTech Connect

    Thompson, David C.; Pebay, Philippe Pierre

    2009-09-01

    This report summarizes existing statistical engines in VTK/Titan and presents the recently parallelized contingency statistics engine. It is a sequel to [PT08] and [BPRT09], which studied the parallel descriptive, correlative, multi-correlative, and principal component analysis engines. The ease of use of this new parallel engine is illustrated by means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; however, the very nature of contingency tables prevents this new engine from exhibiting optimal parallel speed-up as the aforementioned engines do. This report therefore discusses the design trade-offs we made and studies performance with up to 200 processors.

  9. Parallel Grid Manipulations in Earth Science Calculations

    NASA Technical Reports Server (NTRS)

    Sawyer, W.; Lucchesi, R.; daSilva, A.; Takacs, L. L.

    1999-01-01

    The National Aeronautics and Space Administration (NASA) Data Assimilation Office (DAO) at the Goddard Space Flight Center is moving its data assimilation system to massively parallel computing platforms. This parallel implementation of GEOS DAS will be used in the DAO's normal activities, which include reanalysis of data and operational support for flight missions. Key components of GEOS DAS, including the gridpoint-based general circulation model and a data analysis system, are currently being parallelized. The parallelization of GEOS DAS is also one of the HPCC Grand Challenge Projects. The GEOS DAS software employs several distinct grids. Some examples are: an observation grid (an unstructured grid of points at which observed or measured physical quantities from instruments or satellites are associated); a highly structured latitude-longitude grid of points spanning the earth at given latitude-longitude coordinates at which prognostic quantities are determined; and a computational lat-lon grid in which the pole has been moved to a different location to avoid computational instabilities. Each of these grids has a different structure and number of constituent points. In spite of that, there are numerous interactions between the grids, e.g., values on one grid must be interpolated to another, or, in other cases, grids need to be redistributed on the underlying parallel platform. The DAO has designed a parallel integrated library for grid manipulations (PILGRIM) to support the needed grid interactions with maximum efficiency. It offers a flexible interface to generate new grids, define transformations between grids and apply them. Basic communication is currently MPI; however, the interfaces defined here could conceivably be implemented with other message-passing libraries, e.g., Cray SHMEM, or with shared-memory constructs. The library is written in Fortran 90. First performance results indicate that even difficult problems, such as the above-mentioned pole rotation (a sparse interpolation with little data locality between the physical lat-lon grid and a pole-rotated computational grid), can be solved efficiently and at the GFlop/s rates needed to solve tomorrow's high-resolution earth science models. In the subsequent presentation we will discuss the design and implementation of PILGRIM as well as a number of the problems it is required to solve. Some conclusions will be drawn about the potential performance of the overall earth science models on the supercomputer platforms foreseen for these problems.

  10. Tile-based Level of Detail for the Parallel Age

    SciTech Connect

    Niski, K; Cohen, J D

    2007-08-15

    Today's PCs incorporate multiple CPUs and GPUs and are easily arranged in clusters for high-performance, interactive graphics. We present an approach based on hierarchical, screen-space tiles to parallelizing rendering with level of detail. Adapt tiles, render tiles, and machine tiles are associated with CPUs, GPUs, and PCs, respectively, to efficiently parallelize the workload with good resource utilization. Adaptive tile sizes provide load balancing while our level of detail system allows total and independent management of the load on CPUs and GPUs. We demonstrate our approach on parallel configurations consisting of both single PCs and a cluster of PCs.

  11. Programming parallel architectures: The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1988-01-01

    Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.

  12. Initial results of a model rotor higher harmonic control (HHC) wind tunnel experiment on BVI impulsive noise reduction

    NASA Astrophysics Data System (ADS)

    Splettstoesser, W. R.; Lehmann, G.; van der Wall, B.

    1989-09-01

    Initial acoustic results are presented from a higher harmonic control (HHC) wind tunnel pilot experiment on helicopter rotor blade-vortex interaction (BVI) impulsive noise reduction, making use of the DFVLR 40-percent-scaled BO-105 research rotor in the DNW 6m by 8m closed test section. Considerable noise reduction (of several decibels) has been measured for particular HHC control settings, however, at the cost of increased vibration levels and vice versa. The apparently adverse results for noise and vibration reduction by HHC are explained. At optimum pitch control settings for BVI noise reduction, rotor simulation results demonstrate that blade loading at the outer tip region is decreased, vortex strength and blade vortex miss-distance are increased, resulting altogether in reduced BVI noise generation. At optimum pitch control settings for vibration reduction adverse effects on blade loading, vortex strength and blade vortex miss-distance are found.

  13. Is Monte Carlo embarrassingly parallel?

    SciTech Connect

    Hoogenboom, J. E.

    2012-07-01

    Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup, and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turns out to be the rendezvous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Other time losses in the parallel calculation are also identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)

  14. Parallel NPARC: Implementation and Performance

    NASA Technical Reports Server (NTRS)

    Townsend, S. E.

    1996-01-01

    Version 3 of the NPARC Navier-Stokes code includes support for large-grain (block level) parallelism using explicit message passing between a heterogeneous collection of computers. This capability has the potential for significant performance gains, depending upon the block data distribution. The parallel implementation uses a master/worker arrangement of processes. The master process assigns blocks to workers, controls worker actions, and provides remote file access for the workers. The processes communicate via explicit message passing using an interface library which provides portability to a number of message passing libraries, such as PVM (Parallel Virtual Machine). A Bourne shell script is used to simplify the task of selecting hosts, starting processes, retrieving remote files, and terminating a computation. This script also provides a simple form of fault tolerance. An analysis of the computational performance of NPARC is presented, using data sets from an F/A-18 inlet study and a Rocket Based Combined Cycle Engine analysis. Parallel speedup and overall computational efficiency were obtained for various NPARC run parameters on a cluster of IBM RS6000 workstations. The data show that although NPARC performance compares favorably with the estimated potential parallelism, typical data sets used with previous versions of NPARC will often need to be reblocked for optimum parallel performance. In one of the cases studied, reblocking increased peak parallel speedup from 3.2 to 11.8.
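
    Why reblocking can raise the attainable speedup under block-level parallelism can be seen in a toy model. The sketch below is an assumption-laden illustration, not NPARC's scheduler: block costs are hypothetical, communication is ignored, and blocks are assigned to workers greedily, largest first. The bound used is total work divided by the largest per-worker load.

```python
import heapq


def assign_blocks(block_costs, n_workers):
    """Greedy largest-first assignment; returns the total cost per worker."""
    loads = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(loads)
    for cost in sorted(block_costs, reverse=True):
        load, worker = heapq.heappop(loads)  # least-loaded worker so far
        heapq.heappush(loads, (load + cost, worker))
    return [load for load, _ in sorted(loads, key=lambda item: item[1])]


def speedup_bound(block_costs, n_workers):
    """Upper bound on speedup: total work over the largest worker load."""
    loads = assign_blocks(block_costs, n_workers)
    return sum(block_costs) / max(loads)


# Hypothetical block costs with the same total work (100 units).
original = [60.0, 20.0, 10.0, 10.0]    # one dominant block
reblocked = [15.0] * 4 + [10.0] * 4    # the dominant block split into four

print(speedup_bound(original, 4))
print(speedup_bound(reblocked, 4))
```

    With one dominant block, four workers cannot beat a speedup of about 1.67 in this model; after splitting that block, the same four workers can reach 4.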

  15. Parallel processing and expert systems

    NASA Technical Reports Server (NTRS)

    Lau, Sonie; Yan, Jerry C.

    1991-01-01

    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited.

  16. Template based parallel checkpointing in a massively parallel computer system

    DOEpatents

    Archer, Charles Jens; Inglett, Todd Alan

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
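
    The idea of storing only the blocks that differ from a template checkpoint can be sketched as follows. This is a simplified illustration rather than the patented method: it compares fixed-offset blocks by checksum (no rsync-style rolling checksum, no broadcast, no compression), assumes the node data and the template have the same length, and all function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; unrealistically small, chosen only for illustration


def block_checksums(data, block_size=BLOCK_SIZE):
    """Checksum of every fixed-size block of `data`."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def delta_against_template(node_data, template_sums, block_size=BLOCK_SIZE):
    """Return {block_index: bytes} for blocks that differ from the template."""
    delta = {}
    for index, checksum in enumerate(block_checksums(node_data, block_size)):
        if checksum != template_sums[index]:
            start = index * block_size
            delta[index] = node_data[start:start + block_size]
    return delta


def restore(template, delta, block_size=BLOCK_SIZE):
    """Rebuild a node's checkpoint from the template plus its stored delta."""
    out = bytearray(template)
    for index, block in delta.items():
        start = index * block_size
        out[start:start + block_size] = block
    return bytes(out)


template = b"AAAABBBBCCCCDDDD"
node = b"AAAABxBBCCCCDDDz"          # differs from the template in blocks 1 and 3
delta = delta_against_template(node, block_checksums(template))
print(sorted(delta))
```

    Only two of the four blocks need to be transmitted and stored for this node; the rest are recovered from the shared template.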

  17. EFFICIENT SCHEDULING OF PARALLEL JOBS ON MASSIVELY PARALLEL SYSTEMS

    SciTech Connect

    F. PETRINI; W. FENG

    1999-09-01

    We present buffered coscheduling, a new methodology to multitask parallel jobs in a message-passing environment and to develop parallel programs that can pave the way to the efficient implementation of a distributed operating system. Buffered coscheduling is based on three innovative techniques: communication buffering, strobing, and non-blocking communication. By leveraging these techniques, we can perform effective optimizations based on the global status of the parallel machine rather than on the limited knowledge available locally to each processor. The advantages of buffered coscheduling include higher resource utilization, reduced communication overhead, efficient implementation of flow-control strategies and fault-tolerant protocols, accurate performance modeling, and a simplified yet still expressive parallel programming model. Preliminary experimental results show that buffered coscheduling is very effective in increasing the overall performance in the presence of load imbalance and communication-intensive workloads.

  18. Parallel integer sorting with medium and fine-scale parallelism

    NASA Technical Reports Server (NTRS)

    Dagum, Leonardo

    1993-01-01

    Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort is designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128-processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.

  19. Parallel inverse iteration with reorthogonalization

    SciTech Connect

    Fann, G.I.; Littlefield, R.J.

    1993-03-01

    A parallel method for finding orthogonal eigenvectors of real symmetric tridiagonal matrices is described. The method uses inverse iteration with repeated Modified Gram-Schmidt (MGS) reorthogonalization of the unconverged iterates for clustered eigenvalues. This approach is more parallelizable than reorthogonalizing against fully converged eigenvectors, as is done by LAPACK's current DSTEIN routine. The new method is found to provide accuracy and speed comparable to DSTEIN's and to have good parallel scalability even for matrices with large clusters of eigenvalues. We present results for residual and orthogonality tests, plus timings on IBM RS/6000 (sequential) and Intel Touchstone DELTA (parallel) computers.

  20. Parallel inverse iteration with reorthogonalization

    SciTech Connect

    Fann, G.I.; Littlefield, R.J.

    1993-03-01

    A parallel method for finding orthogonal eigenvectors of real symmetric tridiagonal matrices is described. The method uses inverse iteration with repeated Modified Gram-Schmidt (MGS) reorthogonalization of the unconverged iterates for clustered eigenvalues. This approach is more parallelizable than reorthogonalizing against fully converged eigenvectors, as is done by LAPACK's current DSTEIN routine. The new method is found to provide accuracy and speed comparable to DSTEIN's and to have good parallel scalability even for matrices with large clusters of eigenvalues. We present results for residual and orthogonality tests, plus timings on IBM RS/6000 (sequential) and Intel Touchstone DELTA (parallel) computers.

  1. Adaptive, multiresolution visualization of large data sets using parallel octrees.

    SciTech Connect

    Freitag, L. A.; Loy, R. M.

    1999-06-10

    The interactive visualization and exploration of large scientific data sets is a challenging task; their size often far exceeds the performance and memory capacity of even the most powerful graphics workstations. To address this problem, we have created a technique that combines hierarchical data reduction methods with parallel computing to allow interactive exploration of large data sets while retaining full-resolution capability. The hierarchical representation is built in parallel by strategically inserting field data into an octree data structure. We provide functionality that allows the user to interactively adapt the resolution of the reduced data sets so that resolution is increased in regions of interest without sacrificing local graphics performance. We describe the creation of the reduced data sets using a parallel octree, the software architecture of the system, and the performance of this system on the data from a Rayleigh-Taylor instability simulation.

  2. Demonstrating Forces between Parallel Wires.

    ERIC Educational Resources Information Center

    Baker, Blane

    2000-01-01

    Describes a physics demonstration that dramatically illustrates the mutual repulsion (attraction) between parallel conductors using insulated copper wire, wooden dowels, a high direct current power supply, electrical tape, and an overhead projector. (WRM)

  3. "Feeling" Series and Parallel Resistances.

    ERIC Educational Resources Information Center

    Morse, Robert A.

    1993-01-01

    Equipped with drinking straws and stirring straws, a teacher can help students understand how resistances in electric circuits combine in series and in parallel. Follow-up suggestions are provided. (ZWH)
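
    The combination rules that the straw analogy is meant to convey are the standard ones: resistances in series add, and resistances in parallel add as reciprocals. A minimal sketch, with function names and resistor values chosen here purely for illustration:

```python
def series(*resistances):
    """Equivalent resistance of resistors in series: R = R1 + R2 + ..."""
    return sum(resistances)


def parallel(*resistances):
    """Equivalent resistance in parallel: 1/R = 1/R1 + 1/R2 + ..."""
    return 1.0 / sum(1.0 / r for r in resistances)


# Two identical 100-ohm resistors (hypothetical values).
print(series(100.0, 100.0))
print(parallel(100.0, 100.0))
```

    Two equal resistors double the resistance in series and halve it in parallel, in the same way that straws joined end to end are harder to blow through and straws held side by side are easier.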

  4. Turbomachinery CFD on parallel computers

    NASA Technical Reports Server (NTRS)

    Blech, Richard A.; Milner, Edward J.; Quealy, Angela; Townsend, Scott E.

    1992-01-01

    The role of multistage turbomachinery simulation in the development of propulsion system models is discussed. Particularly, the need for simulations with higher fidelity and faster turnaround time is highlighted. It is shown how such fast simulations can be used in engineering-oriented environments. The use of parallel processing to achieve the required turnaround times is discussed. Current work by several researchers in this area is summarized. Parallel turbomachinery CFD research at the NASA Lewis Research Center is then highlighted. These efforts are focused on implementing the average-passage turbomachinery model on MIMD, distributed memory parallel computers. Performance results are given for inviscid, single blade row and viscous, multistage applications on several parallel computers, including networked workstations.

  5. Parallel algorithms for message decomposition

    SciTech Connect

    Teng, S.H.; Wang, B.

    1987-06-01

    The authors consider the deterministic and random parallel complexity (time and processor) of message decoding: an essential problem in communications systems and translation systems. They present an optimal parallel algorithm to decompose prefix-coded messages and uniquely decipherable-coded messages in O(n/P) time, using O(P) processors (for all P: 1 ≤ P ≤ n/log n) deterministically as well as randomly on the weakest version of parallel random access machines in which concurrent read and concurrent write to a cell in the common memory are not allowed. This is done by reducing decoding to parallel finite-state automata simulation and the prefix sums.
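
    The reduction named in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a hypothetical three-symbol prefix code, and the names CODE, build_automaton, bit_map, compose, and decode are invented for illustration. Each input bit acts as a transition function on the decoder's states, and because function composition is associative, the per-bit functions can be combined by a prefix scan (run sequentially here) instead of strictly left to right.

```python
from itertools import accumulate

# Hypothetical prefix code, chosen only for illustration.
CODE = {"a": "0", "b": "10", "c": "11"}

def build_automaton(code):
    """Build a decoder automaton whose states are proper prefixes of codewords.

    delta[state][bit] = (next_state, emitted_symbol_or_None).
    """
    inverse = {word: sym for sym, word in code.items()}
    states = {""}
    for word in code.values():
        for i in range(1, len(word)):
            states.add(word[:i])
    delta = {}
    for st in states:
        delta[st] = {}
        for bit in "01":
            nxt = st + bit
            if nxt in inverse:          # a full codeword: emit and reset
                delta[st][bit] = ("", inverse[nxt])
            elif nxt in states:         # still inside a codeword
                delta[st][bit] = (nxt, None)
    return sorted(states), delta

def bit_map(delta, states, bit):
    """Transition function of one bit, as a dict state -> next state."""
    return {s: delta[s][bit][0] for s in states}

def compose(f, g):
    """Apply f, then g. Composition is associative, so it can be scanned."""
    return {s: g[f[s]] for s in f}

def decode(bits, code=CODE):
    states, delta = build_automaton(code)
    maps = [bit_map(delta, states, b) for b in bits]
    # Inclusive scan over the maps; a parallel machine could do this as a tree.
    prefix = list(accumulate(maps, compose))
    # The state before bit i is prefix[i-1] applied to the start state "".
    before = [""] + [p[""] for p in prefix[:-1]]
    emitted = [delta[s][b][1] for s, b in zip(before, bits)]
    return "".join(sym for sym in emitted if sym is not None)
```

    Once the state before every bit is known, each position can decide independently whether it ends a codeword, which is the step that makes the decoding data-parallel.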

  6. Parallel node placement method by bubble simulation

    NASA Astrophysics Data System (ADS)

    Nie, Yufeng; Zhang, Weiwei; Qi, Nan; Li, Yiqiang

    2014-03-01

    An efficient Parallel Node Placement method by Bubble Simulation (PNPBS), employing METIS-based domain decomposition (DD) for an arbitrary number of processors, is introduced. In accordance with the desired nodal density and Newton's Second Law of Motion, automatic generation of node sets by bubble simulation has been demonstrated in previous work. Since the interaction force between nodes is short-range, the positions and velocities of two distant nodes can be updated simultaneously and independently during dynamic simulation. This inherent parallelism makes the method well suited to parallel computing. In this PNPBS method, the METIS-based DD scheme has been investigated for uniform and non-uniform node sets, and dynamic load balancing is obtained by evenly distributing work among the processors. For the nodes near the common interface of two neighboring subdomains, there is no need for special treatment after dynamic simulation. These nodes have good geometrical properties and a smooth density distribution, which is desirable in the numerical solution of partial differential equations (PDEs). The results of numerical examples show that quasi-linear speedup in the number of processors and high efficiency are achieved.

  7. Parallel Strategies for Crash and Impact Simulations

    SciTech Connect

    Attaway, S.; Brown, K.; Hendrickson, B.; Plimpton, S.

    1998-12-07

    We describe a general strategy we have found effective for parallelizing solid mechanics simulations. Such simulations often have several computationally intensive parts, including finite element integration, detection of material contacts, and particle interaction if smoothed particle hydrodynamics is used to model highly deforming materials. The need to balance all of these computations simultaneously is a difficult challenge that has kept many commercial and government codes from being used effectively on parallel supercomputers with hundreds or thousands of processors. Our strategy is to load-balance each of the significant computations independently with whatever balancing technique is most appropriate. The chief benefit is that each computation can be scalably parallelized. The drawback is the data exchange between processors and extra coding that must be written to maintain multiple decompositions in a single code. We discuss these trade-offs and give performance results showing this strategy has led to a parallel implementation of a widely-used solid mechanics code that can now be run efficiently on thousands of processors of the Pentium-based Sandia/Intel TFLOPS machine. We illustrate with several examples the kinds of high-resolution, million-element models that can now be simulated routinely. We also look to the future and discuss what possibilities this new capability promises, as well as the new set of challenges it poses in material models, computational techniques, and computing infrastructure.

  8. Appendix E: Parallel Pascal development system

    NASA Technical Reports Server (NTRS)

    1985-01-01

    The Parallel Pascal Development System enables Parallel Pascal programs to be developed and tested on a conventional computer. It consists of several system programs, including a Parallel Pascal to standard Pascal translator, and a library of Parallel Pascal subprograms. The library includes subprograms for using Parallel Pascal on a parallel system with a fixed degree of parallelism, such as the Massively Parallel Processor, to conveniently manipulate arrays which have dimensions larger than the hardware. Programs can be conveniently tested with small arrays on the conventional computer before attempting to run on a parallel system.

  9. Evaluation of the Interactions between Water Extractable Soil Organic Matter and Metal Cations (Cu(II), Eu(III)) Using Excitation-Emission Matrix Combined with Parallel Factor Analysis

    PubMed Central

    Wei, Jing; Han, Lu; Song, Jing; Chen, Mengfang

    2015-01-01

    The objectives of this study were to evaluate the binding behavior of Cu(II) and Eu(III) with water extractable organic matter (WEOM) in soil, and assess the competitive effect of the cations. Excitation-emission matrix (EEM) fluorescence spectrometry was used in combination with parallel factor analysis (PARAFAC) to obtain four WEOM components: fulvic-like, humic-like, microbial degraded humic-like, and protein-like substances. Fluorescence titration experiments were performed to obtain the binding parameters of PARAFAC-derived components with Cu(II) and Eu(III). The conditional complexation stability constants (logKM) of Cu(II) with the four components ranged from 5.49 to 5.94, and the Eu(III) logKM values were between 5.26 and 5.81. The component-specific binding parameters obtained from competitive binding experiments revealed that Cu(II) and Eu(III) competed for the same binding sites on the WEOM components. These results should help clarify the molecular binding mechanisms of Cu(II) and Eu(III) with WEOM in the soil environment. PMID:26121300

  10. Address tracing for parallel machines

    NASA Technical Reports Server (NTRS)

    Stunkel, Craig B.; Janssens, Bob; Fuchs, W. Kent

    1991-01-01

    Recently implemented parallel system address-tracing methods based on several metrics are surveyed. The issues specific to collection of traces for both shared and distributed memory parallel computers are highlighted. Five general categories of address-trace collection methods are examined: hardware-captured, interrupt-based, simulation-based, altered microcode-based, and instrumented program-based traces. The problems unique to shared memory and distributed memory multiprocessors are examined separately.

  11. Matching parallel algorithm and architecture

    SciTech Connect

    Chiang, Y.P.; Fu, K.S.

    1983-01-01

    An attributed directed graph model which is a combination of high-level Petri Nets and and/or graphs is described. This model provides a method for matching parallel algorithms to architectures or vice versa. The analysis of parallel computation using this model is described. Examples are given to demonstrate the descriptive power of this model and how it helps us to match an algorithm and an architecture. 18 references.

  12. Parallel architectures for problem solving

    SciTech Connect

    Kale, L.V.

    1985-01-01

    The problem of exploiting a large amount of hardware in parallel is one of the biggest challenges facing computer science today. The problem of designing parallel architectures and execution methods for solving large combinatorially explosive problems is studied here. Such problems typically do not have a regular structure that can be readily exploited for parallel execution. Prolog is chosen as a language to specify computation because it is seen as a language that is conceptually simple as well as amenable to parallel interpretation. A tree representation of Prolog computation called the REDUCE-OR tree is described as an alternative to the AND-OR tree representation. A process model based on this representation is developed; it captures more parallelism than most other proposed models. A class of bus architectures is proposed to implement the process model. A general model of parallel Prolog systems is developed and the proposed architectures examined in its framework. One of the important features of the proposed architectures is that they limit contracting of work to a close neighborhood. Various interconnection networks are analyzed, and a new one called the lattice-mesh is proposed. The lattice-mesh improves on the square grid of buses, while retaining its linear-area property. An extensive simulation framework was built. Results of some of the experiments conducted on the simulation system are given.

  13. Architectures for reasoning in parallel

    NASA Technical Reports Server (NTRS)

    Hall, Lawrence O.

    1989-01-01

    The research conducted has dealt with rule-based expert systems. The algorithms that may lead to effective parallelization of them were investigated. Both the forward and backward chained control paradigms were investigated in the course of this work. The best computer architecture for the developed and investigated algorithms has been researched. Two experimental vehicles were developed to facilitate this research. They are Backpac, a parallel backward chained rule-based reasoning system and Datapac, a parallel forward chained rule-based reasoning system. Both systems have been written in Multilisp, a version of Lisp which contains the parallel construct, future. Applying the future function to a function causes the function to become a task parallel to the spawning task. Additionally, Backpac and Datapac have been run on several disparate parallel processors. The machines are an Encore Multimax with 10 processors, the Concert Multiprocessor with 64 processors, and a 32 processor BBN GP1000. Both the Concert and the GP1000 are switch-based machines. The Multimax has all its processors hung off a common bus. All are shared memory machines, but have different schemes for sharing the memory and different locales for the shared memory. The main results of the investigations come from experiments on the 10 processor Encore and the Concert with partitions of 32 or less processors. Additionally, experiments have been run with a stripped down version of EMYCIN.

  14. Efficiency of parallel direct optimization

    NASA Technical Reports Server (NTRS)

    Janies, D. A.; Wheeler, W. C.

    2001-01-01

    Tremendous progress has been made at the level of sequential computation in phylogenetics. However, little attention has been paid to parallel computation. Parallel computing is particularly suited to phylogenetics because of the many ways large computational problems can be broken into parts that can be analyzed concurrently. In this paper, we investigate the scaling factors and efficiency of random addition and tree refinement strategies using the direct optimization software, POY, on a small (10 slave processors) and a large (256 slave processors) cluster of networked PCs running LINUX. These algorithms were tested on several data sets composed of DNA and morphology ranging from 40 to 500 taxa. Various algorithms in POY show fundamentally different properties within and between clusters. All algorithms are efficient on the small cluster for the 40-taxon data set. On the large cluster, multibuilding exhibits excellent parallel efficiency, whereas parallel building is inefficient. These results are independent of data set size. Branch swapping in parallel shows excellent speed-up for 16 slave processors on the large cluster. However, there is no appreciable speed-up for branch swapping with the further addition of slave processors (>16). This result is independent of data set size. Ratcheting in parallel is efficient with the addition of up to 32 processors in the large cluster. This result is independent of data set size. © 2001 The Willi Hennig Society.
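
    The terms speed-up and parallel efficiency used in this abstract are conventionally defined as in the sketch below. These are the standard textbook definitions, assumed here rather than taken from the paper, and the function names are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor run time to parallel run time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    """Speed-up per processor; 1.0 would be ideal linear scaling."""
    return speedup(t_serial, t_parallel) / n_procs
```

    With made-up timings, a job that takes 100 s serially and 10 s on 16 processors has a speed-up of 10 and an efficiency of 0.625. A speed-up that stops growing as processors are added, as reported for branch swapping beyond 16 processors, shows up as falling efficiency.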

  15. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    SciTech Connect

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-08-12

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  16. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J; Blocksome, Michael E; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  17. A parallel algorithm for implicit depletant simulations

    NASA Astrophysics Data System (ADS)

    Glaser, Jens; Karas, Andrew S.; Glotzer, Sharon C.

    2015-11-01

    We present an algorithm to simulate the many-body depletion interaction between anisotropic colloids in an implicit way, integrating out the degrees of freedom of the depletants, which we treat as an ideal gas. Because the depletant particles are statistically independent and the depletion interaction is short-ranged, depletants are randomly inserted in parallel into the excluded volume surrounding a single translated and/or rotated colloid. A configurational bias scheme is used to enhance the acceptance rate. The method is validated and benchmarked both on multi-core processors and graphics processing units for the case of hard spheres, hemispheres, and discoids. With depletants, we report novel cluster phases in which hemispheres first assemble into spheres, which then form ordered hcp/fcc lattices. The method is significantly faster than any method without cluster moves and that tracks depletants explicitly, for systems of colloid packing fraction φc < 0.50, and additionally enables simulation of the fluid-solid transition.

  18. Parallel transport of long mean-free-path plasma along open magnetic field lines: Parallel heat flux

    SciTech Connect

    Guo Zehua; Tang Xianzhu

    2012-06-15

    In a long mean-free-path plasma where temperature anisotropy can be sustained, the parallel heat flux has two components with one associated with the parallel thermal energy and the other the perpendicular thermal energy. Due to the large deviation of the distribution function from local Maxwellian in an open field line plasma with low collisionality, the conventional perturbative calculation of the parallel heat flux closure in its local or non-local form is no longer applicable. Here, a non-perturbative calculation is presented for a collisionless plasma in a two-dimensional flux expander bounded by absorbing walls. Specifically, closures of previously unfamiliar form are obtained for ions and electrons, which relate two distinct components of the species parallel heat flux to the lower order fluid moments such as density, parallel flow, parallel and perpendicular temperatures, and the field quantities such as the magnetic field strength and the electrostatic potential. The plasma source and boundary condition at the absorbing wall enter explicitly in the closure calculation. Although the closure calculation does not take into account wave-particle interactions, the results based on passing orbits from steady-state collisionless drift-kinetic equation show remarkable agreement with fully kinetic-Maxwell simulations. As an example of the physical implications of the theory, the parallel heat flux closures are found to predict a surprising observation in the kinetic-Maxwell simulation of the 2D magnetic flux expander problem, where the parallel heat flux of the parallel thermal energy flows from low to high parallel temperature region.

  19. Parallel Implicit Algorithms for CFD

    NASA Technical Reports Server (NTRS)

    Keyes, David E.

    1998-01-01

    The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD). "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in Message Passing Interface (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSC library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSC during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSC framework.

  20. Parallel stochastic systems biology in the cloud.

    PubMed

    Aldinucci, Marco; Torquati, Massimo; Spampinato, Concetto; Drocco, Maurizio; Misale, Claudia; Calcagno, Cristina; Coppo, Mario

    2014-09-01

    The stochastic modelling of biological systems, coupled with Monte Carlo simulation of models, is an increasingly popular technique in bioinformatics. The simulation-analysis workflow may prove computationally expensive, reducing the interactivity required in model tuning. In this work, we advocate high-level software design as a vehicle for building efficient and portable parallel simulators for the cloud. In particular, the Calculus of Wrapped Components (CWC) simulator for systems biology, which is designed according to the FastFlow pattern-based approach, is presented and discussed. Thanks to the FastFlow framework, the CWC simulator is designed as a high-level workflow that can simulate CWC models, merge simulation results and statistically analyse them in a single parallel workflow in the cloud. To improve interactivity, successive phases are pipelined in such a way that the workflow begins to output a stream of analysis results immediately after simulation is started. Performance and effectiveness of the CWC simulator are validated on the Amazon Elastic Compute Cloud. PMID:23780997

  1. A parallel Jacobson-Oksman optimization algorithm. [parallel processing (computers)]

    NASA Technical Reports Server (NTRS)

    Straeter, T. A.; Markos, A. T.

    1975-01-01

    A gradient-dependent optimization technique which exploits the vector-streaming or parallel-computing capabilities of some modern computers is presented. The algorithm, derived by assuming that the function to be minimized is homogeneous, is a modification of the Jacobson-Oksman serial minimization method. In addition to describing the algorithm, conditions insuring the convergence of the iterates of the algorithm and the results of numerical experiments on a group of sample test functions are presented. The results of these experiments indicate that this algorithm will solve optimization problems in less computing time than conventional serial methods on machines having vector-streaming or parallel-computing capabilities.

  2. Parallel plasma fluid turbulence calculations

    SciTech Connect

    Leboeuf, J.N.; Carreras, B.A.; Charlton, L.A.; Drake, J.B.; Lynch, V.E.; Newman, D.E.; Sidikman, K.L.; Spong, D.A.

    1994-12-31

    The study of plasma turbulence and transport is a complex problem of critical importance for fusion-relevant plasmas. To this day, the fluid treatment of plasma dynamics is the best approach to realistic physics at the high resolution required for certain experimentally relevant calculations. Core and edge turbulence in a magnetic fusion device have been modeled using state-of-the-art, nonlinear, three-dimensional, initial-value fluid and gyrofluid codes. Parallel implementation of these models on diverse platforms--vector parallel (National Energy Research Supercomputer Center's CRAY Y-MP C90), massively parallel (Intel Paragon XP/S 35), and serial parallel (clusters of high-performance workstations using the Parallel Virtual Machine protocol)--offers a variety of paths to high resolution and significant improvements in real-time efficiency, each with its own advantages. The largest and most efficient calculations have been performed at the 200 Mword memory limit on the C90 in dedicated mode, where an overlap of 12 to 13 out of a maximum of 16 processors has been achieved with a gyrofluid model of core fluctuations. The richness of the physics captured by these calculations is commensurate with the increased resolution and efficiency and is limited only by the ingenuity brought to the analysis of the massive amounts of data generated.

  3. Parallelizing Timed Petri Net simulations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1993-01-01

    The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold; it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast, were developed. However, very much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included.

  4. Parallel computation and computers for artificial intelligence

    SciTech Connect

    Kowalik, J.S.

    1988-01-01

    This book discusses Parallel Processing in Artificial Intelligence; Parallel Computing using Multilisp; Execution of Common Lisp in a Parallel Environment; Qlisp; Restricted AND-Parallel Execution of Logic Programs; PARLOG: Parallel Programming in Logic; and Data-driven Processing of Semantic Nets. Attention is also given to: Application of the Butterfly Parallel Processor in Artificial Intelligence; On the Range of Applicability of an Artificial Intelligence Machine; Low-level Vision on Warp and the Apply Programming Mode; AHR: A Parallel Computer for Pure Lisp; FAIM-1: An Architecture for Symbolic Multi-processing; and Overview of AI Application Oriented Parallel Processing Research in Japan.

  5. Massively parallel MRI detector arrays.

    PubMed

    Keil, Boris; Wald, Lawrence L

    2013-04-01

    Originally proposed as a method to increase sensitivity by extending the locally high-sensitivity of small surface coil elements to larger areas via reception, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called "ultimate" SNR and g-factor. We also review the methods for optimally combining array data and changes in RF methodology needed to construct massively parallel MRI detector arrays and show some examples of state-of-the-art for highly accelerated imaging with the resulting highly parallel arrays. PMID:23453758

  6. Parallel Adaptive Mesh Refinement Library

    NASA Technical Reports Server (NTRS)

    MacNeice, Peter; Olson, Kevin

    2005-01-01

    Parallel Adaptive Mesh Refinement Library (PARAMESH) is a package of Fortran 90 subroutines designed to provide a computer programmer with an easy route to extension of (1) a previously written serial code that uses a logically Cartesian structured mesh into (2) a parallel code with adaptive mesh refinement (AMR). Alternatively, in its simplest use, and with minimal effort, PARAMESH can operate as a domain-decomposition tool for users who want to parallelize their serial codes but who do not wish to utilize adaptivity. The package builds a hierarchy of sub-grids to cover the computational domain of a given application program, with spatial resolution varying to satisfy the demands of the application. The sub-grid blocks form the nodes of a tree data structure (a quad-tree in two or an oct-tree in three dimensions). Each grid block has a logically Cartesian mesh. The package supports one-, two- and three-dimensional models.
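
    The tree-of-blocks structure described in this abstract can be sketched as follows. This is a minimal Python illustration of a two-dimensional quad-tree of blocks, not PARAMESH's Fortran 90 interface; the class Block, its methods, the function refine_where, and the refinement criterion are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    """A square block covering [x, x+size) x [y, y+size) at a refinement level."""
    x: float
    y: float
    size: float
    level: int = 0
    children: list = field(default_factory=list)

    def refine(self):
        """Split this block into four children of half the size."""
        half = self.size / 2
        self.children = [
            Block(self.x + dx * half, self.y + dy * half, half, self.level + 1)
            for dy in (0, 1) for dx in (0, 1)
        ]

    def leaves(self):
        """Return the leaf blocks, which together tile the whole domain."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

def refine_where(block, needs_refinement, max_level):
    """Recursively refine every block flagged by a user-supplied predicate."""
    if block.level < max_level and needs_refinement(block):
        block.refine()
        for child in block.children:
            refine_where(child, needs_refinement, max_level)

def contains_point(b, px=0.1, py=0.1):
    # Hypothetical criterion: refine blocks containing the point (0.1, 0.1).
    return b.x <= px < b.x + b.size and b.y <= py < b.y + b.size

root = Block(0.0, 0.0, 1.0)
refine_where(root, contains_point, max_level=3)
```

    Only the leaf blocks would carry a mesh in such a scheme, so resolution is concentrated where the criterion fires while coarse blocks cover the rest of the domain.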

  7. PARAVT: Parallel Voronoi Tessellation code

    NASA Astrophysics Data System (ADS)

    Gonzalez, Roberto E.

    2016-01-01

    We present a new open source code for massively parallel computation of Voronoi tessellations (VT hereafter) in large data sets. The code is designed for astrophysical purposes, where VT densities and neighbors are widely used. There are several serial Voronoi tessellation codes; however, no open source parallel implementations are available to handle the large number of particles/galaxies in current N-body simulations and sky surveys. Parallelization is implemented under MPI, and VT is computed using the Qhull library. Domain decomposition takes into account consistent boundary computation between tasks and supports periodic conditions. In addition, the code computes neighbor lists, Voronoi density and Voronoi cell volumes for each particle, and can compute density on a regular grid.

  8. Visualizing Parallel Computer System Performance

    NASA Technical Reports Server (NTRS)

    Malony, Allen D.; Reed, Daniel A.

    1988-01-01

    Parallel computer systems are among the most complex of man's creations, making satisfactory performance characterization difficult. Despite this complexity, there are strong, indeed, almost irresistible, incentives to quantify parallel system performance using a single metric. The fallacy lies in succumbing to such temptations. A complete performance characterization requires not only an analysis of the system's constituent levels, it also requires both static and dynamic characterizations. Static or average behavior analysis may mask transients that dramatically alter system performance. Although the human visual system is remarkably adept at interpreting and identifying anomalies in false color data, the importance of dynamic, visual scientific data presentation has only recently been recognized. Large, complex parallel systems pose equally vexing performance interpretation problems. Data from hardware and software performance monitors must be presented in ways that emphasize important events while eliding irrelevant details. Design approaches and tools for performance visualization are the subject of this paper.

  9. Parallel integrated frame synchronizer chip

    NASA Technical Reports Server (NTRS)

    Ghuman, Parminder Singh (Inventor); Solomon, Jeffrey Michael (Inventor); Bennett, Toby Dennis (Inventor)

    2000-01-01

    A parallel integrated frame synchronizer which implements a sequential pipeline process wherein serial data in the form of telemetry data or weather satellite data enters the synchronizer by means of a front-end subsystem and passes to a parallel correlator subsystem or a weather satellite data processing subsystem. When in a CCSDS mode, data from the parallel correlator subsystem passes through a window subsystem, then to a data alignment subsystem and then to a bit transition density (BTD)/cyclical redundancy check (CRC) decoding subsystem. Data from the BTD/CRC decoding subsystem or data from the weather satellite data processing subsystem is then fed to an output subsystem where it is output from a data output port.

  10. Massively Parallel MRI Detector Arrays

    PubMed Central

    Keil, Boris; Wald, Lawrence L

    2013-01-01

    Originally proposed as a method to increase sensitivity by extending the locally high-sensitivity of small surface coil elements to larger areas, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called ultimate SNR and g-factor. We also review the methods for optimally combining array data and changes in RF methodology needed to construct massively parallel MRI detector arrays and show some examples of state-of-the-art for highly accelerated imaging with the resulting highly parallel arrays. PMID:23453758

  11. Fast data parallel polygon rendering

    SciTech Connect

    Ortega, F.A.; Hansen, C.D.

    1993-09-01

    This paper describes a parallel method for polygonal rendering on a massively parallel SIMD machine. This method, based on a simple shading model, is targeted at applications that require very fast polygon rendering for extremely large sets of polygons, such as are found in many scientific visualization applications. The algorithms described in this paper are incorporated into a library of 3D graphics routines written for the Connection Machine. The routines are implemented on both the CM-200 and the CM-5. This library enables scientists to display 3D shaded polygons directly from a parallel machine without the need to transmit huge amounts of data to a post-processing rendering system.

  12. Parallel algorithms for mapping pipelined and parallel computations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1988-01-01

    Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm^3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm^2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.
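
    To make the mapping problem concrete, the sketch below assigns a chain of m module weights to at most n processors as contiguous blocks, minimizing the heaviest block (the pipeline bottleneck). It is not the algorithm from the paper: it uses a textbook binary search with a greedy feasibility probe, and it assumes positive integer weights. The names `fits` and `min_bottleneck` are illustrative only.

```python
def fits(weights, n, limit):
    """Greedy probe: can the chain be cut into at most n contiguous
    blocks whose loads all stay within `limit`?"""
    blocks, load = 1, 0
    for w in weights:
        if w > limit:
            return False          # a single module already exceeds the limit
        if load + w > limit:
            blocks += 1           # start a new block on the next processor
            load = w
        else:
            load += w
    return blocks <= n


def min_bottleneck(weights, n):
    """Smallest achievable maximum block load (assumes positive integers)."""
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(weights, n, mid):
            hi = mid              # feasible: try a tighter bottleneck
        else:
            lo = mid + 1          # infeasible: the bottleneck must grow
    return lo


best = min_bottleneck([2, 3, 4, 5, 6], 2)
```

    Here the best split is [2, 3, 4] and [5, 6], so the bottleneck is 11. Each probe is linear in m, and the search adds a logarithmic factor in the total weight.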

  13. Gang scheduling a parallel machine

    SciTech Connect

    Gorda, B.C.; Brooks, E.D. III.

    1991-03-01

    Program development on parallel machines can be a nightmare of scheduling headaches. We have developed a portable time sharing mechanism to handle the problem of scheduling gangs of processors. User programs and their gangs of processors are put to sleep and awakened by the gang scheduler to provide a time sharing environment. Time quanta are adjusted according to priority queues and a system of fair share accounting. The initial platform for this software is the 128 processor BBN TC2000 in use in the Massively Parallel Computing Initiative at the Lawrence Livermore National Laboratory. 2 refs., 1 fig.

  14. ITER LHe Plants Parallel Operation

    NASA Astrophysics Data System (ADS)

    Fauve, E.; Bonneton, M.; Chalifour, M.; Chang, H.-S.; Chodimella, C.; Monneret, E.; Vincent, G.; Flavien, G.; Fabre, Y.; Grillot, D.

    The ITER Cryogenic System includes three identical liquid helium (LHe) plants, with a total average cooling capacity equivalent to 75 kW at 4.5 K. The LHe plants provide the 4.5 K cooling power to the magnets and cryopumps. They are designed to operate in parallel and to handle heavy load variations. In this proceeding we will describe the present status of the ITER LHe plants with emphasis on i) the project schedule, ii) the plants' characteristics/layout and iii) the basic principles and control strategies for a stable operation of the three LHe plants in parallel.

  15. Medipix2 parallel readout system

    NASA Astrophysics Data System (ADS)

    Fanti, V.; Marzeddu, R.; Randaccio, P.

    2003-08-01

    A fast parallel readout system based on a PCI board has been developed in the framework of the Medipix collaboration. The readout electronics consists of two boards: the motherboard directly interfacing the Medipix2 chip, and the PCI board with digital I/O ports 32 bits wide. The device driver and readout software have been developed at low level in Assembler to allow fast data transfer and image reconstruction. The parallel readout permits a transfer rate up to 64 Mbytes/s. http://medipix.web.cern.ch/MEDIPIX/

  16. Postscript: Parallel Distributed Processing in Localist Models without Thresholds

    ERIC Educational Resources Information Center

    Plaut, David C.; McClelland, James L.

    2010-01-01

    The current authors reply to a response by Bowers on a comment by the current authors on the original article. Bowers (2010) mischaracterizes the goals of parallel distributed processing (PDP) research--explaining performance on cognitive tasks is the primary motivation. More important, his claim that localist models, such as the interactive

  17. The AIS-5000 parallel processor

    SciTech Connect

    Schmitt, L.A.; Wilson, S.S.

    1988-05-01

    The AIS-5000 is a commercially available massively parallel processor which has been designed to operate in an industrial environment. It has fine-grained parallelism with up to 1024 processing elements arranged in a single-instruction multiple-data (SIMD) architecture. The processing elements are arranged in a one-dimensional chain that, for computer vision applications, can be as wide as the image itself. This architecture has better cost/performance characteristics than two-dimensional mesh-connected systems. The design of the processing elements and their interconnections as well as the software used to program the system allow a wide variety of algorithms and applications to be implemented. In this paper, the overall architecture of the system is described. Various components of the system are discussed, including details of the processing elements, data I/O pathways and parallel memory organization. A virtual two-dimensional model for programming image-based algorithms for the system is presented. This model is supported by the AIS-5000 hardware and software and allows the system to be treated as a full-image-size, two-dimensional, mesh-connected parallel processor. Performance benchmarks are given for certain simple and complex functions.

  18. Tutorial: Parallel Simulation on Supercomputers

    SciTech Connect

    Perumalla, Kalyan S

    2012-01-01

    This tutorial introduces typical hardware and software characteristics of extant and emerging supercomputing platforms, and presents issues and solutions in executing large-scale parallel discrete event simulation scenarios on such high performance computing systems. Covered topics include synchronization, model organization, example applications, and observed performance from illustrative large-scale runs.

  19. GRay: Massive parallel ODE integrator

    NASA Astrophysics Data System (ADS)

    Chan, Chi-kwan; Psaltis, Dimitrios; Ozel, Feryal

    2014-03-01

    GRay is a massive parallel ordinary differential equation integrator that employs the "stream processing paradigm." It is designed to efficiently integrate billions of photons in curved spacetime according to Einstein's general theory of relativity. The code is implemented in CUDA C/C++.

  20. Matpar: Parallel Extensions for MATLAB

    NASA Technical Reports Server (NTRS)

    Springer, P. L.

    1998-01-01

    Matpar is a set of client/server software that allows a MATLAB user to take advantage of a parallel computer for very large problems. The user can replace calls to certain built-in MATLAB functions with calls to Matpar functions.

  1. Parallel, Distributed Scripting with Python

    SciTech Connect

    Miller, P J

    2002-05-24

    Parallel computers used to be, for the most part, one-of-a-kind systems which were extremely difficult to program portably. With SMP architectures, the advent of the POSIX thread API and OpenMP gave developers ways to portably exploit on-the-box shared memory parallelism. Since these architectures didn't scale cost-effectively, distributed memory clusters were developed. The associated MPI message passing libraries gave these systems a portable paradigm too. Having programmers effectively use this paradigm is a somewhat different question. Distributed data has to be explicitly transported via the messaging system in order for it to be useful. In high level languages, the MPI library gives access to data distribution routines in C, C++, and FORTRAN. But we need more than that. Many reasonable and common tasks are best done in (or as extensions to) scripting languages. Consider sysadm tools such as password crackers, file purgers, etc. These are simple to write in a scripting language such as Python (an open source, portable, and freely available interpreter). But these tasks beg to be done in parallel. Consider a password checker that checks an encrypted password against a 25,000 word dictionary. This can take around 10 seconds in Python (6 seconds in C). It is trivial to parallelize if you can distribute the information and co-ordinate the work.
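
    The dictionary check described above can be sketched as follows. This is an illustration under stated assumptions, not the author's code: it uses the standard library's `concurrent.futures` thread pool in place of MPI, SHA-256 in place of the system password hash, and a synthetic 25,000 word list. The helpers `digest`, `search_chunk` and `parallel_check` are hypothetical names.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(word):
    # Stand-in for the real password hash; SHA-256 is an assumption.
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def search_chunk(chunk, target):
    # Each worker scans only its own slice of the dictionary.
    for word in chunk:
        if digest(word) == target:
            return word
    return None


def parallel_check(words, target, workers=4):
    # Distribute the information: deal the dictionary out round-robin.
    chunks = [words[i::workers] for i in range(workers)]
    # Co-ordinate the work: run every search, then gather the results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(search_chunk, chunks, [target] * workers))
    hits = [r for r in results if r is not None]
    return hits[0] if hits else None


words = [f"word{i}" for i in range(25000)]   # synthetic dictionary
found = parallel_check(words, digest("word12345"))
```

    Threads show the structure only; in CPython they are unlikely to speed up hashing of short strings. The same split-and-gather shape carries over to process pools or to MPI scatter and gather calls on a cluster.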

  2. Parallel computing: A case study

    NASA Astrophysics Data System (ADS)

    Slaets, Jan F. W.; Travieso, Gonzalo

    1989-11-01

    A simple molecular dynamics simulation is used to analyze some speed optimization techniques. The efficiency of sequential and parallel algorithms is discussed. An implementation on a T800 transputer array is proposed and the estimated performance is compared with that obtained on a supercomputer.

  3. Identification of mercury and other metals complexes with metallothioneins in dolphin liver by hydrophilic interaction liquid chromatography with the parallel detection by ICP MS and electrospray hybrid linear/orbital trap MS/MS.

    PubMed

    Pedrero, Z; Ouerdane, L; Mounicou, S; Lobinski, R; Monperrus, M; Amouroux, D

    2012-05-01

    A novel analytical procedure for the identification of metal (Hg, Cd, Cu, Zn) complexes with individual metallothionein (MT) isoforms in biological tissues by electrospray MS/MS was developed. The sample preparation was reduced to three rapid steps: the two-fold dilution of the sample cytosol with acetonitrile, the recovery of the supernatant containing MT-complexes by centrifugation and its concentration under nitrogen flow. The replacement of reversed phase HPLC by hydrophilic interaction LC (HILIC) allowed the preservation of the unstable and low abundant metallothionein zinc-mercury mixed complexes (MT-Zn(6)Hg). The MT complexes eluted were detected by ICP MS and identified in terms of molecular mass by electrospray high resolution (100,000) MS. The identification was completed by on line demetallation and the determination of the molecular mass of the apoform, followed by amino acid sequencing in the top-down mode using high energy collision fragmentation (HCD). The method was applied to the identification of MT complexes in a white-sided dolphin (Lagenorhynchus acutus) liver homogenate. The Zn complex of the N-acetylated MT2 isoform was found to be predominant; the presence of mixed complexes with Cd, Cu and, for the first time ever, Hg, was demonstrated. The latter finding has the potential to shed new light on the mercury detoxification mechanism in marine organisms. PMID:22456936

  4. Parallel execution of LISP programs

    SciTech Connect

    Weening, J.S.

    1989-01-01

    This dissertation considers several issues in the execution of Lisp programs on shared-memory multiprocessors. An overview of constructs for explicit parallelism in Lisp is first presented. The problems of partitioning a program into processes and scheduling these processes are then described, and a number of methods for performing these are proposed. These include cutting off process creation based on properties of the computation tree of the program, and basing partitioning decisions on the state of the system at runtime instead of the program. An experimental study of these methods has been performed using a simulator for parallel Lisp. The simulator, written in Common Lisp using a continuation-passing style, is described in detail. This is followed by a description of the experiments that were performed and an analysis of the results. Two programs are used as illustrations: a Fast Fourier Transform, which has an abundance of parallelism, and the Cocke-Younger-Kasami parsing algorithm, for which good speedup is not as easy to obtain. The difficulty of using cutoff-based partitioning methods, and the differences between various scheduling methods, are shown. A combination of partitioning and scheduling methods which the author calls dynamic partitioning is analyzed in more detail. This method is based on examining the machine's runtime state; it requires that the programmer only identify parallelism in the program, without deciding which potential parallelism is actually useful. Several theorems are proved providing upper bounds on the amount of overhead produced by this method. He concludes that for programs whose computation trees have small height relative to their total size, dynamic partitioning can achieve asymptotically minimal overhead in the cost of process creation.

  5. File concepts for parallel I/O

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1989-01-01

    The subject of input/output (I/O) was often neglected in the design of parallel computer systems, although for many problems I/O rates will limit the speedup attainable. The I/O problem is addressed by considering the role of files in parallel systems. The notion of parallel files is introduced. Parallel files provide for concurrent access by multiple processes, and utilize parallelism in the I/O system to improve performance. Parallel files can also be used conventionally by sequential programs. A set of standard parallel file organizations is proposed, and organizations using multiple storage devices are suggested. Problem areas are also identified and discussed.

  6. Rochester checkers player: Multi-model parallel programming for animate vision. Technical report

    SciTech Connect

    Marsh, B.D.; Brown, C.M.; LeBlanc, T.J.; Scott, M.L.; Becker, T.G.

    1991-06-01

    Animate vision systems couple computer vision and robotics to achieve robust and accurate vision, as well as other complex behavior. These systems combine low-level sensory processing and effector output with high-level cognitive planning - all computationally intensive tasks that can benefit from parallel processing. No single model of parallel programming is likely to serve for all tasks, however. Early vision algorithms are intensely data parallel, often utilizing fine-grain parallel computations that share an image, while cognition algorithms decompose naturally by function, often consisting of loosely-coupled, coarse-grain parallel units. A typical animate vision application will likely consist of many tasks, each of which may require a different parallel programming model, and all of which must cooperate to achieve the desired behavior. These multi-model programs require an underlying software system that not only supports several different models of parallel computation simultaneously, but which also allows tasks implemented in different models to interact.

  7. Orientation-Enhanced Parallel Coordinate Plots.

    PubMed

    Raidou, Renata Georgia; Eisemann, Martin; Breeuwer, Marcel; Eisemann, Elmar; Vilanova, Anna

    2016-01-01

    Parallel Coordinate Plots (PCPs) are among the most powerful techniques for the visualization of multivariate data. However, for large datasets, the representation suffers from clutter due to overplotting. In this case, discerning the underlying data information and selecting specific interesting patterns can become difficult. We propose a new and simple technique to improve the display of PCPs by emphasizing the underlying data structure. Our Orientation-enhanced Parallel Coordinate Plots (OPCPs) improve pattern and outlier discernibility by visually enhancing parts of each PCP polyline with respect to its slope. This enhancement also allows us to introduce a novel and efficient selection method, the Orientation-enhanced Brushing (O-Brushing). Our solution is particularly useful when multiple patterns are present or when the view on certain patterns is obstructed by noise. We present the results of our approach with several synthetic and real-world datasets. Finally, we conducted a user evaluation, which verifies the advantages of the OPCPs in terms of discernibility of information in complex data. It also confirms that O-Brushing eases the selection of data patterns in PCPs and reduces the amount of necessary user interactions compared to state-of-the-art brushing techniques. PMID:26529720

  8. Rotary wing aerodynamically generated noise

    NASA Technical Reports Server (NTRS)

    Schmitz, F. J.; Morse, H. A.

    1982-01-01

    The history and methodology of aerodynamic noise reduction in rotary wing aircraft are presented. Thickness noise during hover tests and blade vortex interaction noise are determined and predicted through the use of a variety of computer codes. The use of test facilities and scale models for data acquisition is discussed.

  9. A polymorphic reconfigurable emulator for parallel simulation

    NASA Technical Reports Server (NTRS)

    Parrish, E. A., Jr.; Mcvey, E. S.; Cook, G.

    1980-01-01

    Microprocessor and arithmetic support chip technology was applied to the design of a reconfigurable emulator for real time flight simulation. The system developed consists of a master control system to perform all man-machine interactions and to configure the hardware to emulate a given aircraft, and numerous slave compute modules (SCMs) which comprise the parallel computational units. It is shown that all parts of the state equations can be worked on simultaneously but that the algebraic equations cannot (unless they are slowly varying). Attempts to obtain algorithms that will allow parallel updates are reported. The word length and step size to be used in the SCMs are determined and the architecture of the hardware and software is described.

  10. A mechanism for efficient debugging of parallel programs

    SciTech Connect

    Miller, B.P.; Choi, J.D.

    1988-01-01

    This paper addresses the design and implementation of an integrated debugging system for parallel programs running on shared memory multi-processors (SMMP). The authors describe the use of flowback analysis to provide information on causal relationships between events in a program's execution without re-executing the program for debugging. The authors introduce a mechanism called incremental tracing that, by using semantic analyses of the debugged program, makes the flowback analysis practical with only a small amount of trace generated during execution. They extend flowback analysis to apply to parallel programs and describe a method to detect race conditions in the interactions of the co-operating processes.

  11. Parallel processing spacecraft communication system

    NASA Technical Reports Server (NTRS)

    Bolotin, Gary S. (Inventor); Donaldson, James A. (Inventor); Luong, Huy H. (Inventor); Wood, Steven H. (Inventor)

    1998-01-01

    An uplink controlling assembly speeds data processing using a special parallel codeblock technique. A correct start sequence initiates processing of a frame. Two possible start sequences can be used, and the one which is used determines whether data polarity is inverted or non-inverted. Processing continues until uncorrectable errors are found. The frame ends by intentionally sending a block with an uncorrectable error. Each of the codeblocks in the frame has a channel ID. Each channel ID can be separately processed in parallel. This obviates the problem of waiting for error correction processing. If that channel number is zero, however, it indicates that the frame of data represents a critical command only. That data is handled in a special way, independent of the software. Otherwise, the processed data is further handled using special double buffering techniques to avoid problems from overrun. When overrun does occur, the system takes action to lose only the oldest data.

  12. Merlin - Massively parallel heterogeneous computing

    NASA Technical Reports Server (NTRS)

    Wittie, Larry; Maples, Creve

    1989-01-01

    Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.

  13. Parallel multiplex laser feedback interferometry

    SciTech Connect

    Zhang, Song; Tan, Yidong; Zhang, Shulian

    2013-12-15

    We present a parallel multiplex laser feedback interferometer based on spatial multiplexing which avoids the signal crosstalk in the former feedback interferometer. The interferometer outputs two close parallel laser beams, whose frequencies are shifted by two acousto-optic modulators by 2Ω simultaneously. A static reference mirror is inserted into one of the optical paths as the reference optical path. The other beam impinges on the target as the measurement optical path. Phase variations of the two feedback laser beams are simultaneously measured through heterodyne demodulation with two different detectors. Their subtraction accurately reflects the target displacement. Under typical room conditions, experimental results show a resolution of 1.6 nm and accuracy of 7.8 nm within the range of 100 μm.

  14. Parallel supercomputing with commodity components

    NASA Technical Reports Server (NTRS)

    Warren, M. S.; Goda, M. P.; Becker, D. J.

    1997-01-01

    We have implemented a parallel computer architecture based entirely upon commodity personal computer components. Using 16 Intel Pentium Pro microprocessors and switched fast ethernet as a communication fabric, we have obtained sustained performance on scientific applications in excess of one Gigaflop. During one production astrophysics treecode simulation, we performed 1.2 x 10^15 floating point operations (1.2 Petaflops) over a three week period, with one phase of that simulation running continuously for two weeks without interruption. We report on a variety of disk, memory and network benchmarks. We also present results from the NAS parallel benchmark suite, which indicate that this architecture is competitive with current commercial architectures. In addition, we describe some software written to support efficient message passing, as well as a Linux device driver interface to the Pentium hardware performance monitoring registers.

  15. Instruction-level parallel processing.

    PubMed

    Fisher, J A; Rau, R

    1991-09-13

    The performance of microprocessors has increased steadily over the past 20 years at a rate of about 50% per year. This is the cumulative result of architectural improvements as well as increases in circuit speed. Moreover, this improvement has been obtained in a transparent fashion, that is, without requiring programmers to rethink their algorithms and programs, thereby enabling the tremendous proliferation of computers that we see today. To continue this performance growth, microprocessor designers have incorporated instruction-level parallelism (ILP) into new designs. ILP utilizes the parallel execution of the lowest level computer operations (adds, multiplies, loads, and so on) to increase performance transparently. The use of ILP promises to make possible, within the next few years, microprocessors whose performance is many times that of a CRAY-1S. This article provides an overview of ILP, with an emphasis on ILP architectures (superscalar, VLIW, and dataflow processors) and the compiler techniques necessary to make ILP work well. PMID:17831442

  16. A generalized parallel replica dynamics

    NASA Astrophysics Data System (ADS)

    Binder, Andrew; Lelièvre, Tony; Simpson, Gideon

    2015-03-01

    Metastability is a common obstacle to performing long molecular dynamics simulations. Many numerical methods have been proposed to overcome it. One method is parallel replica dynamics, which relies on the rapid convergence of the underlying stochastic process to a quasi-stationary distribution. Two requirements for applying parallel replica dynamics are knowledge of the time scale on which the process converges to the quasi-stationary distribution and a mechanism for generating samples from this distribution. By combining a Fleming-Viot particle system with convergence diagnostics to simultaneously identify when the process converges while also generating samples, we can address both points. This variation on the algorithm is illustrated with various numerical examples, including those with entropic barriers and the 2D Lennard-Jones cluster of seven atoms.

  17. All-exchanges parallel tempering.

    PubMed

    Calvo, F

    2005-09-22

    An alternative exchange strategy for parallel tempering simulations is introduced. Instead of attempting to swap configurations between two randomly chosen but adjacent replicas, the acceptance probabilities of all possible swap moves are calculated a priori. One specific swap move is then selected according to its probability and enforced. The efficiency of the method is illustrated first on the case of two Lennard-Jones (LJ) clusters containing 13 and 31 atoms, respectively. The convergence of the caloric curve is seen to be at least twice as fast as in conventional parallel tempering simulations, especially for the difficult case of LJ31. Further evidence for an improved efficiency is reported on the ergodic measure introduced by Mountain and Thirumalai [J. Phys. Chem. 93, 6975 (1989)], calculated here for LJ13 close to the melting point. Finally, tests on two simple spin systems indicate that the method should be particularly useful when a limited number of replicas are available. PMID:16392474
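
    The selection step described above can be sketched as follows. The abstract does not give the exact normalization, so this sketch makes two assumptions: each candidate swap gets the standard Metropolis acceptance probability min(1, exp[(beta_i - beta_j)(E_i - E_j)]), and one swap is drawn with probability proportional to that value. All function names are hypothetical.

```python
import math
import random


def swap_probabilities(betas, energies):
    """Metropolis acceptance probability for every pair of replicas."""
    probs = {}
    n = len(betas)
    for i in range(n):
        for j in range(i + 1, n):
            delta = (betas[i] - betas[j]) * (energies[i] - energies[j])
            # Guard against overflow: any non-negative delta is accepted.
            probs[(i, j)] = 1.0 if delta >= 0 else math.exp(delta)
    return probs


def choose_swap(probs, rng):
    """Draw one swap, weighted by its acceptance probability (assumed rule)."""
    pairs = list(probs)
    weights = [probs[p] for p in pairs]
    if sum(weights) == 0.0:
        return None               # every candidate swap underflowed to zero
    return rng.choices(pairs, weights=weights, k=1)[0]


betas = [1.0, 0.5, 0.25]          # inverse temperatures, coldest first
energies = [-3.0, -1.0, -2.0]     # current energy of each replica
probs = swap_probabilities(betas, energies)
i, j = choose_swap(probs, random.Random(0))
energies[i], energies[j] = energies[j], energies[i]   # enforce the chosen swap
```

    Unlike the conventional scheme, no attempt is rejected here: the cost is evaluating all n(n-1)/2 pairs, which stays cheap when the number of replicas is small.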

  18. A brief parallel I/O tutorial.

    SciTech Connect

    Ward, H. Lee

    2010-03-01

    This document provides common best practices for the efficient utilization of parallel file systems for analysts and application developers. A multi-program, parallel supercomputer is able to provide effective compute power by aggregating a host of lower-power processors using a network. The idea, in general, is that one either constructs the application to distribute parts to the different nodes and processors available and then collects the result (a parallel application), or one launches a large number of small jobs, each doing similar work on different subsets (a campaign). The I/O system on these machines is usually implemented as a tightly-coupled, parallel application itself. It is providing the concept of a 'file' to the host applications. The 'file' is an addressable store of bytes and that address space is global in nature. In essence, it is providing a global address space. Beyond the simple reality that the I/O system is normally composed of a small, less capable, collection of hardware, that concept of a global address space will cause problems if not very carefully utilized. How much of a problem and the ways in which those problems manifest will be different, but that it is problem prone has been well established. Worse, the file system is a shared resource on the machine - a system service. What an application does when it uses the file system impacts all users. It is not the case that some portion of the available resource is reserved. Instead, the I/O system responds to requests by scheduling and queuing based on instantaneous demand. Using the system well contributes to the overall throughput on the machine. From a solely self-centered perspective, using it well reduces the time that the application or campaign is subject to impact by others. 
    The developer's goal should be to accomplish I/O in a way that minimizes interaction with the I/O system, maximizes the amount of data moved per call, and provides the I/O system the most information about the I/O transfer per request.
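
    One way to read the advice on maximizing data per call is to aggregate many small writes in memory and hand them to the file system as a few large ones. The sketch below is an illustration only, not taken from the tutorial: `CountingSink` is a hypothetical in-memory stand-in for a parallel file, and `Aggregator` and its `threshold` parameter are assumed names.

```python
import io


class CountingSink(io.BytesIO):
    """In-memory stand-in for a file; counts how many writes reach it."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def write(self, data):
        self.calls += 1
        return super().write(data)


class Aggregator:
    """Collects small writes and forwards them as large ones."""

    def __init__(self, sink, threshold=1 << 20):
        self.sink = sink
        self.threshold = threshold    # flush once this many bytes are pending
        self.parts = []
        self.size = 0

    def write(self, data):
        self.parts.append(data)
        self.size += len(data)
        if self.size >= self.threshold:
            self.flush()

    def flush(self):
        if self.parts:
            self.sink.write(b"".join(self.parts))   # one large request
            self.parts, self.size = [], 0


sink = CountingSink()
agg = Aggregator(sink, threshold=4096)
for _ in range(1000):
    agg.write(b"x" * 64)              # 1000 small records, 64,000 bytes total
agg.flush()                           # push out the final partial buffer
```

    With these numbers the 1000 small writes reach the sink as 16 requests. The right threshold on a real system depends on the file system's stripe and block sizes, which this sketch does not model.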

  19. Efficient, massively parallel eigenvalue computation

    NASA Technical Reports Server (NTRS)

    Huo, Yan; Schreiber, Robert

    1993-01-01

    In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the Maspar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.

  20. Parallel strategies for SAR processing

    NASA Astrophysics Data System (ADS)

    Segoviano, Jesus A.

    2004-12-01

    This article proposes a series of strategies for speeding up the computer processing of Synthetic Aperture Radar (SAR) signals, following the three usual lines of action for accelerating any computer program. On the one hand, the optimization of both the data structures and the application architecture is studied. On the other hand, a hardware improvement is considered. For the former, the article examines the data structures usually employed in SAR processing, proposing parallel ones, and the way the parallelization of the algorithms is implemented. In addition, the parallel application architecture classifies processes as fine or coarse grain; these are assigned to individual processors or divided among processors, each in its corresponding architecture. For the latter, the hardware used for parallel SAR processing is studied. The improvement here refers to the kinds of platforms on which SAR processing is implemented: shared-memory multiprocessors and distributed-memory multicomputers. A comparison between them yields guidelines for obtaining maximum throughput with minimum latency and maximum effectiveness with minimum cost, all with limited complexity. The article concludes that processing the algorithms in a GNU/Linux environment on a Beowulf cluster platform offers, under certain conditions, the best compromise between performance and cost, and holds the most promise for the future development of compute-hungry SAR applications.

  1. Parallel Processing in Combustion Analysis

    NASA Technical Reports Server (NTRS)

    Schunk, Richard Gregory; Chung, T. J.

    2000-01-01

    The objective of this research is to demonstrate the application of the Flow-field Dependent Variation (FDV) method to a problem of current interest in supersonic chemical combustion. Due in part to the stiffness of the chemical reactions, the solution of such problems on unstructured three dimensional grids often dictates the use of parallel computers. Preliminary results for the injection of a supersonic hydrogen stream into vitiated air are presented.

  2. Parallel Power Grid Simulation Toolkit

    Energy Science and Technology Software Center (ESTSC)

    2015-09-14

    ParGrid is a 'wrapper' that integrates a coupled Power Grid Simulation toolkit consisting of a library to manage the synchronization and communication of independent simulations. The included library code in ParGrid, named FSKIT, is intended to support the coupling of multiple continuous and discrete event parallel simulations. The code is designed using modern object-oriented C++ methods, utilizing C++11 and current Boost libraries to ensure compatibility with multiple operating systems and environments.

  3. Highly parallel sparse Cholesky factorization

    NASA Technical Reports Server (NTRS)

    Gilbert, John R.; Schreiber, Robert

    1990-01-01

    Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.
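
    As a rough illustration of the dense factorization that the abstract describes as the key subroutine, the following is a minimal serial sketch in Python. It is not the authors' Connection Machine code; the function name `cholesky` and the list-of-lists matrix layout are assumptions made for this example. Within one column, the entries below the diagonal are computed independently of each other, which is the kind of regular work a data-parallel machine could spread across processors.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (hypothetical helper).

    `a` is a symmetric positive definite matrix given as a list of rows.
    This is a plain serial column-by-column factorization, used only to
    illustrate the dense kernel; it is not a parallel implementation.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: subtract the squares already placed in row j.
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j do not depend on each
        # other, so this loop is the natural data-parallel step.
        for i in range(j + 1, n):
            dot = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - dot) / l[j][j]
    return l
```

    For example, `cholesky([[4, 12, -16], [12, 37, -43], [-16, -43, 98]])` yields the factor with rows (2, 0, 0), (6, 1, 0), and (-8, 5, 3).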

  4. Task parallelism and high-performance languages

    SciTech Connect

    Foster, I.

    1996-03-01

    The definition of High Performance Fortran (HPF) is a significant event in the maturation of parallel computing: it represents the first parallel language that has gained widespread support from vendors and users. The subject of this paper is to incorporate support for task parallelism. The term task parallelism refers to the explicit creation of multiple threads of control, or tasks, which synchronize and communicate under programmer control. Task and data parallelism are complementary rather than competing programming models. While task parallelism is more general and can be used to implement algorithms that are not amenable to data-parallel solutions, many problems can benefit from a mixed approach, with for example a task-parallel coordination layer integrating multiple data-parallel computations. Other problems admit to both data- and task-parallel solutions, with the better solution depending on machine characteristics, compiler performance, or personal taste. For these reasons, we believe that a general-purpose high-performance language should integrate both task- and data-parallel constructs. The challenge is to do so in a way that provides the expressivity needed for applications, while preserving the flexibility and portability of a high-level language. In this paper, we examine and illustrate the considerations that motivate the use of task parallelism. We also describe one particular approach to task parallelism in Fortran, namely the Fortran M extensions. Finally, we contrast Fortran M with other proposed approaches and discuss the implications of this work for task parallelism and high-performance languages.
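
    The mixed approach mentioned in the abstract, a task-parallel coordination layer around data-parallel computations, can be sketched loosely as follows. This is not Fortran M or HPF; it uses Python's standard `concurrent.futures` threads as a stand-in, and the names `scale`, `total`, and `run_pipeline` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scale(values, factor):
    # Data-parallel style: the same operation is applied independently
    # to every element of the collection.
    return [factor * v for v in values]


def total(values):
    # A second, unrelated computation over the same input.
    return sum(values)


def run_pipeline(values):
    """Launch two different computations as explicit tasks and join them."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Task parallelism: the programmer creates the tasks explicitly.
        scaled_future = pool.submit(scale, values, 2)
        total_future = pool.submit(total, values)
        # Synchronization is also under programmer control: result()
        # blocks until the corresponding task has finished.
        return scaled_future.result(), total_future.result()
```

    Calling `run_pipeline([1, 2, 3])` returns the doubled list together with the sum. The point of the sketch is only the division of roles: the coordination layer decides which tasks exist and when to wait for them, while each task body remains a simple element-wise computation.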

  5. Web based parallel/distributed medical data mining using software agents

    SciTech Connect

    Kargupta, H.; Stafford, B.; Hamzaoglu, I.

    1997-12-31

    This paper describes an experimental parallel/distributed data mining system PADMA (PArallel Data Mining Agents) that uses software agents for local data accessing and analysis and a web based interface for interactive data visualization. It also presents the results of applying PADMA for detecting patterns in unstructured texts of postmortem reports and laboratory test data for Hepatitis C patients.

  6. Parallel computation of seismic analysis of high arch dam

    NASA Astrophysics Data System (ADS)

    Chen, Houqun; Ma, Huaifa; Tu, Jin; Cheng, Guangqing; Tang, Juzhen

    2008-03-01

    Parallel computation programs are developed for three-dimensional meso-mechanics analysis of fully-graded dam concrete and seismic response analysis of high arch dams (ADs), based on the Parallel Finite Element Program Generator (PFEPG). The computational algorithms of the numerical simulation of the meso-structure of concrete specimens were studied. Taking into account damage evolution, static preload, strain rate effect, and the heterogeneity of the meso-structure of dam concrete, the fracture processes of damage evolution and configuration of the cracks can be directly simulated. In the seismic response analysis of ADs, all the following factors are involved, such as the nonlinear contact due to the opening and slipping of the contraction joints, energy dispersion of the far-field foundation, dynamic interactions of the dam-foundation-reservoir system, and the combining effects of seismic action with all static loads. The correctness, reliability and efficiency of the two parallel computational programs are verified with practical illustrations.

  7. Multitarget tracking algorithm parallelization for distributed-memory computing systems

    SciTech Connect

    Popp, R.L.; Pattipati, K.R.; Bar-Shalom, Y.

    1996-12-31

    In this paper we present a robust scalable parallelization of a multitarget tracking algorithm developed for air traffic surveillance. We couple the state estimation and data association problems by embedding an Interacting Multiple Model (IMM) state estimator into an optimization-based assignment framework. A SPMD distributed-memory parallelization is described, wherein the interface to the optimization problem, namely, computing the rather numerous gating and IMM state estimates, covariance calculations, and likelihood function evaluations (used as cost coefficients in the assignment problem), is parallelized. We describe several heuristic algorithms developed for the inherent task allocation problem, wherein the problem is one of assigning track tasks, having uncertain processing costs and negligible communication costs, across a set of homogeneous processors to minimize workload imbalances. Using a measurement database based on two FAA air traffic control radars, courtesy of Rome Laboratory, we show that near linear speedups are obtainable on a 32-node Intel Paragon supercomputer using simple task allocation algorithms.
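
    The abstract does not say which task allocation heuristics were used. As one generic illustration of the problem it states (assigning tasks with estimated costs and negligible communication costs to homogeneous processors so that workloads stay balanced), here is a minimal sketch of the well-known greedy longest-processing-time rule. The function name `allocate_tasks` and the cost values in the usage note are assumptions for this example, not data from the paper.

```python
import heapq


def allocate_tasks(estimated_costs, num_processors):
    """Greedy longest-processing-time allocation (illustrative only).

    Tasks are taken in order of decreasing estimated cost and each one
    is given to the processor with the smallest load so far.
    Returns (assignment, loads), where assignment[p] lists task indices.
    """
    if num_processors < 1:
        raise ValueError("need at least one processor")
    # Min-heap of (current load, processor index).
    heap = [(0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_processors)]
    order = sorted(range(len(estimated_costs)),
                   key=lambda i: estimated_costs[i], reverse=True)
    for task in order:
        load, proc = heapq.heappop(heap)
        assignment[proc].append(task)
        heapq.heappush(heap, (load + estimated_costs[task], proc))
    loads = [sum(estimated_costs[t] for t in tasks) for tasks in assignment]
    return assignment, loads
```

    With hypothetical costs `[7, 5, 4, 3, 1]` and two processors, both processors end up with a load of 10. Because the real processing costs are uncertain, a production scheme would likely need to re-estimate costs or rebalance at run time, which this sketch does not attempt.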

  8. A vestibular prosthesis with highly-isolated parallel multichannel stimulation.

    PubMed

    Jiang, Dai; Cirmirakis, Dominik; Demosthenous, Andreas

    2015-02-01

    This paper presents an implantable vestibular stimulation system capable of providing high flexibility independent parallel stimulation to the semicircular canals in the inner ear for restoring three-dimensional sensation of head movements. To minimize channel interaction during parallel stimulation, the system is implemented with a power isolation method for crosstalk reduction. Experimental results demonstrate that, with this method, electrodes for different stimulation channels located in close proximity ( mm) can deliver current pulses simultaneously with minimum inter-channel crosstalk. The design features a memory-based scheme that manages stimulation to the three canals in parallel. A vestibular evoked potential (VEP) recording unit is included for closed-loop adaptive stimulation control. The main components of the prototype vestibular prosthesis are three ASICs, all implemented in a 0.6-μm high-voltage CMOS technology. The measured performance was verified using vestibular electrodes in vitro. PMID:25073175

  9. Parallel ecological networks in ecosystems.

    PubMed

    Olff, Han; Alonso, David; Berg, Matty P; Eriksson, B Klemens; Loreau, Michel; Piersma, Theunis; Rooney, Neil

    2009-06-27

    In ecosystems, species interact with other species directly and through abiotic factors in multiple ways, often forming complex networks of various types of ecological interaction. Out of this suite of interactions, predator-prey interactions have received most attention. The resulting food webs, however, will always operate simultaneously with networks based on other types of ecological interaction, such as through the activities of ecosystem engineers or mutualistic interactions. Little is known about how to classify, organize and quantify these other ecological networks and their mutual interplay. The aim of this paper is to provide new and testable ideas on how to understand and model ecosystems in which many different types of ecological interaction operate simultaneously. We approach this problem by first identifying six main types of interaction that operate within ecosystems, of which food web interactions are one. Then, we propose that food webs are structured among two main axes of organization: a vertical (classic) axis representing trophic position and a new horizontal 'ecological stoichiometry' axis representing decreasing palatability of plant parts and detritus for herbivores and detrivores and slower turnover times. The usefulness of these new ideas is then explored with three very different ecosystems as test cases: temperate intertidal mudflats; temperate short grass prairie; and tropical savannah. PMID:19451126

  10. Parallel ecological networks in ecosystems

    PubMed Central

    Olff, Han; Alonso, David; Berg, Matty P.; Eriksson, B. Klemens; Loreau, Michel; Piersma, Theunis; Rooney, Neil

    2009-01-01

    In ecosystems, species interact with other species directly and through abiotic factors in multiple ways, often forming complex networks of various types of ecological interaction. Out of this suite of interactions, predator–prey interactions have received most attention. The resulting food webs, however, will always operate simultaneously with networks based on other types of ecological interaction, such as through the activities of ecosystem engineers or mutualistic interactions. Little is known about how to classify, organize and quantify these other ecological networks and their mutual interplay. The aim of this paper is to provide new and testable ideas on how to understand and model ecosystems in which many different types of ecological interaction operate simultaneously. We approach this problem by first identifying six main types of interaction that operate within ecosystems, of which food web interactions are one. Then, we propose that food webs are structured among two main axes of organization: a vertical (classic) axis representing trophic position and a new horizontal ‘ecological stoichiometry’ axis representing decreasing palatability of plant parts and detritus for herbivores and detrivores and slower turnover times. The usefulness of these new ideas is then explored with three very different ecosystems as test cases: temperate intertidal mudflats; temperate short grass prairie; and tropical savannah. PMID:19451126

  11. Global Arrays Parallel Programming Toolkit

    SciTech Connect

    Nieplocha, Jaroslaw; Krishnan, Manoj Kumar; Palmer, Bruce J.; Tipparaju, Vinod; Harrison, Robert J.; Chavarría-Miranda, Daniel

    2011-01-01

    The two predominant classes of programming models for parallel computing are distributed memory and shared memory. Both shared memory and distributed memory models have advantages and shortcomings. Shared memory model is much easier to use but it ignores data locality/placement. Given the hierarchical nature of the memory subsystems in modern computers this characteristic can have a negative impact on performance and scalability. Careful code restructuring to increase data reuse and replacing fine grain load/stores with block access to shared data can address the problem and yield performance for shared memory that is competitive with message-passing. However, this performance comes at the cost of compromising the ease of use that the shared memory model advertises. Distributed memory models, such as message-passing or one-sided communication, offer performance and scalability but they are difficult to program. The Global Arrays toolkit attempts to offer the best features of both models. It implements a shared-memory programming model in which data locality is managed by the programmer. This management is achieved by calls to functions that transfer data between a global address space (a distributed array) and local storage. In this respect, the GA model has similarities to the distributed shared-memory models that provide an explicit acquire/release protocol. However, the GA model acknowledges that remote data is slower to access than local data and allows data locality to be specified by the programmer and hence managed. GA is related to the global address space languages such as UPC, Titanium, and, to a lesser extent, Co-Array Fortran. In addition, by providing a set of data-parallel operations, GA is also related to data-parallel languages such as HPF, ZPL, and Data Parallel C. 
    However, the Global Array programming model is implemented as a library that works with most languages used for technical computing and does not rely on compiler technology for achieving parallel efficiency. It also supports a combination of task- and data-parallelism and is available as an extension of the message passing (MPI) model. The GA model exposes to the programmer the hierarchical memory of modern high-performance computer systems, and by recognizing the communication overhead for remote data transfer, it promotes data reuse and locality of reference. Virtually all the scalable architectures possess non-uniform memory access characteristics that reflect their multi-level memory hierarchies. These hierarchies typically comprise processor registers, multiple levels of cache, local memory, and remote memory. Over time, both the number of levels and the cost (in processor cycles) of accessing deeper levels have been increasing. It is important for any scalable programming model to address memory hierarchy since it is critical to the efficient execution of scalable applications.

  12. Implementing clips on a parallel computer

    NASA Technical Reports Server (NTRS)

    Riley, Gary

    1987-01-01

    The C language integrated production system (CLIPS) is a forward chaining rule based language to provide training and delivery for expert systems. Conceptually, rule based languages have great potential for benefiting from the inherent parallelism of the algorithms that they employ. During each cycle of execution, a knowledge base of information is compared against a set of rules to determine if any rules are applicable. Parallelism also can be employed for use with multiple cooperating expert systems. To investigate the potential benefits of using a parallel computer to speed up the comparison of facts to rules in expert systems, a parallel version of CLIPS was developed for the FLEX/32, a large grain parallel computer. The FLEX implementation takes a macroscopic approach in achieving parallelism by splitting whole sets of rules among several processors rather than by splitting the components of an individual rule among processors. The parallel CLIPS prototype demonstrates the potential advantages of integrating expert system tools with parallel computers.

  13. Parallelizing alternating direction implicit solver on GPUs

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We present a parallel Alternating Direction Implicit (ADI) solver on GPUs. Our implementation significantly improves existing implementations in two aspects. First, we address the scalability issue of existing Parallel Cyclic Reduction (PCR) implementations by eliminating their hardware resource con...

  14. Parallel computational fluid dynamics - Implementations and results

    NASA Astrophysics Data System (ADS)

    Simon, Horst D.

    The present volume on parallel CFD discusses implementations on parallel machines, numerical algorithms for parallel CFD, and performance evaluation and computer science issues. Attention is given to a parallel algorithm for compressible flows through rotor-stator combinations, a massively parallel Euler solver for unstructured grids, a fast scheme to analyze 3D disk airflow on a parallel computer, and a block implicit multigrid solution of the Euler equations. Topics addressed include a 3D ADI algorithm on distributed memory multiprocessors, clustered element-by-element computations for fluid flow, hypercube FFT and the Fourier pseudospectral method, and an investigation of parallel iterative algorithms for CFD. Also discussed are fluid dynamics using interface methods on parallel processors, sorting for particle flow simulation on the connection machine, a large grain mapping method, and efforts toward a Teraflops capability for CFD.

  15. High Performance Parallel Computational Nanotechnology

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Craw, James M. (Technical Monitor)

    1995-01-01

    At a recent press conference, NASA Administrator Dan Goldin encouraged NASA Ames Research Center to take a lead role in promoting research and development of advanced, high-performance computer technology, including nanotechnology. Manufacturers of leading-edge microprocessors currently perform large-scale simulations in the design and verification of semiconductor devices and microprocessors. Recently, the need for this intensive simulation and modeling analysis has greatly increased, due in part to the ever-increasing complexity of these devices, as well as the lessons of experiences such as the Pentium fiasco. Simulation, modeling, testing, and validation will be even more important for designing molecular computers because of the complex specification of millions of atoms, thousands of assembly steps, as well as the simulation and modeling needed to ensure reliable, robust and efficient fabrication of the molecular devices. The software for this capacity does not exist today, but it can be extrapolated from the software currently used in molecular modeling for other applications: semi-empirical methods, ab initio methods, self-consistent field methods, Hartree-Fock methods, molecular mechanics; and simulation methods for diamondoid structures. In as much as it seems clear that the application of such methods in nanotechnology will require powerful, highly powerful systems, this talk will discuss techniques and issues for performing these types of computations on parallel systems. 
    We will describe system design issues (memory, I/O, mass storage, operating system requirements, special user interface issues, interconnects, bandwidths, and programming languages) involved in parallel methods for scalable classical, semiclassical, quantum, molecular mechanics, and continuum models; molecular nanotechnology computer-aided design (NanoCAD) techniques; visualization using virtual reality techniques of structural models and assembly sequences; software required to control mini robotic manipulators for positional control; scalable numerical algorithms for reliability, verifications and testability. There appears to be no fundamental obstacle to simulating molecular compilers and molecular computers on high performance parallel computers, just as the Boeing 777 was simulated on a computer before manufacturing it.

  16. Fault-tolerant parallel processor

    SciTech Connect

    Harper, R.E.; Lala, J.H.

    1991-06-01

    This paper addresses issues central to the design and operation of an ultrareliable, Byzantine resilient parallel computer. Interprocessor connectivity requirements are met by treating connectivity as a resource that is shared among many processing elements, allowing flexibility in their configuration and reducing complexity. Redundant groups are synchronized solely by message transmissions and receptions, which also provide input data consistency and output voting. Reliability analysis results are presented that demonstrate the reduced failure probability of such a system. Performance analysis results are presented that quantify the temporal overhead involved in executing such fault-tolerance-specific operations. Empirical performance measurements of prototypes of the architecture are presented. 30 refs.

  17. Parallel Assembly of LIGA Components

    SciTech Connect

    Christenson, T.R.; Feddema, J.T.

    1999-03-04

    In this paper, a prototype robotic workcell for the parallel assembly of LIGA components is described. A Cartesian robot is used to press 386 and 485 micron diameter pins into a LIGA substrate and then place a 3-inch diameter wafer with LIGA gears onto the pins. Upward and downward looking microscopes are used to locate holes in the LIGA substrate, pins to be pressed in the holes, and gears to be placed on the pins. This vision system can locate parts within 3 microns, while the Cartesian manipulator can place the parts within 0.4 microns.

  18. Parallel Mapping Approaches for GNUMAP

    PubMed Central

    Clement, Nathan L.; Clement, Mark J.; Snell, Quinn; Johnson, W. Evan

    2013-01-01

    Mapping short next-generation reads to reference genomes is an important element in SNP calling and expression studies. A major limitation to large-scale whole-genome mapping is the large memory requirements for the algorithm and the long run-time necessary for accurate studies. Several parallel implementations have been performed to distribute memory on different processors and to equally share the processing requirements. These approaches are compared with respect to their memory footprint, load balancing, and accuracy. When using MPI with multi-threading, linear speedup can be achieved for up to 256 processors. PMID:23396612

  19. True Shear Parallel Plate Viscometer

    NASA Technical Reports Server (NTRS)

    Ethridge, Edwin; Kaukler, William

    2010-01-01

    This viscometer (which can also be used as a rheometer) is designed for use with liquids over a large temperature range. The device consists of horizontally disposed, similarly sized, parallel plates with a precisely known gap. The lower plate is driven laterally with a motor to apply shear to the liquid in the gap. The upper plate is freely suspended from a double-arm pendulum with a sufficiently long radius to reduce height variations during the swing to negligible levels. A sensitive load cell measures the shear force applied by the liquid to the upper plate. Viscosity is measured by taking the ratio of shear stress to shear rate.
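
    The ratio stated in the abstract can be written out as a short calculation. The sketch below assumes ideal simple shear between the plates, so that shear stress is the measured force divided by the wetted plate area and shear rate is the lower-plate speed divided by the gap. The function name, parameter names, and numerical values are hypothetical and are not taken from the instrument.

```python
def viscosity(shear_force_n, plate_area_m2, plate_speed_m_s, gap_m):
    """Return dynamic viscosity in Pa*s as shear stress / shear rate.

    Assumes a uniform velocity gradient across the gap (simple shear),
    which is an idealization made for this example.
    """
    if plate_area_m2 <= 0 or gap_m <= 0:
        raise ValueError("plate area and gap must be positive")
    if plate_speed_m_s == 0:
        raise ValueError("plate speed must be nonzero")
    shear_stress = shear_force_n / plate_area_m2  # Pa
    shear_rate = plate_speed_m_s / gap_m          # 1/s
    return shear_stress / shear_rate
```

    For instance, a force of 0.002 N on a 0.01 m^2 plate moving at 0.001 m/s across a 0.0005 m gap gives a shear stress of 0.2 Pa and a shear rate of 2 1/s, hence a viscosity of about 0.1 Pa*s.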

  20. The PARTY parallel runtime system

    NASA Technical Reports Server (NTRS)

    Saltz, J. H.; Mirchandaney, Ravi; Smith, R. M.; Crowley, Kay; Nicol, D. M.

    1989-01-01

    In the present automated system for the organization of the data and computational operations entailed by parallel problems, in ways that optimize multiprocessor performance, general heuristics for partitioning program data and control are implemented by capturing and manipulating representations of a computation at run time. These heuristics are directed toward the dynamic identification and allocation of concurrent work in computations with irregular computational patterns. An optimized static-workload partitioning is computed for such repetitive-computation pattern problems as the iterative ones employed in scientific computation.

  1. Method for resource control in parallel environments using program organization and run-time support

    NASA Technical Reports Server (NTRS)

    Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)

    2001-01-01

    A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.

  2. Method for resource control in parallel environments using program organization and run-time support

    NASA Technical Reports Server (NTRS)

    Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)

    1999-01-01

    A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.

  3. Parallel operation of variable speed pumps in chilled water systems

    SciTech Connect

    Hansen, E.G.

    1995-10-01

    In the last two to three decades the cost of variable speed devices has come down considerably and their energy consumption has improved to the point that they no longer waste energy. Additionally, speed control of the circulating pumps in variable volume flow hydronic systems permits matching the head generated by the pumps to the frictional resistance to flow in the system. This will improve the operation of the control valves and save energy. In view of these advantages, the use of variable speed pumps has become widespread. Nevertheless, the speed control devices are costly, and the question arises whether all pumps in a given installation need be equipped with them. This article will explore the interaction between multiple speed pumps operating in parallel and their interface with the hydronic system they serve. It will specifically address two questions: can one pump be operated at varying speed in parallel with another pump at fixed speed? Also, what is the most economical method of applying variable speed pumping in a given chilled water system? It will be seen that the benefits to be gained from unequal speed operation of parallel pumps are minimal and the benefits are outweighed by the danger inherent in such operation. This practice must be discouraged. The study also will show that in a correctly engineered and analyzed system the number of parallel pumps can be reduced and that not all need be provided with speed control. However, all pumps in parallel operation must be run at the same speed.

  4. Parallel processing of atmospheric chemistry calculations: Preliminary considerations

    SciTech Connect

    Elliott, S.; Jones, P.

    1995-01-01

    Global climate calculations are already saturating the class of modern vector supercomputers with only a few central processing units. Increased resolution and inclusion of routines to deal with biogeochemical portions of the terrestrial climate system will soon demand massively parallel approaches. The atmospheric photochemistry ensemble is intimately linked to climate through the trace greenhouse gases ozone and methane, and modules for representing it are being attached to global three dimensional transport and GCM frameworks. Atmospheric kinetics involve dozens of highly interactive tracers and so will accentuate the need for parallel processing of earth system simulations. In the present text we lay some of the groundwork for addition of atmospheric kinetics packages to GCM and global scale atmospheric models on multiply parallel computers. The discussion is tailored for consumption by the photochemical modelling community. After a review of numerical atmospheric chemistry methods, we examine how kinetics can be implemented on a parallel computer. We concentrate especially on data layout and flexibility and how these can be implemented in various programming models. We conclude that chemistry can be implemented rather easily within existing frameworks of several parallel atmospheric models. However, memory limitations may preclude high resolution studies of global chemistry.

  5. Parallel multi-computers and artificial intelligence

    SciTech Connect

    Uhr, L.

    1986-01-01

    This book examines the present state and future direction of multicomputer parallel architectures for artificial intelligence research and development of artificial intelligence applications. The book provides a survey of the large variety of parallel architectures, describing the current state of the art and suggesting promising architectures to produce artificial intelligence systems such as intelligent robots. This book integrates artificial intelligence and parallel processing research areas and discusses parallel processing from the viewpoint of artificial intelligence.

  6. Parallel machine architecture and compiler design facilities

    NASA Technical Reports Server (NTRS)

    Kuck, David J.; Yew, Pen-Chung; Padua, David; Sameh, Ahmed; Veidenbaum, Alex

    1990-01-01

    The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of the Delta project (whose objective is to provide a facility to allow rapid prototyping of parallelizing compilers that can target different machine architectures) is summarized. Included are surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.

  7. Automatic Multilevel Parallelization Using OpenMP

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

    2002-01-01

    In this paper we describe the extension of the CAPO (CAPtools (Computer Aided Parallelization Toolkit) OpenMP) parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report some results for several benchmark codes and one full application that have been parallelized using our system.

  8. Interacting Parallel Constructions of Knowledge in a CAS Context

    ERIC Educational Resources Information Center

    Kidron, Ivy; Dreyfus, Tommy

    2010-01-01

    We consider the influence of a CAS context on a learner's process of constructing a justification for the bifurcations in a logistic dynamical process. We describe how instrumentation led to cognitive constructions and how the roles of the learner and the CAS intertwine, especially close to the branching and combining of constructing actions. The

  9. Units of interaction, evolution, and replication: Organic and behavioral parallels

    PubMed Central

    Glenn, Sigrid S.; Madden, Gregory J.

    1995-01-01

    Organic and behavioral evolution both involve variation, selection, and replication with retention; but the individuals involved in these processes differ in the two kinds of evolution. In this paper, biological units of evolution, selection, and retention are compared with analogous units at the behavioral level. In organic evolution, natural selection operates on variations among organisms within a species, with the result of preserving in future generations of organisms those heritable characteristics that contributed to the organism's survival and reproduction. Species evolve as characteristics of the population change as a result of past selection. Continuity in a lineage in the biosphere is maintained by replication of genes with retention of organismic characteristics across successive generations of organisms. In behavioral evolution, reinforcement operates on variations among responses within an operant, with the result of preserving in future responses those characteristics that resulted in reinforcement. Continuity in a behavioral lineage, within the repertoire of a given organism, appears to involve retention and replication, but the unit of retention and replication is unknown. We suggest that the locus of retention and replication is the nervous system of the behaving organism. PMID:22478221

  10. Collective Interaction of a Compressible Periodic Parallel Jet Flow

    NASA Technical Reports Server (NTRS)

    Miles, Jeffrey Hilton

    1997-01-01

    A linear instability model for multiple spatially periodic supersonic rectangular jets is solved using Floquet-Bloch theory. The disturbance environment is investigated using a two dimensional perturbation of a mean flow. For all cases large temporal growth rates are found. This work is motivated by an increase in mixing found in experimental measurements of spatially periodic supersonic rectangular jets with phase-locked screech. The results obtained in this paper suggest that phase-locked screech or edge tones may produce correlated spatially periodic jet flow downstream of the nozzles which creates a large spanwise multi-nozzle region where a disturbance can propagate. The large temporal growth rates for eddies obtained by model calculation herein are related to the increased mixing since eddies are the primary mechanism that transfers energy from the mean flow to the large turbulent structures. Calculations of growth rates are presented for a range of Mach numbers and nozzle spacings corresponding to experimental test conditions where screech synchronized phase locking was observed. The model may be of significant scientific and engineering value in the quest to understand and construct supersonic mixer-ejector nozzles which provide increased mixing and reduced noise.

  11. Parallel Computing Using Web Servers and "Servlets".

    ERIC Educational Resources Information Center

    Lo, Alfred; Bloor, Chris; Choi, Y. K.

    2000-01-01

    Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common

  12. Coordination in serial-parallel image processing

    NASA Astrophysics Data System (ADS)

    Wójcik, Waldemar; Dubovoi, Vladymyr M.; Duda, Marina E.; Romaniuk, Ryszard S.; Yesmakhanova, Laura; Kozbakova, Ainur

    2015-12-01

    Serial-parallel systems are used to convert images. Controlling their operation requires solving a coordination problem. The paper summarizes a model of coordination of resource allocation for the task of synchronizing parallel processes; a genetic algorithm for coordination is developed, and its adequacy is verified for the process of parallel image processing.

  13. Inductive Information Retrieval Using Parallel Distributed Computation.

    ERIC Educational Resources Information Center

    Mozer, Michael C.

    This paper reports on an application of parallel models to the area of information retrieval and argues that massively parallel, distributed models of computation, called connectionist, or parallel distributed processing (PDP) models, offer a new approach to the representation and manipulation of knowledge. Although this document focuses on

  14. Identifying, Quantifying, Extracting and Enhancing Implicit Parallelism

    ERIC Educational Resources Information Center

    Agarwal, Mayank

    2009-01-01

    The shift of the microprocessor industry towards multicore architectures has placed a huge burden on the programmers by requiring explicit parallelization for performance. Implicit Parallelization is an alternative that could ease the burden on programmers by parallelizing applications "under the covers" while maintaining sequential semantics…

  15. Parallel transport and band theory in crystals

    NASA Astrophysics Data System (ADS)

    Fruchart, Michel; Carpentier, David; Gawędzki, Krzysztof

    2014-06-01

    We show that different conventions for Bloch Hamiltonians on non-Bravais lattices correspond to different natural definitions of parallel transport of Bloch eigenstates. Generically the Berry curvatures associated with these parallel transports differ, while physical quantities are naturally related to a canonical choice of the parallel transport.

  16. Identifying, Quantifying, Extracting and Enhancing Implicit Parallelism

    ERIC Educational Resources Information Center

    Agarwal, Mayank

    2009-01-01

    The shift of the microprocessor industry towards multicore architectures has placed a huge burden on the programmers by requiring explicit parallelization for performance. Implicit Parallelization is an alternative that could ease the burden on programmers by parallelizing applications "under the covers" while maintaining sequential semantics

  17. Parallel Processing at the High School Level.

    ERIC Educational Resources Information Center

    Sheary, Kathryn Anne

    This study investigated the ability of high school students to cognitively understand and implement parallel processing. Data indicates that most parallel processing is being taught at the university level. Instructional modules on C, Linux, and the parallel processing language, P4, were designed to show that high school students are highly…

  18. Extensive Parallel Processing on Scale-Free Networks

    NASA Astrophysics Data System (ADS)

    Sollich, Peter; Tantari, Daniele; Annibale, Alessia; Barra, Adriano

    2014-12-01

    We adapt belief-propagation techniques to study the equilibrium behavior of a bipartite spin glass, with interactions between two sets of N and P = αN spins each having an arbitrary degree, i.e., number of interaction partners in the opposite set. An equivalent view is then of a system of N neurons storing P diluted patterns via Hebbian learning, in the high storage regime. Our method allows analysis of parallel pattern processing on a broad class of graphs, including those with pattern asymmetry and heterogeneous dilution; previous replica approaches assumed homogeneity. We show that in a large part of the parameter space of noise, dilution, and storage load, delimited by a critical surface, the network behaves as an extensive parallel processor, retrieving all P patterns in parallel without falling into spurious states due to pattern cross talk, as would be typical of the structural glassiness built into the network. Parallel extensive retrieval is more robust for homogeneous degree distributions, and is not disrupted by asymmetric pattern distributions. For scale-free pattern degree distributions, Hebbian learning induces modularity in the neural network; thus, our Letter gives the first theoretical description for extensive information processing on modular and scale-free networks.

  19. Xyce parallel electronic simulator design.

    SciTech Connect

    Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.

    2010-09-01

    This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians and computer scientists. In addition to diversity of background, it is to be expected on long term projects for there to be a certain amount of staff turnover, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.

  20. Parallel computation of electromagnetic fields

    SciTech Connect

    Madsen, N.K.

    1997-05-21

    The DSI3D code is designed to numerically solve electromagnetics problems involving complex objects by solving Maxwell's curl equations in the time-domain and in three space dimensions. The code has been designed to run on the new parallel processing computers as well as on conventional serial computers. The DSI3D code is unique for the following reasons: it runs efficiently on a variety of parallel computers, allows the use of unstructured non-orthogonal grids, allows a variety of cell or element types, reduces to the Finite Difference Time Domain (FDTD) method when orthogonal grids are used, preserves charge or divergence locally (and globally), is non-dissipative, and is accurate for non-orthogonal grids. This method is derived using a Discrete Surface Integration (DSI) technique. As formulated, the DSI technique can be used with essentially arbitrary unstructured grids composed of convex polyhedral cells. This implementation of the DSI algorithm allows the use of unstructured grids that are composed of combinations of non-orthogonal hexahedrons, tetrahedrons, triangular prisms and pyramids. This algorithm reduces to the conventional FDTD method when applied on a structured orthogonal hexahedral grid.

  1. Climate modelling using parallel processors

    NASA Astrophysics Data System (ADS)

    Dash, S. K.; Selvakumar, S.; Jha, B.

    A spectral General Circulation Model at horizontal resolutions T21 and T42 has been integrated up to 30 d on 16 and 32 processors of Meiko T800. The model at resolution T21 is also implemented on 16 processors (T800) of a parallel computer (CHIPPS) built in India. The wallclock timings of model integration for 1, 10 and 30 d are noted and the speedup and efficiency of 16 and 32 processors have been computed. Results show that a T42 parallel model with nine levels in the vertical takes less than 36 elapsed minutes on 32 processors for 1 d integration. In the case of T21 model integration, the maximum speedup and efficiency achieved on 16 processors are about 10 and 63%, respectively. When the horizontal resolution of the model is doubled to T42, the maximum speedup and efficiency obtained on 32 processors are about 9 and 29%, respectively. It is also found that when the physical parametrisation schemes are included in the model and thereby the number of arithmetic operations is increased, the speedup and efficiency of 16 as well as 32 processors increase compared to the case with no physics in the model.
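
    The speedup and efficiency figures quoted in this abstract follow the usual definitions: speedup is serial time divided by parallel time, and efficiency is speedup divided by the number of processors. The sketch below is only an illustration of that arithmetic; the function names are hypothetical and the inputs are the rounded values from the abstract, not the paper's raw timings. With those inputs, 10/16 gives 62.5% (quoted as about 63%) and 9/32 gives about 28% (quoted as about 29%, which is consistent with a speedup slightly above 9).

```python
def speedup(serial_time, parallel_time):
    """Speedup = time on one processor / time on p processors."""
    return serial_time / parallel_time


def efficiency(speedup_value, processors):
    """Parallel efficiency = speedup / number of processors."""
    return speedup_value / processors


if __name__ == "__main__":
    # Rounded values taken from the abstract (assumed inputs, not raw timings).
    cases = [("T21", 10.0, 16), ("T42", 9.0, 32)]
    for label, s, p in cases:
        print(f"{label}: speedup {s:.0f} on {p} processors "
              f"-> efficiency {efficiency(s, p):.1%}")
```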

  2. Efficient parallel global garbage collection on massively parallel computers

    SciTech Connect

    Kamada, Tomio; Matsuoka, Satoshi; Yonezawa, Akinori

    1994-12-31

    On distributed-memory high-performance MPPs where processors are interconnected by an asynchronous network, efficient Garbage Collection (GC) becomes difficult due to inter-node references and references within pending, unprocessed messages. The parallel global GC algorithm (1) takes advantage of reference locality, (2) efficiently traverses references over nodes, (3) admits minimum pause time of ongoing computations, and (4) has been shown to scale up to 1024-node MPPs. The algorithm employs a global weight counting scheme to substantially reduce message traffic. Two methods for confirming the arrival of pending messages are used: one counts numbers of messages and the other uses network 'bulldozing.' Performance evaluation in actual implementations on a multicomputer with 32-1024 nodes, Fujitsu AP1000, reveals various favorable properties of the algorithm.

  3. Use of networked workstations for parallel nonlinear structural dynamic simulations of rotating bladed-disk assemblies

    NASA Technical Reports Server (NTRS)

    Hsieh, Shang-Hsien; Abel, J. F.

    1993-01-01

    The principal objective of this research is to investigate, develop and demonstrate coarse-grained, parallel-processing strategies for nonlinear dynamic simulations for rotating bladed-disk assemblies. The parallel-processing strategies addressed include numerical algorithms for parallel nonlinear solutions and techniques to effect load balancing among processors. The parallel environment employed is a distributed-memory, coarse-grained one consisting of networked workstations. A parallel explicit time integration method has been implemented for transient nonlinear solutions of rotating bladed-disk assemblies. Automatic domain partitioning techniques have been investigated for load balancing among processors. Advanced computing environments, data structures and interactive computer graphics all contribute to an integrated parallel finite element analysis system to facilitate more efficient and powerful dynamic simulations.

  4. Parallel multiscale simulations of a brain aneurysm

    NASA Astrophysics Data System (ADS)

    Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em

    2013-07-01

    Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multiscale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NɛκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NɛκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300 K computer processors. 
    Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work.

  5. Parallel multiscale simulations of a brain aneurysm.

    PubMed

    Grinberg, Leopold; Fedosov, Dmitry A; Karniadakis, George Em

    2013-07-01

    Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multi-scale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NεκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NεκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300K computer processors.
    Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work. PMID:23734066

  6. Parallel program debugging with flowback analysis

    SciTech Connect

    Choi, Jongdeok.

    1989-01-01

    This thesis describes the design and implementation of an integrated debugging system for parallel programs running on shared memory multi-processors. The goal of the debugging system is to present to the programmer a graphical view of the dynamic program dependences while keeping the execution-time overhead low. The author first describes the use of flowback analysis to provide information on causal relationship between events in a program's execution without re-executing the program for debugging. Execution time overhead is kept low by recording only a small amount of trace during a program's execution. He uses semantic analysis and a technique called incremental tracing to keep the time and space overhead low. As part of the semantic analysis, he uses a static program dependence graph structure that reduces the amount of work done at compile time and takes advantage of the dynamic information produced during execution time. The cornerstone of the incremental tracing concept is to generate a coarse trace during execution and fill incrementally, during the interactive portion of the debugging session, the gap between the information gathered in the coarse trace and the information needed to do the flowback analysis using the coarse trace. Then, he describes how to extend the flowback analysis to parallel programs. The flowback analysis can span process boundaries; i.e., the most recent modification to a shared variable might be traced to a different process than the one that contains the current reference. The static and dynamic program dependence graphs of the individual processes are tied together with synchronization and data dependence information to form complete graphs that represent the entire program.

  7. Parallel multiscale simulations of a brain aneurysm

    SciTech Connect

    Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em

    2013-07-01

    Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multiscale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier–Stokes solver NεκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NεκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300 K computer processors. 
    Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future work.

  8. Debugging and analysis of large-scale parallel programs. Doctoral thesis

    SciTech Connect

    Mellor-Crummey, J.M.

    1989-09-01

    One of the most serious problems in the development cycle of large-scale parallel programs is the lack of tools for debugging and performance analysis. Parallel programs are more difficult to analyze than their sequential counterparts for several reasons. First, race conditions in parallel programs can cause non-deterministic behavior, which reduces the effectiveness of traditional cyclic debugging techniques. Second, invasive, interactive analysis can distort a parallel program's execution beyond recognition. Finally, comprehensive analysis of a parallel program's execution requires collection, management, and presentation of an enormous amount of information. This dissertation addresses the problem of debugging and analysis of large-scale parallel programs executing on shared-memory multiprocessors. It proposes a methodology for top-down analysis of parallel program executions that replaces previous ad-hoc approaches. To support this methodology, a formal model for shared-memory communication among processes in a parallel program is developed. It is shown how synchronization traces based on this abstract model can be used to create indistinguishable executions that form the basis for debugging. This result is used to develop a practical technique for tracing parallel program executions on shared-memory parallel processors so that their executions can be repeated deterministically on demand.

  9. A CS1 pedagogical approach to parallel thinking

    NASA Astrophysics Data System (ADS)

    Rague, Brian William

    Almost all collegiate programs in Computer Science offer an introductory course in programming primarily devoted to communicating the foundational principles of software design and development. The ACM designates this introduction to computer programming course for first-year students as CS1, during which methodologies for solving problems within a discrete computational context are presented. Logical thinking is highlighted, guided primarily by a sequential approach to algorithm development and made manifest by typically using the latest, commercially successful programming language. In response to the most recent developments in accessible multicore computers, instructors of these introductory classes may wish to include training on how to design workable parallel code. Novel issues arise when programming concurrent applications which can make teaching these concepts to beginning programmers a seemingly formidable task. Student comprehension of design strategies related to parallel systems should be monitored to ensure an effective classroom experience. This research investigated the feasibility of integrating parallel computing concepts into the first-year CS classroom. To quantitatively assess student comprehension of parallel computing, an experimental educational study using a two-factor mixed group design was conducted to evaluate two instructional interventions in addition to a control group: (1) topic lecture only, and (2) topic lecture with laboratory work using a software visualization Parallel Analysis Tool (PAT) specifically designed for this project. A new evaluation instrument developed for this study, the Perceptions of Parallelism Survey (PoPS), was used to measure student learning regarding parallel systems. 
    The results from this educational study show a statistically significant main effect among the repeated measures, implying that student comprehension levels of parallel concepts as measured by the PoPS improve immediately after the delivery of any initial three-week CS1 level module when compared with student comprehension levels just prior to starting the course. Survey results measured during the ninth week of the course reveal that performance levels remained high compared to pre-course performance scores. A second result produced by this study reveals no statistically significant interaction effect between the intervention method and student performance as measured by the evaluation instrument over three separate testing periods. However, visual inspection of survey score trends and the low p-value generated by the interaction analysis (0.062) indicate that further studies may verify improved concept retention levels for the lecture w/PAT group.

  10. GROMACS: A message-passing parallel molecular dynamics implementation

    NASA Astrophysics Data System (ADS)

    Berendsen, H. J. C.; van der Spoel, D.; van Drunen, R.

    1995-09-01

    A parallel message-passing implementation of a molecular dynamics (MD) program that is useful for bio(macro)molecules in aqueous environment is described. The software has been developed for a custom-designed 32-processor ring GROMACS (GROningen MAchine for Chemical Simulation) with communication to and from left and right neighbours, but can run on any parallel system onto which a ring of processors can be mapped and which supports PVM-like block send and receive calls. The GROMACS software consists of a preprocessor, a parallel MD and energy minimization program that can use an arbitrary number of processors (including one), an optional monitor, and several analysis tools. The programs are written in ANSI C and available by ftp (information: gromacs@chem.rug.nl). The functionality is based on the GROMOS (GROningen MOlecular Simulation) package (van Gunsteren and Berendsen, 1987; BIOMOS B.V., Nijenborgh 4, 9747 AG Groningen). Conversion programs between GROMOS and GROMACS formats are included. The MD program can handle rectangular periodic boundary conditions with temperature and pressure scaling. The interactions that can be handled without modification are variable non-bonded pair interactions with Coulomb and Lennard-Jones or Buckingham potentials, using a twin-range cut-off based on charge groups, and fixed bonded interactions of either harmonic or constraint type for bonds and bond angles and either periodic or cosine power series interactions for dihedral angles. Special forces can be added to groups of particles (for non-equilibrium dynamics or for position restraining) or between particles (for distance restraints). The parallelism is based on particle decomposition. Interprocessor communication is largely limited to position and force distribution over the ring once per time step.
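
    The ring-based position distribution mentioned in this abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the function name `ring_allgather` and the data layout are hypothetical, it is written in Python rather than the ANSI C with PVM-like calls that the abstract describes, and it is not taken from the GROMACS source. Each simulated processor starts with the positions of its own particles and forwards one block per step to its right-hand neighbour, so that after P - 1 steps every processor holds all blocks.

```python
def ring_allgather(local_blocks):
    """Simulate an all-gather over a ring of P processors.

    local_blocks[r] is the block (e.g. particle positions) owned by rank r.
    Returns, for every rank, the list of all blocks ordered by owner rank.
    """
    p = len(local_blocks)
    # What each rank knows so far: owner rank -> block.
    gathered = [{rank: block} for rank, block in enumerate(local_blocks)]
    # The (owner, block) pair each rank will forward on the next step.
    in_flight = [(rank, block) for rank, block in enumerate(local_blocks)]

    for _ in range(p - 1):
        # All ranks send to their right neighbour at once,
        # so rank r receives what rank r - 1 was holding.
        in_flight = [in_flight[(rank - 1) % p] for rank in range(p)]
        for rank, (owner, block) in enumerate(in_flight):
            gathered[rank][owner] = block

    return [[known[owner] for owner in range(p)] for known in gathered]


if __name__ == "__main__":
    # Three simulated processors, each owning two 1-D particle positions.
    blocks = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]
    for rank, view in enumerate(ring_allgather(blocks)):
        print(rank, view)
```

    Only neighbour-to-neighbour transfers are used, which is why such a scheme maps onto any machine on which a ring of processors can be embedded.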

  11. Parallel Visualization Co-Processing of Overnight CFD Propulsion Applications

    NASA Technical Reports Server (NTRS)

    Edwards, David E.; Haimes, Robert

    1999-01-01

    An interactive visualization system pV3 is being developed for the investigation of advanced computational methodologies employing visualization and parallel processing for the extraction of information contained in large-scale transient engineering simulations. Visual techniques for extracting information from the data in terms of cutting planes, iso-surfaces, particle tracing and vector fields are included in this system. This paper discusses improvements to the pV3 system developed under NASA's Affordable High Performance Computing project.

  12. Parallel programming in MIMD type parallel systems using transputer and i860 in physical simulations

    NASA Astrophysics Data System (ADS)

    Ido, S.; Hikosaka, S.

    1992-05-01

    Parallel programming and calculation performance were examined by using two types of MIMD parallel systems, that is, a transputer (T800) network and iPSC/860. Some interface subroutines were developed to apply the programs parallelized by using a transputer network to iPSC/860. Compatibility and performance of parallelized programs are discussed.

  13. Performance prediction for complex parallel applications

    SciTech Connect

    Brehm, J.; Worley, P.H.

    1997-04-01

    Today's massively parallel machines are typically message-passing systems consisting of hundreds or thousands of processors. Implementing parallel applications efficiently in this environment is a challenging task, and poor parallel design decisions can be expensive to correct. Tools and techniques that allow the fast and accurate evaluation of different parallelization strategies would significantly improve the productivity of application developers and increase throughput on parallel architectures. This paper investigates one of the major issues in building tools to compare parallelization strategies: determining what type of performance models of the application code and of the computer system are sufficient for a fast and accurate comparison of different strategies. The paper is built around a case study employing the Performance Prediction Tool (PerPreT) to predict performance of the Parallel Spectral Transform Shallow Water Model code (PSTSWM) on the Intel Paragon. 13 refs., 6 tabs.

  14. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1991-01-01

    The main contribution of the effort in the last two years is the introduction of the MOPPS system. After an extensive literature search, we introduced the system which is described next. MOPPS employs a new solution to the problem of managing programs which solve scientific and engineering applications on a distributed processing environment. Autonomous computers cooperate efficiently in solving large scientific problems with this solution. MOPPS has the advantage of not assuming the presence of any particular network topology or configuration, computer architecture, or operating system. It imposes little overhead on network and processor resources while efficiently managing programs concurrently. The core of MOPPS is an intelligent program manager that builds a knowledge base of the execution performance of the parallel programs it is managing under various conditions. The manager applies this knowledge to improve the performance of future runs. The program manager learns from experience.

  15. Information hiding in parallel programs

    SciTech Connect

    Foster, I.

    1992-01-30

    A fundamental principle in program design is to isolate difficult or changeable design decisions. Application of this principle to parallel programs requires identification of decisions that are difficult or subject to change, and the development of techniques for hiding these decisions. We experiment with three complex applications, and identify mapping, communication, and scheduling as areas in which decisions are particularly problematic. We develop computational abstractions that hide such decisions, and show that these abstractions can be used to develop elegant solutions to programming problems. In particular, they allow us to encode common structures, such as transforms, reductions, and meshes, as software cells and templates that can be reused in different applications. An important characteristic of these structures is that they do not incorporate mapping, communication, or scheduling decisions: these aspects of the design are specified separately, when composing existing structures to form applications. This separation of concerns allows the same cells and templates to be reused in different contexts.

  16. Device for balancing parallel strings

    DOEpatents

    Mashikian, Matthew S. (Storrs, CT)

    1985-01-01

    A battery plant is described which features magnetic circuit means in association with each of the battery strings in the battery plant for balancing the electrical current flow through the battery strings by equalizing the voltage across each of the battery strings. Each of the magnetic circuit means generally comprises means for sensing the electrical current flow through one of the battery strings, and a saturable reactor having a main winding connected electrically in series with the battery string, a bias winding connected to a source of alternating current and a control winding connected to a variable source of direct current controlled by the sensing means. Each of the battery strings is formed by a plurality of batteries connected electrically in series, and these battery strings are connected electrically in parallel across common bus conductors.

  17. Hybrid Optimization Parallel Search PACKage

    SciTech Connect

    2009-11-10

    HOPSPACK is open source software for solving optimization problems without derivatives. Application problems may have a fully nonlinear objective function, bound constraints, and linear and nonlinear constraints. Problem variables may be continuous, integer-valued, or a mixture of both. The software provides a framework that supports any derivative-free type of solver algorithm. Through the framework, solvers request parallel function evaluation, which may use MPI (multiple machines) or multithreading (multiple processors/cores on one machine). The framework provides a Cache and Pending Cache of saved evaluations that reduces execution time and facilitates restarts. Solvers can dynamically create other algorithms to solve subproblems, a useful technique for handling multiple start points and integer-valued variables. HOPSPACK ships with the Generating Set Search (GSS) algorithm, developed at Sandia as part of the APPSPACK open source software project.
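
    The cache and pending-cache idea described above can be illustrated with a minimal Python sketch. This is not HOPSPACK's API: the class `CachedEvaluator`, its `submit` method, and the `sphere` objective are hypothetical names chosen for illustration, and a thread pool stands in for the MPI or multithreaded evaluators the abstract mentions. Finished evaluations are stored in a cache; points whose evaluation is still running are tracked in a pending table so that a repeated request shares the in-flight result instead of starting a second evaluation.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from threading import Lock


class CachedEvaluator:
    """Hypothetical helper: memoize finished evaluations and share in-flight ones."""

    def __init__(self, objective, max_workers=4):
        self._objective = objective
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._cache = {}      # point -> finished value
        self._pending = {}    # point -> Future that is still running
        self._lock = Lock()
        self.evaluations = 0  # number of real objective calls started

    def submit(self, point):
        """Return a Future for objective(point), reusing earlier work if possible."""
        key = tuple(point)
        with self._lock:
            if key in self._cache:
                done = Future()
                done.set_result(self._cache[key])
                return done
            if key in self._pending:
                return self._pending[key]
            self.evaluations += 1
            future = self._pool.submit(self._run, key)
            self._pending[key] = future
            return future

    def _run(self, key):
        value = self._objective(key)
        with self._lock:
            # Move the point from the pending table to the cache.
            self._cache[key] = value
            self._pending.pop(key, None)
        return value


def sphere(x):
    """Illustrative derivative-free objective (an assumption, not from the source)."""
    return sum(v * v for v in x)


evaluator = CachedEvaluator(sphere)
futures = [evaluator.submit(p) for p in [(1.0, 2.0), (0.0, 3.0), (1.0, 2.0)]]
values = [f.result() for f in futures]
```

    In this sketch the repeated point `(1.0, 2.0)` is served from either the pending table or the cache, so the objective is called only twice for three requests.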

  18. Parallel Performance Characterization of Columbia

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak

    2004-01-01

    Using a collection of benchmark problems of increasing levels of realism and computational effort, we will characterize the strengths and limitations of the 10,240-processor Columbia system to deliver supercomputing value to application scientists. Scientists need to be able to determine if and how they can utilize Columbia to carry extreme workloads, either in terms of ultra-large applications that cannot be run otherwise (capability), or in terms of very large ensembles of medium-scale applications to populate response matrices (capacity). We select existing application benchmarks that scale from a small number of processors to the entire machine, and that highlight different issues in running supercomputing-class applications, such as the various types of memory access, file I/O, inter- and intra-node communications and parallelization paradigms. http://www.nas.nasa.gov/Software/NPB/

  19. Hybrid Optimization Parallel Search PACKage

    Energy Science and Technology Software Center (ESTSC)

    2009-11-10

    HOPSPACK is open source software for solving optimization problems without derivatives. Application problems may have a fully nonlinear objective function, bound constraints, and linear and nonlinear constraints. Problem variables may be continuous, integer-valued, or a mixture of both. The software provides a framework that supports any derivative-free type of solver algorithm. Through the framework, solvers request parallel function evaluation, which may use MPI (multiple machines) or multithreading (multiple processors/cores on one machine). The framework provides a Cache and Pending Cache of saved evaluations that reduces execution time and facilitates restarts. Solvers can dynamically create other algorithms to solve subproblems, a useful technique for handling multiple start points and integer-valued variables. HOPSPACK ships with the Generating Set Search (GSS) algorithm, developed at Sandia as part of the APPSPACK open source software project.

  20. Parallel Rendering of Large Time-Varying Volume Data

    NASA Technical Reports Server (NTRS)

    Garbutt, Alexander E.

    2005-01-01

    Interactive visualization of large time-varying 3D volume datasets has been and still is a great challenge to the modern computational world. It stretches the limits of the memory capacity, the disk space, the network bandwidth and the CPU speed of a conventional computer. In this SURF project, we propose to develop a parallel volume rendering program on SGI's Prism, a cluster computer equipped with state-of-the-art graphic hardware. The proposed program combines both parallel computing and hardware rendering in order to achieve an interactive rendering rate. We use 3D texture mapping and a hardware shader to implement 3D volume rendering on each workstation. We use SGI's VisServer to enable remote rendering using Prism's graphic hardware. And last, we will integrate this new program with ParVox, a parallel distributed visualization system developed at JPL. At the end of the project, we will demonstrate remote interactive visualization using this new hardware volume renderer on JPL's Prism System using a time-varying dataset from selected JPL applications.

  1. Parallel computing in enterprise modeling.

    SciTech Connect

    Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.

    2008-08-01

    This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.

  2. Integrated Task and Data Parallel Programming

    NASA Technical Reports Server (NTRS)

    Grimshaw, A. S.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers 1995 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated with Andrew Grimshaw and Adam Ferrari to write a book chapter which will be included in Parallel Processing in C++ edited by Gregory Wilson. I also finished two courses, Compilers and Advanced Compilers, in 1995. These courses complete my class requirements at the University of Virginia. I have only my dissertation research and defense to complete.

  3. Parallel processing considerations for image recognition tasks

    NASA Astrophysics Data System (ADS)

    Simske, Steven J.

    2011-01-01

    Many image recognition tasks are well-suited to parallel processing. The most obvious example is that many imaging tasks require the analysis of multiple images. From this standpoint, then, parallel processing need be no more complicated than assigning individual images to individual processors. However, there are three less trivial categories of parallel processing that will be considered in this paper: parallel processing (1) by task; (2) by image region; and (3) by meta-algorithm. Parallel processing by task allows the assignment of multiple workflows, as diverse as optical character recognition [OCR], document classification and barcode reading, to parallel pipelines. This can substantially decrease time to completion for the document tasks. For this approach, each parallel pipeline is generally performing a different task. Parallel processing by image region allows a larger imaging task to be sub-divided into a set of parallel pipelines, each performing the same task but on a different data set. This type of image analysis is readily addressed by a map-reduce approach. Examples include document skew detection and multiple face detection and tracking. Finally, parallel processing by meta-algorithm allows different algorithms to be deployed on the same image simultaneously. This approach may result in improved accuracy.
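
    The "by image region" category above can be sketched as a small map-reduce in Python. This is only an illustration under stated assumptions: the image is a plain list of pixel rows, the per-region task (counting dark pixels) is a stand-in for a real analysis such as skew or face detection, and the helper names `split_rows`, `count_dark`, and `dark_pixels` are hypothetical. A thread pool is used to keep the example self-contained; for CPU-bound work in CPython a process pool would likely be swapped in.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_rows(image, parts):
    """Split a 2D image (list of rows) into at most `parts` horizontal bands."""
    step = max(1, -(-len(image) // parts))  # ceiling division
    return [image[i:i + step] for i in range(0, len(image), step)]


def count_dark(band, threshold=128):
    """Map step: the same task applied independently to one region."""
    return sum(1 for row in band for pixel in row if pixel < threshold)


def dark_pixels(image, parts=4):
    """Run the map step on each band in parallel, then reduce the partial counts."""
    bands = split_rows(image, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partial = list(pool.map(count_dark, bands))    # map
    return reduce(lambda a, b: a + b, partial, 0)      # reduce


# Synthetic 6x8 grayscale image with values in 0..255 (illustrative data only).
image = [[(r * 7 + c * 13) % 256 for c in range(8)] for r in range(6)]
total = dark_pixels(image)
```

    Because each band is processed independently and the partial results are combined by an associative operation, the parallel total matches a single-pass count over the whole image.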

  4. Fully Parallel MHD Stability Analysis Tool

    NASA Astrophysics Data System (ADS)

    Svidzinski, Vladimir; Galkin, Sergei; Kim, Jin-Soo; Liu, Yueqiang

    2014-10-01

    Progress on full parallelization of the plasma stability code MARS will be reported. MARS calculates eigenmodes in 2D axisymmetric toroidal equilibria in MHD-kinetic plasma models. It is a powerful tool for studying MHD and MHD-kinetic instabilities and it is widely used by the fusion community. The parallel version of MARS is intended for simulations on local parallel clusters. It will be an efficient tool for simulation of MHD instabilities with low, intermediate and high toroidal mode numbers within both fluid and kinetic plasma models, already implemented in MARS. Parallelization of the code includes parallelization of the construction of the matrix for the eigenvalue problem and parallelization of the inverse iterations algorithm, implemented in MARS for the solution of the formulated eigenvalue problem. Construction of the matrix is parallelized by distributing the load among processors assigned to different magnetic surfaces. Parallelization of the solution of the eigenvalue problem is made by repeating steps of the present MARS algorithm using parallel libraries and procedures. Initial results of the code parallelization will be reported. Work is supported by the U.S. DOE SBIR program.
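
    For readers unfamiliar with inverse iteration, the textbook form of the method can be sketched in Python as follows. This is a serial illustration on a tiny dense symmetric matrix, not the MARS implementation: the functions `solve` and `inverse_iteration`, the 2x2 test matrix, and the shift value are all assumptions made for the example. Each step solves a linear system with the shifted matrix and renormalizes; the iterate converges to the eigenvector whose eigenvalue lies closest to the shift. In a parallel code, the linear solve is the step that would be handed to a parallel library.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def inverse_iteration(a, shift, iterations=50):
    """Approximate the eigenpair of `a` whose eigenvalue is nearest to `shift`."""
    n = len(a)
    shifted = [[a[i][j] - (shift if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    x = [1.0] * n
    for _ in range(iterations):
        y = solve(shifted, x)                    # y = (A - shift*I)^-1 x
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]                # keep the iterate at unit length
    ax = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(xi * axi for xi, axi in zip(x, ax))  # Rayleigh quotient
    return eigenvalue, x


# Illustrative matrix with eigenvalues (5 - sqrt(5))/2 and (5 + sqrt(5))/2.
a = [[2.0, 1.0], [1.0, 3.0]]
value, vector = inverse_iteration(a, shift=1.0)
```

    With the shift at 1.0, the sketch converges to the smaller eigenvalue, (5 - sqrt(5))/2, because it is the one nearest the shift.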

  5. STALK: an interactive virtual molecular docking system.

    SciTech Connect

    Levine, D.; Facello, M.; Hallstrom, P.; Reeder, G.; Walenz, B.; Stevens, F.; Univ. of Illinois

    1997-04-01

    Several recent technologies (genetic algorithms, parallel and distributed computing, virtual reality, and high-speed networking) underlie a new approach to the computational study of how biomolecules interact or 'dock' together. With the Stalk system, a user in a virtual reality environment can interact with a genetic algorithm running on a parallel computer to help in the search for likely geometric configurations.

  6. Linearly exact parallel closures for slab geometry

    SciTech Connect

    Ji, Jeong-Young; Held, Eric D.; Jhang, Hogun

    2013-08-15

    Parallel closures are obtained by solving a linearized kinetic equation with a model collision operator using the Fourier transform method. The closures expressed in wave number space are exact for time-dependent linear problems to within the limits of the model collision operator. In the adiabatic, collisionless limit, an inverse Fourier transform is performed to obtain integral (nonlocal) parallel closures in real space; parallel heat flow and viscosity closures for density, temperature, and flow velocity equations replace Braginskii's parallel closure relations, and parallel flow velocity and heat flow closures for density and temperature equations replace Spitzer's parallel transport relations. It is verified that the closures reproduce the exact linear response function of Hammett and Perkins [Phys. Rev. Lett. 64, 3019 (1990)] for Landau damping given a temperature gradient. In contrast to their approximate closures where the vanishing viscosity coefficient numerically gives an exact response, our closures relate the heat flow and nonvanishing viscosity to temperature and flow velocity (gradients).

  7. Parallel computing for probabilistic fatigue analysis

    NASA Technical Reports Server (NTRS)

    Sues, Robert H.; Lua, Yuan J.; Smith, Mark D.

    1993-01-01

    This paper presents the results of Phase I research to investigate the most effective parallel processing software strategies and hardware configurations for probabilistic structural analysis. We investigate the efficiency of both shared and distributed-memory architectures via a probabilistic fatigue life analysis problem. We also present a parallel programming approach, the virtual shared-memory paradigm, that is applicable across both types of hardware. Using this approach, problems can be solved on a variety of parallel configurations, including networks of single or multiprocessor workstations. We conclude that it is possible to effectively parallelize probabilistic fatigue analysis codes; however, special strategies will be needed to achieve large-scale parallelism, to keep large numbers of processors busy, and to treat problems with the large memory requirements encountered in practice. We also conclude that distributed-memory architecture is preferable to shared-memory for achieving large-scale parallelism; however, in the future, the currently emerging hybrid-memory architectures will likely be optimal.

  8. Towards Distributed Memory Parallel Program Analysis

    SciTech Connect

    Quinlan, D; Barany, G; Panas, T

    2008-06-17

    This paper presents a parallel attribute evaluation for distributed memory parallel computer architectures where previously only shared memory parallel support for this technique has been developed. Attribute evaluation is a part of how attribute grammars are used for program analysis within modern compilers. Within this work, we have extended ROSE, an open compiler infrastructure, with a distributed memory parallel attribute evaluation mechanism to support user-defined global program analysis required for some forms of security analysis which cannot be addressed by a file-by-file view of large-scale applications. As a result, user-defined security analyses may now run in parallel without the user having to specify the way data is communicated between processors. The automation of communication enables an extensible open-source parallel program analysis infrastructure.

  9. Design considerations for parallel graphics libraries

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1994-01-01

    Applications which run on parallel supercomputers are often characterized by massive datasets. Converting these vast collections of numbers to visual form has proven to be a powerful aid to comprehension. For a variety of reasons, it may be desirable to provide this visual feedback at runtime. One way to accomplish this is to exploit the available parallelism to perform graphics operations in place. In order to do this, we need appropriate parallel rendering algorithms and library interfaces. This paper provides a tutorial introduction to some of the issues which arise in designing parallel graphics libraries and their underlying rendering algorithms. The focus is on polygon rendering for distributed memory message-passing systems. We illustrate our discussion with examples from PGL, a parallel graphics library which has been developed on the Intel family of parallel systems.

  10. Parallel automated adaptive procedures for unstructured meshes

    NASA Technical Reports Server (NTRS)

    Shephard, M. S.; Flaherty, J. E.; Decougny, H. L.; Ozturan, C.; Bottasso, C. L.; Beall, M. W.

    1995-01-01

    Consideration is given to the techniques required to support adaptive analysis of automatically generated unstructured meshes on distributed memory MIMD parallel computers. The key areas of new development are focused on the support of effective parallel computations when the structure of the numerical discretization, the mesh, is evolving, and in fact constructed, during the computation. All the procedures presented operate in parallel on already distributed mesh information. Starting from a mesh definition in terms of a topological hierarchy, techniques to support the distribution, redistribution and communication among the mesh entities over the processors are given, and algorithms to dynamically balance processor workload based on the migration of mesh entities are given. A procedure to automatically generate meshes in parallel, starting from CAD geometric models, is given. Parallel procedures to enrich the mesh through local mesh modifications are also given. Finally, the combination of these techniques to produce a parallel automated finite element analysis procedure for rotorcraft aerodynamics calculations is discussed and demonstrated.

  11. Parallel Activities in the Classroom

    ERIC Educational Resources Information Center

    Koole, Tom

    2007-01-01

    This paper reports on a study of classroom interaction as a multi-party and multi-activity phenomenon. On the basis of video-recorded lessons in secondary education schools in the Netherlands, observational records were made of the behaviour of individual students throughout lessons. The main argument in this paper is that when students engage in…

  12. Multipactor saturation in parallel-plate waveguides

    SciTech Connect

    Sorolla, E.; Mattes, M.

    2012-07-15

    The saturation stage of a multipactor discharge is considered of interest, since it can guide towards a criterion to assess the multipactor onset. The electron cloud under multipactor regime within a parallel-plate waveguide is modeled by a thin continuous distribution of charge and the equations of motion are calculated taking into account the space charge effects. The saturation is identified by the interaction of the electron cloud with its image charge. The stability of the electron population growth is analyzed and two mechanisms of saturation to explain the steady-state multipactor for voltages just above the threshold onset are identified. The impact energy in the collision against the metal plates decreases during the electron population growth due to the attraction of the electron sheet on the image through the initial plate. When this growth remains stable till the impact energy reaches the first cross-over point, the electron surface density tends to a constant value. When the stability is broken before reaching the first cross-over point, the surface charge density oscillates chaotically, bounded within a certain range. In this case, an expression to calculate the maximum electron surface charge density is found whose predictions agree with the simulations when the voltage is not too high.

  13. Automatic Multilevel Parallelization Using OpenMP

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

    2002-01-01

    In this paper we describe the extension of the CAPO parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report first results for several benchmark codes and one full application that have been parallelized using our system.

  14. Instant well-log inversion with a parallel computer

    SciTech Connect

    Kimminau, S.J.; Trivedi, H.

    1993-08-01

    Well-log analysis requires several vectors of input data to be inverted with a physical model that produces more vectors of output data. The problem is inherently suited to either vectorization or parallelization. PLATO (parallel log analysis, timely output) is a research prototype system that uses a parallel architecture computer with memory-mapped graphics to invert vector data and display the result rapidly. By combining this high-performance computing and display system with a graphical user interface, the analyst can interact with the system in "real time" and can visualize the result of changing parameters on up to 1,000 levels of computed volumes and reconstructed logs. It is expected that such "instant" inversion will remove the main disadvantages frequently cited for simultaneous analysis methods, namely difficulty in assessing sensitivity to different parameters and slow output response. Although the prototype system uses highly specific features of a parallel processor, a subsequent version has been implemented on a conventional (serial) workstation with less performance but adequate functionality to preserve the apparently instant response. PLATO demonstrates the feasibility of petroleum computing applications combining an intuitive graphical interface, high-performance computing of physical models, and real-time output graphics.

  15. Toward an automated parallel computing environment for geosciences

    NASA Astrophysics Data System (ADS)

    Zhang, Huai; Liu, Mian; Shi, Yaolin; Yuen, David A.; Yan, Zhenzhen; Liang, Guoping

    2007-08-01

    Software for geodynamic modeling has not kept up with the fast growing computing hardware and network resources. In the past decade supercomputing power has become available to most researchers in the form of affordable Beowulf clusters and other parallel computer platforms. However, to take full advantage of such computing power requires developing parallel algorithms and associated software, a task that is often too daunting for geoscience modelers whose main expertise is in geosciences. We introduce here an automated parallel computing environment built on open-source algorithms and libraries. Users interact with this computing environment by specifying the partial differential equations, solvers, and model-specific properties using an English-like modeling language in the input files. The system then automatically generates the finite element codes that can be run on distributed or shared memory parallel machines. This system is dynamic and flexible, allowing users to address different problems in geosciences. It is capable of providing web-based services, enabling users to generate source codes online. This unique feature will facilitate high-performance computing to be integrated with distributed data grids in the emerging cyber-infrastructures for geosciences. In this paper we discuss the principles of this automated modeling environment and provide examples to demonstrate its versatility.

  16. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1995-01-01

    The scope of this project dealt with the investigation of the requirements to support distributed computing of scientific computations over a cluster of cooperative workstations. Various experiments on computations for the solution of simultaneous linear equations were performed in the early phase of the project to gain experience in the general nature and requirements of scientific applications. A specification of a distributed integrated computing environment, DICE, based on a distributed shared memory communication paradigm has been developed and evaluated. The distributed shared memory model facilitates porting existing parallel algorithms that have been designed for shared memory multiprocessor systems to the new environment. The potential of this new environment is to provide supercomputing capability through the utilization of the aggregate power of workstations cooperating in a cluster interconnected via a local area network. Workstations, generally, do not have the computing power to tackle complex scientific applications, making them primarily useful for visualization, data reduction, and filtering as far as complex scientific applications are concerned. There is a tremendous amount of computing power that is left unused in a network of workstations. Very often a workstation is simply sitting idle on a desk. A set of tools can be developed to take advantage of this potential computing power to create a platform suitable for large scientific computations. The integration of several workstations into a logical cluster of distributed, cooperative, computing stations presents an alternative to shared memory multiprocessor systems. In this project we designed and evaluated such a system.

  17. Do parallel beta-helix proteins have a unique Fourier transform infrared spectrum?

    PubMed Central

    Khurana, R; Fink, A L

    2000-01-01

    Several polypeptides have been found to adopt an unusual domain structure known as the parallel beta-helix. These domains are characterized by parallel beta-strands, three of which form a single parallel beta-helix coil, and lead to long, extended beta-sheets. We have used ATR-FTIR (attenuated total reflectance-Fourier transform infrared spectroscopy) to analyze the secondary structure of representative examples of this class of protein. Because the three-dimensional structures of parallel beta-helix proteins are unique, we initiated this study to determine if there was a corresponding unique FTIR signal associated with the parallel beta-helix conformation. Analysis of the amide I region, emanating from the carbonyl stretch vibration, reveals a strong absorbance band at 1638 cm(-1) in each of the parallel beta-helix proteins. This band is assigned to the parallel beta-sheet structure. However, components at this frequency are also commonly observed for beta-sheets in many classes of globular proteins. Thus we conclude that there is no unique infrared signature for parallel beta-helix structure. Additional contributions in the 1638 cm(-1) region, and at lower frequencies, were ascribed to hydrogen bonding between the coils in the loop/turn regions and amide side-chain interactions, respectively. A 13-residue peptide that forms fibrils and has been proposed to form beta-helical structure was also examined, and its FTIR spectrum was compared to that of the parallel beta-helix proteins. PMID:10653812

  18. Parallel debugging using graphical views. Technical report

    SciTech Connect

    Bailey, M.; Socha, D.; Notkin, D.

    1988-03-01

    Graphical views are essential for debugging parallel programs because of the large quantity of state information contained in parallel programs. Voyeur, a prototype system for creating graphical views of parallel programs, provides a cost-effective way to construct such views for any parallel-programming system. We illustrate Voyeur by discussing four views created for debugging Poker programs. One is a general trace facility for any Poker program. The other three are tailored to display a specific type of algorithmic information. Each of these views has been instrumental in detecting bugs that would have been difficult to detect otherwise, yet were obvious with the views.

  19. A parallel algorithm for global routing

    NASA Technical Reports Server (NTRS)

    Brouwer, Randall J.; Banerjee, Prithviraj

    1990-01-01

    A Parallel Hierarchical algorithm for Global Routing (PHIGURE) is presented. The router is based on the work of Burstein and Pelavin, but has many extensions for general global routing and parallel execution. Main features of the algorithm include structured hierarchical decomposition into separate independent tasks which are suitable for parallel execution and adaptive simplex solution for adding feedthroughs and adjusting channel heights for row-based layout. Alternative decomposition methods and the various levels of parallelism available in the algorithm are examined closely. The algorithm is described and results are presented for a shared-memory multiprocessor implementation.

  20. Data-parallel algorithms for image computing

    NASA Astrophysics Data System (ADS)

    Carlotto, Mark J.

    1990-11-01

    Data-parallel algorithms for image computing on the Connection Machine are described. After a brief review of some basic programming concepts in *Lisp, a parallel extension of Common Lisp, data-parallel programming paradigms based on a local (diffusion-like) model of computation, the scan model of computation, a general interprocessor communications model, and a region-based model are introduced. Algorithms for connected component labeling, distance transformation, Voronoi diagrams, finding minimum cost paths, local means, shape-from-shading, hidden surface calculations, affine transformation, oblique parallel projection, and spatial operations over regions are presented. A new algorithm for interpolating irregularly spaced data via Voronoi diagrams is also described.

  1. Parallel programming in Split-C

    SciTech Connect

    Culler, D.E.; Dusseau, A.; Goldstein, S.C.; Krishnamurthy, A.; Lumetta, S.; Eicken, T. von; Yelick, K.

    1993-12-31

    The authors introduce the Split-C language, a parallel extension of C intended for high performance programming on distributed memory multiprocessors, and demonstrate the use of the language in optimizing parallel programs. Split-C provides a global address space with a clear concept of locality and unusual assignment operators. These are used as tools to reduce the frequency and cost of remote access. The language allows a mixture of shared memory, message passing, and data parallel programming styles while providing efficient access to the underlying machine. They demonstrate the basic language concepts using regular and irregular parallel programs and give performance results for various stages of program optimization.

  2. Parallel Genetic Algorithm for Alpha Spectra Fitting

    NASA Astrophysics Data System (ADS)

    García-Orellana, Carlos J.; Rubio-Montero, Pilar; González-Velasco, Horacio

    2005-01-01

    We present a performance study of alpha-particle spectra fitting using a parallel Genetic Algorithm (GA). The method uses a two-step approach. In the first step we run the parallel GA to find an initial solution for the second step, in which we use the Levenberg-Marquardt (LM) method for a precise final fit. GA is a resource-demanding method, so we use a Beowulf cluster for parallel simulation. The relationship between simulation time (and parallel efficiency) and the number of processors is studied using several alpha spectra, with the aim of obtaining a method to estimate the optimal number of processors to use in a simulation.

  3. Parallel auto-correlative statistics with VTK.

    SciTech Connect

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by the means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the autocorrelative statistics engine.

  4. VLSI complexity of parallel Fourier transform algorithms

    SciTech Connect

    Baradaran Seyed, T.

    1989-01-01

    Scope and method of study. The purpose of this study is to present a set of new parallel algorithms for the discrete Fourier transform and compare the VLSI time and area complexity of the associated designs with the existing designs. The proposed parallel algorithms may be implemented easily in pipeline and mesh-connected parallel processing systems. Findings and conclusions. Several parallel algorithms have been proposed and associated cell layouts for VLSI implementation have been presented. Comparative analysis shows that two of the designs presented by this study have better area-time performance than the existing designs in their architectural category.

  5. Parallelization of Apriori algorithm using Charm++ library

    NASA Astrophysics Data System (ADS)

    Puścian, Marek; Grabski, Waldemar

    2015-09-01

    This paper deals with the problem of adapting a sequential frequent itemset mining algorithm to parallel processing. The original Bodon's Apriori algorithm has been partitioned into loosely coupled tasks and prepared to be executed on several computation nodes using the Charm++ library. A variety of optimization methods have been proposed and successfully implemented in a parallel environment. The work provides enhancements to achieve good efficiency during parallelization of existing solutions, e.g., how to organize communication between tasks. The presented approach has been illustrated with many experiments and measurements performed on the parallelized algorithm.

  6. Analysis of the numerical effects of parallelism on a parallel genetic algorithm

    SciTech Connect

    Hart, W.E.; Belew, R.K.; Kohn, S.; Baden, S.

    1995-09-18

    This paper examines the effects of relaxed synchronization on both the numerical and parallel efficiency of parallel genetic algorithms (GAs). We describe a coarse-grain geographically structured parallel genetic algorithm. Our experiments show that asynchronous versions of these algorithms have a lower run time than synchronous GAs. Furthermore, we demonstrate that this improvement in performance is partly due to the fact that the numerical efficiency of the asynchronous genetic algorithm is better than that of the synchronous genetic algorithm. Our analysis includes a critique of the utility of traditional parallel performance measures for parallel GAs, and we evaluate the claims made by several researchers that parallel GAs can have superlinear speedup.

  7. Use Computer-Aided Tools to Parallelize Large CFD Applications

    NASA Technical Reports Server (NTRS)

    Jin, H.; Frumkin, M.; Yan, J.

    2000-01-01

    Porting applications to high performance parallel computers is always a challenging task. It is time consuming and costly. With rapid progress in hardware architectures and the increasing complexity of real applications in recent years, the problem becomes even more severe. Today, scalability and high performance mostly involve handwritten parallel programs using message-passing libraries (e.g. MPI). However, this process is very difficult and often error-prone. The recent reemergence of shared memory parallel (SMP) architectures, such as the cache coherent Non-Uniform Memory Access (ccNUMA) architecture used in the SGI Origin 2000, shows good prospects for scaling beyond hundreds of processors. Programming on an SMP is simplified by working in a globally accessible address space. The user can supply compiler directives, such as OpenMP, to parallelize the code. As an industry standard for portable implementation of parallel programs for SMPs, OpenMP is a set of compiler directives and callable runtime library routines that extend Fortran, C and C++ to express shared memory parallelism. It promises an incremental path for parallel conversion of existing software, as well as scalability and performance for a complete rewrite or an entirely new development. Perhaps the main disadvantage of programming with directives is that inserted directives may not necessarily enhance performance. In the worst cases, they can create erroneous results. While vendors have provided tools to perform error-checking and profiling, automation in directive insertion is very limited and often fails on large programs, primarily due to the lack of a thorough enough data dependence analysis. To overcome the deficiency, we have developed a toolkit, CAPO, to automatically insert OpenMP directives in Fortran programs and apply certain degrees of optimization.
CAPO is aimed at taking advantage of detailed inter-procedural dependence analysis provided by CAPTools, developed by the University of Greenwich, to reduce potential errors made by users. Earlier tests on NAS Benchmarks and ARC3D have demonstrated good success of this tool. In this study, we have applied CAPO to parallelize three large applications in the area of computational fluid dynamics (CFD): OVERFLOW, TLNS3D and INS3D. These codes are widely used for solving Navier-Stokes equations with complicated boundary conditions and turbulence models in multiple zones. Each one comprises from 50K to 1,00k lines of FORTRAN77. As an example, CAPO took 77 hours to complete the data dependence analysis of OVERFLOW on a workstation (SGI, 175MHz, R10K processor). A fair amount of effort was spent on correcting false dependencies due to lack of necessary knowledge during the analysis. Even so, CAPO provides an easy way for the user to interact with the parallelization process. The OpenMP version was generated within a day after the analysis was completed. Due to the sequential algorithms involved, code sections in TLNS3D and INS3D need to be restructured by hand to produce more efficient parallel codes. An included figure shows preliminary test results of the generated OVERFLOW with several test cases in a single zone. The MPI data points for the small test case were taken from a handcoded MPI version. As we can see, CAPO's version has achieved an 18-fold speedup on 32 nodes of the SGI O2K. For the small test case, it outperformed the MPI version. These results are very encouraging, but further work is needed. For example, although CAPO attempts to place directives on the outermost parallel loops in an interprocedural framework, it does not insert directives based on the best manual strategy. In particular, it lacks support for parallelization at the multi-zone level.
Future work will emphasize the development of methodology to work at the multi-zone level and with a hybrid approach. Development of tools to perform more complicated code transformations is also needed.
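
    The 18-fold speedup on 32 nodes quoted above can be put in context with two standard measures: parallel efficiency and the Karp-Flatt experimentally determined serial fraction. The sketch below is illustrative arithmetic only. The function names are hypothetical, and nothing in it comes from CAPO or the paper itself.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / processors


def karp_flatt_serial_fraction(speedup: float, processors: int) -> float:
    """Karp-Flatt metric: e = (1/S - 1/p) / (1 - 1/p)."""
    return (1.0 / speedup - 1.0 / processors) / (1.0 - 1.0 / processors)


if __name__ == "__main__":
    s, p = 18.0, 32  # figures quoted in the abstract for OVERFLOW on the SGI O2K
    print(f"efficiency      = {parallel_efficiency(s, p):.3f}")
    print(f"serial fraction = {karp_flatt_serial_fraction(s, p):.4f}")
```

    For these inputs the efficiency is about 56 percent and the implied serial fraction is about 2.5 percent, which is one way to read the abstract's remark that further work is needed.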

  8. Dynamic force spectroscopy of parallel individual mucin1-antibody bonds

    SciTech Connect

    Sulchek, T A; Friddle, R W; Langry, K; Lau, E; Albrecht, H; Ratto, T; DeNardo, S; Colvin, M E; Noy, A

    2005-05-02

    We used atomic force microscopy (AFM) to measure the binding forces between Mucin1 (MUC1) peptide and a single chain antibody fragment (scFv) selected from a scFv library screened against MUC1. This binding interaction is central to the design of the molecules for targeted delivery of radioimmunotherapeutic agents for prostate and breast cancer treatment. Our experiments separated the specific binding interaction from non-specific interactions by tethering the antibody and MUC1 molecules to the AFM tip and sample surface with flexible polymer spacers. Rupture force magnitude and elastic characteristics of the spacers allowed identification of the bond rupture events corresponding to different number of interacting proteins. We used dynamic force spectroscopy to estimate the intermolecular potential widths and equivalent thermodynamic off rates for mono-, bi-, and tri-valent interactions. Measured interaction potential parameters agree with the results of molecular docking simulation. Our results demonstrate that an increase of the interaction valency leads to a precipitous decline in the dissociation rate. Binding forces measured for mono and multivalent interactions match the predictions of a Markovian model for the strength of multiple uncorrelated bonds in parallel configuration. Our approach is promising for comparison of the specific effects of molecular modifications as well as for determination of the best configuration of antibody-based multivalent targeting agents.

  9. Parallel methods for the flight simulation model

    SciTech Connect

    Xiong, Wei Zhong; Swietlik, C.

    1994-06-01

    The Advanced Computer Applications Center (ACAC) has been involved in evaluating advanced parallel architecture computers and the applicability of these machines to computer simulation models. The advanced systems investigated include parallel machines with shared memory and distributed architectures consisting of an eight processor Alliant FX/8, a twenty-four processor Sequent Symmetry, Cray XMP, IBM RISC 6000 model 550, and the Intel Touchstone eight processor Gamma and 512 processor Delta machines. Since parallelizing a truly efficient application program for the parallel machine is a difficult task, the implementation for these machines in a realistic setting has been largely overlooked. The ACAC has developed considerable expertise in optimizing and parallelizing application models on a collection of advanced multiprocessor systems. One such application model is the Flight Simulation Model, which uses a set of differential equations to describe the flight characteristics of a launched missile by means of a trajectory. The Flight Simulation Model was written in the FORTRAN language with approximately 29,000 lines of source code. Depending on the number of trajectories, the computation can require several hours to a full day of CPU time on a DEC/VAX 8650 system. There is an impetus to reduce the execution time and utilize the advanced parallel architecture computing environment available. ACAC researchers developed a parallel method that allows the Flight Simulation Model to run in parallel on the multiprocessor system. For the benchmark data tested, the parallel Flight Simulation Model implemented on the Alliant FX/8 has achieved nearly linear speedup. In this paper, we describe a parallel method for the Flight Simulation Model. We believe the method presented in this paper provides a general concept for the design of parallel applications. This concept, in most cases, can be adapted to many other sequential application programs.

  10. Parallel genotypic adaptation: when evolution repeats itself

    PubMed Central

    Wood, Troy E.; Burke, John M.; Rieseberg, Loren H.

    2008-01-01

    Until recently, parallel genotypic adaptation was considered unlikely because phenotypic differences were thought to be controlled by many genes. There is increasing evidence, however, that phenotypic variation sometimes has a simple genetic basis and that parallel adaptation at the genotypic level may be more frequent than previously believed. Here, we review evidence for parallel genotypic adaptation derived from a survey of the experimental evolution, phylogenetic, and quantitative genetic literature. The most convincing evidence of parallel genotypic adaptation comes from artificial selection experiments involving microbial populations. In some experiments, up to half of the nucleotide substitutions found in independent lineages under uniform selection are the same. Phylogenetic studies provide a means for studying parallel genotypic adaptation in non-experimental systems, but conclusive evidence may be difficult to obtain because homoplasy can arise for other reasons. Nonetheless, phylogenetic approaches have provided evidence of parallel genotypic adaptation across all taxonomic levels, not just microbes. Quantitative genetic approaches also suggest parallel genotypic evolution across both closely and distantly related taxa, but it is important to note that this approach cannot distinguish between parallel changes at homologous loci versus convergent changes at closely linked non-homologous loci. The finding that parallel genotypic adaptation appears to be frequent and occurs at all taxonomic levels has important implications for phylogenetic and evolutionary studies. With respect to phylogenetic analyses, parallel genotypic changes, if common, may result in faulty estimates of phylogenetic relationships. From an evolutionary perspective, the occurrence of parallel genotypic adaptation provides increasing support for determinism in evolution and may provide a partial explanation for how species with low levels of gene flow are held together. PMID:15881688

  11. Parallel preoptic pathways for thermoregulation.

    PubMed

    Yoshida, Kyoko; Li, Xiaodong; Cano, Georgina; Lazarus, Michael; Saper, Clifford B

    2009-09-23

    Sympathetic premotor neurons in the rostral medullary raphe (RMR) regulate heat conservation by tail artery vasoconstriction and brown adipose tissue thermogenesis. These neurons are a critical relay in the pathway that increases body temperature. However, the origins of the inputs that activate the RMR during cold exposure have not been definitively identified. We investigated the afferents to the RMR that were activated during cold by examining Fos expression in retrogradely labeled neurons after injection of cholera toxin B subunit (CTb) in the RMR. These experiments identified a cluster of Fos-positive neurons in the dorsomedial hypothalamic nucleus and dorsal hypothalamic area (DMH/DHA) with projections to the RMR that may mediate cold-induced elevation of body temperature. Also, neurons in the median preoptic nucleus (MnPO) and dorsolateral preoptic area (DLPO) and in the A7 noradrenergic cell group were retrogradely labeled but lacked Fos expression, suggesting that they may inhibit the RMR. To investigate whether individual or common preoptic neurons project to the RMR and DMH/DHA, we injected CTb into the RMR and Fluorogold into the DMH/DHA. We found that projections from the DLPO and MnPO to the RMR and DMH/DHA emerge from largely separate neuronal populations, indicating they may be differentially regulated. Combined cell-specific lesions of MnPO and DLPO, but not lesions of either one alone, caused baseline hyperthermia. Our data suggest that the MnPO and DLPO provide parallel inhibitory pathways that tonically inhibit the DMH/DHA and the RMR at baseline, and that hyperthermia requires the release of this inhibition from both nuclei. PMID:19776281

  12. Nerve-pulse interactions

    SciTech Connect

    Scott, A.C.

    1982-01-01

    Some recent experimental and theoretical results on mechanisms through which individual nerve pulses can interact are reviewed. Three modes of interactions are considered: (1) interaction of pulses as they travel along a single fiber which leads to velocity dispersion; (2) propagation of pairs of pulses through a branching region leading to quantum pulse code transformations; and (3) interaction of pulses on parallel fibers through which they may form a pulse assembly. This notion is analogous to Hebb's concept of a cell assembly, but on a lower level of the neural hierarchy.

  13. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  14. MULTIOBJECTIVE PARALLEL GENETIC ALGORITHM FOR WASTE MINIMIZATION

    EPA Science Inventory

    In this research we have developed an efficient multiobjective parallel genetic algorithm (MOPGA) for waste minimization problems. This MOPGA integrates PGAPack (Levine, 1996) and NSGA-II (Deb, 2000) with novel modifications. PGAPack is a master-slave parallel implementation of a...

  15. RAM-Based parallel-output controller

    NASA Technical Reports Server (NTRS)

    Niswander, J. K.; Stattel, R. J.

    1980-01-01

    Selected bit strings in serial-data link are extracted for processing. Controller is programmable interface between serial-data link and peripherals that accept parallel data. It can be used to drive displays, printers, plotters, digital-to-analog converters, and parallel-output ports.

  16. Parallel Activation in Bilingual Phonological Processing

    ERIC Educational Resources Information Center

    Lee, Su-Yeon

    2011-01-01

    In bilingual language processing, the parallel activation hypothesis suggests that bilinguals activate their two languages simultaneously during language processing. Support for the parallel activation mainly comes from studies of lexical (word-form) processing, with relatively less attention to phonological (sound) processing. According to

  17. Software For Diagnosis Of Parallel Processing

    NASA Technical Reports Server (NTRS)

    Hontalas, Philip; Yan, Jerry; Fineman, Charles

    1995-01-01

    Ames Instrumentation System (AIMS) computer program package of software tools measuring and analyzing performances of parallel-processing application programs. Helps programmer to debug and refine, and to monitor and visualize execution of, parallel-processing application software for Intel iPSC/860 (or equivalent) multicomputer. Performance data collected displayed graphically on computer workstations supporting X-Windows.

  18. Parallel Computing Strategies for Irregular Algorithms

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.

  19. 17 CFR 12.24 - Parallel proceedings.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 17 Commodity and Securities Exchanges 1 2011-04-01 2011-04-01 false Parallel proceedings. 12.24 Section 12.24 Commodity and Securities Exchanges COMMODITY FUTURES TRADING COMMISSION RULES RELATING TO REPARATIONS General Information and Preliminary Consideration of Pleadings 12.24 Parallel proceedings. (a) Definition. For purposes of...

  20. EPIC: E-field Parallel Imaging Correlator

    NASA Astrophysics Data System (ADS)

    Thyagarajan, Nithyanandan; Beardsley, Adam P.; Bowman, Judd D.; Morales, Miguel F.

    2015-11-01

    E-field Parallel Imaging Correlator (EPIC), a highly parallelized Object Oriented Python package, implements the Modular Optimal Frequency Fourier (MOFF) imaging technique. It also includes visibility-based imaging using the software holography technique and a simulator for generating electric fields from a sky model. EPIC can accept dual-polarization inputs and produce images of all four instrumental cross-polarizations.

  1. Calculating real Delbrück amplitudes on parallel processors

    NASA Astrophysics Data System (ADS)

    Kahane, Sylvian

    1991-12-01

    Calculation of the real Delbrück scattering amplitudes is parallelized by concurrent evaluation of 20 four-dimensional integrals. Two approaches were used: (a) a farm of master and worker tasks, and (b) the Cubix concept of parallelization. We discuss load balancing, timing and the efficiency of the implementation.
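
    The master/worker farm in approach (a) can be sketched as follows. This is a minimal illustration, not the authors' code: the integrand is a hypothetical stand-in (the actual Delbrück integrands are not given in the abstract), the quadrature is a plain tensor-product midpoint rule, and a thread pool plays the role of the worker tasks.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def midpoint_integral_4d(f, n=8):
    """Tensor-product midpoint rule for f over the unit hypercube [0, 1]^4."""
    h = 1.0 / n
    nodes = [(i + 0.5) * h for i in range(n)]
    return sum(f(x, y, z, w) for x, y, z, w in product(nodes, repeat=4)) * h**4


def make_integrand(k):
    """Hypothetical stand-in integrand; its exact integral is (k + 1) / 16."""
    return lambda x, y, z, w: (k + 1) * x * y * z * w


def farm(num_tasks=20, num_workers=4):
    """Master hands each independent integral to a pool of workers."""
    integrands = [make_integrand(k) for k in range(num_tasks)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() preserves task order, so results[k] belongs to integrand k.
        return list(pool.map(midpoint_integral_4d, integrands))


if __name__ == "__main__":
    for k, value in enumerate(farm()):
        print(k, value)
```

    Because the 20 integrals are independent, the only coordination needed is handing out tasks and collecting results. In standard CPython, threads do not speed up CPU-bound work, so a process pool (`concurrent.futures.ProcessPoolExecutor` with a module-level integrand) would be the closer analogue of a real farm.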

  2. Single-cell mechanics: the parallel plates technique.

    PubMed

    Bufi, Nathalie; Durand-Smet, Pauline; Asnacios, Atef

    2015-01-01

    We describe here the parallel plates technique which enables quantifying single-cell mechanics, either passive (cell deformability) or active (whole-cell traction forces). Based on the bending of glass microplates of calibrated stiffness, it is easy to implement on any microscope, and benefits from protocols and equipment already used in biology labs (coating of glass slides, pipette pullers, micromanipulators, etc.). We first present the principle of the technique, the design and calibration of the microplates, and various surface coatings corresponding to different cell-substrate interactions. Then we detail the specific cell preparation for the assays, and the different mechanical assays that can be carried out. Finally, we discuss the possible technical simplifications and the specificities of each mechanical protocol, as well as the possibility of extending the use of the parallel plates to investigate the mechanics of cell aggregates or tissues. PMID:25640430

  3. Parallel discrete event simulation: A shared memory approach

    NASA Technical Reports Server (NTRS)

    Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.

    1987-01-01

    With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to insure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.

  4. Parallel computation of 2D Morse-Smale complexes.

    PubMed

    Shivashankar, Nithin; Senthilnathan, M; Natarajan, Vijay

    2012-10-01

    The Morse-Smale complex is a useful topological data structure for the analysis and visualization of scalar data. This paper describes an algorithm that processes all mesh elements of the domain in parallel to compute the Morse-Smale complex of large 2D datasets at interactive speeds. We employ a reformulation of the Morse-Smale complex using Forman's Discrete Morse Theory and achieve scalability by computing the discrete gradient using local accesses only. We also introduce a novel approach to merge gradient paths that ensures accurate geometry of the computed complex. We demonstrate that our algorithm performs well on both multicore environments and on massively parallel architectures such as the GPU. PMID:22156106

  5. Parallelization of MRCI based on hole-particle symmetry.

    PubMed

    Suo, Bing; Zhai, Gaohong; Wang, Yubin; Wen, Zhenyi; Hu, Xiangqian; Li, Lemin

    2005-01-15

    The parallel implementation of multireference configuration interaction program based on the hole-particle symmetry is described. The platform to implement the parallelization is an Intel-Architectural cluster consisting of 12 nodes, each of which is equipped with two 2.4-G XEON processors, 3-GB memory, and 36-GB disk, and are connected by a Gigabit Ethernet Switch. The dependence of speedup on molecular symmetries and task granularities is discussed. Test calculations show that the scaling with the number of nodes is about 1.9 (for C1 and Cs), 1.65 (for C2v), and 1.55 (for D2h) when the number of nodes is doubled. The largest calculation performed on this cluster involves 5.6 x 10(8) CSFs. PMID:15538769
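
    The per-doubling scaling factors quoted above translate directly into an efficiency per doubling (factor divided by the ideal value of 2). The sketch below also extrapolates to larger node counts under the assumption, made here for illustration only and not stated in the abstract, that the same factor applies at every doubling. Function names are hypothetical.

```python
import math


def per_doubling_efficiency(factor: float) -> float:
    """Efficiency retained each time the node count doubles (ideal factor is 2)."""
    return factor / 2.0


def extrapolated_speedup(factor: float, nodes: int) -> float:
    """Speedup on `nodes` nodes if the same factor applied at every doubling.

    This constant-factor behaviour is an assumption of the sketch, not a
    result reported in the abstract.
    """
    return factor ** math.log2(nodes)


if __name__ == "__main__":
    for group, factor in [("C1/Cs", 1.9), ("C2v", 1.65), ("D2h", 1.55)]:
        print(group,
              f"efficiency per doubling = {per_doubling_efficiency(factor):.3f},",
              f"extrapolated speedup on 12 nodes = {extrapolated_speedup(factor, 12):.2f}")
```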

  6. A fast ultrasonic simulation tool based on massively parallel implementations

    NASA Astrophysics Data System (ADS)

    Lambert, Jason; Rougeron, Gilles; Lacassagne, Lionel; Chatillon, Sylvain

    2014-02-01

    This paper presents a CIVA optimized ultrasonic inspection simulation tool, which takes advantage of the power of massively parallel architectures: graphical processing units (GPU) and multi-core general purpose processors (GPP). This tool is based on the classical approach used in CIVA: the interaction model is based on Kirchhoff, and the ultrasonic field around the defect is computed by the pencil method. The model has been adapted and parallelized for both architectures. At this stage, the configurations addressed by the tool are: multi- and mono-element probes, planar specimens made of simple isotropic materials, planar rectangular defects or side drilled holes of small diameter. Validations of the model accuracy and performance measurements are presented.

  7. Parallel line analysis: multifunctional software for the biomedical sciences

    NASA Technical Reports Server (NTRS)

    Swank, P. R.; Lewis, M. L.; Damron, K. L.; Morrison, D. R.

    1990-01-01

    An easy to use, interactive FORTRAN program for analyzing the results of parallel line assays is described. The program is menu driven and consists of five major components: data entry, data editing, manual analysis, manual plotting, and automatic analysis and plotting. Data can be entered from the terminal or from previously created data files. The data editing portion of the program is used to inspect and modify data and to statistically identify outliers. The manual analysis component is used to test the assumptions necessary for parallel line assays using analysis of covariance techniques and to determine potency ratios with confidence limits. The manual plotting component provides a graphic display of the data on the terminal screen or on a standard line printer. The automatic portion runs through multiple analyses without operator input. Data may be saved in a special file to expedite input at a future time.
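
    The core computation in a parallel line assay, a common-slope fit followed by a potency ratio, can be sketched as below. This is a generic textbook formulation, not the FORTRAN program described above: it omits the tests of assumptions, outlier screening, and confidence limits, and the data in the usage example are made up for illustration.

```python
def _centered_sums(points):
    """Return mean x, mean y, Sxx and Sxy for a list of (log_dose, response)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    return x_bar, y_bar, sxx, sxy


def log_potency_ratio(standard, test):
    """Fit y = a_i + b*x with one pooled slope b; return (a_test - a_std) / b."""
    xs_bar, ys_bar, sxx_s, sxy_s = _centered_sums(standard)
    xt_bar, yt_bar, sxx_t, sxy_t = _centered_sums(test)
    slope = (sxy_s + sxy_t) / (sxx_s + sxx_t)  # pooled common slope
    a_std = ys_bar - slope * xs_bar
    a_test = yt_bar - slope * xt_bar
    return (a_test - a_std) / slope


if __name__ == "__main__":
    # Made-up data: x is log10(dose), y is the measured response.
    standard = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    test = [(0.0, 1.6), (1.0, 3.6), (2.0, 5.6)]
    m = log_potency_ratio(standard, test)
    print("log10 potency ratio:", m, "potency ratio:", 10 ** m)
```

    The horizontal offset between the two fitted lines is the log potency ratio, which is why the assay depends on the lines being parallel in the first place.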

  8. A Parallel Raycast Algorithm of CSG Models on CM2

    NASA Astrophysics Data System (ADS)

    Pili, Piero

    One of the main problems for CSG model manipulators is the fast visualization of the results. The high computational cost on single-processor architectures makes the CSG scheme useless for the interactive creation of the model. This article deals with a parallel algorithm based on a general purpose SIMD architecture, such as the Connection Machine 2, for the visualization of high quality shaded images of CSG models in nearly real time. The technique is based on a pixel-level parallelization of the Ray Casting algorithm. The Frame Buffer is divided into several regions, and for each of them we determine the reduced CSG model projected on it. Then a processor is assigned to each pixel in the sub-area; it determines the equation of the ray crossing the pixel and, using the Ray Casting technique, the nearest intersection point to the pixel, on which it calculates the illumination model.
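
    The per-pixel work described above can be sketched as follows. This is a simplified illustration under several assumptions that are not from the paper: primitives are spheres only, the view is orthographic along +z, the frame-buffer subdivision and model reduction steps are skipped, the illumination step is omitted, and the one-processor-per-pixel mapping is simulated with an ordinary loop. All names are hypothetical.

```python
import math


def sphere_hits(center, radius, origin, direction):
    """Return [(t_in, t_out)] where the unit-direction ray is inside the sphere."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc <= 0.0:
        return []
    root = math.sqrt(disc)
    return [(-b - root, -b + root)]


def combine(a, b, op):
    """Apply a Boolean op to two interval lists by sampling between endpoints."""
    cuts = sorted({t for iv in a + b for t in iv})
    inside = lambda ivs, t: any(lo < t < hi for lo, hi in ivs)
    out = []
    for lo, hi in zip(cuts, cuts[1:]):
        mid = 0.5 * (lo + hi)
        if op(inside(a, mid), inside(b, mid)):
            if out and out[-1][1] == lo:
                out[-1] = (out[-1][0], hi)  # merge touching spans
            else:
                out.append((lo, hi))
    return out


OPS = {
    "union": lambda p, q: p or q,
    "intersection": lambda p, q: p and q,
    "difference": lambda p, q: p and not q,
}


def classify(node, origin, direction):
    """Recursively evaluate a CSG tree along one ray."""
    if node[0] == "sphere":
        _, center, radius = node
        return sphere_hits(center, radius, origin, direction)
    op, left, right = node
    return combine(classify(left, origin, direction),
                   classify(right, origin, direction), OPS[op])


def cast_pixel(model, px, py, width, height):
    """Work done for one pixel: build its ray and return the nearest hit depth."""
    x = (px + 0.5) / width * 4.0 - 2.0  # orthographic view of [-2, 2]^2
    y = (py + 0.5) / height * 4.0 - 2.0
    spans = classify(model, (x, y, -10.0), (0.0, 0.0, 1.0))
    return spans[0][0] if spans else None


def render(model, width=16, height=16):
    # Each pixel is independent, which is what makes a one-processor-per-pixel
    # SIMD mapping possible; here the "processors" are simulated by a plain loop.
    return [[cast_pixel(model, px, py, width, height) for px in range(width)]
            for py in range(height)]


if __name__ == "__main__":
    model = ("difference", ("sphere", (0.0, 0.0, 0.0), 1.0),
                           ("sphere", (0.0, 0.0, -1.0), 0.8))
    for row in render(model, 8, 8):
        print("".join("#" if depth is not None else "." for depth in row))
```

    The nearest entry point returned by `cast_pixel` is where a shading model would be evaluated; no pixel reads another pixel's result, so the loop in `render` could be distributed without synchronization.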

  9. Reducing neural network training time with parallel processing

    NASA Technical Reports Server (NTRS)

    Rogers, James L., Jr.; Lamarsh, William J., II

    1995-01-01

    Obtaining optimal solutions for engineering design problems is often expensive because the process typically requires numerous iterations involving analysis and optimization programs. Previous research has shown that a near optimum solution can be obtained in less time by simulating a slow, expensive analysis with a fast, inexpensive neural network. A new approach has been developed to further reduce this time. This approach decomposes a large neural network into many smaller neural networks that can be trained in parallel. Guidelines are developed to avoid some of the pitfalls when training smaller neural networks in parallel. These guidelines allow the engineer: to determine the number of nodes on the hidden layer of the smaller neural networks; to choose the initial training weights; and to select a network configuration that will capture the interactions among the smaller neural networks. This paper presents results describing how these guidelines are developed.

  10. On the dimensionally correct kinetic theory of turbulence for parallel propagation

    SciTech Connect

    Gaelzer, R.; Ziebell, L. F.; Yoon, P. H.; Kim, Sunjung

    2015-03-15

    Yoon and Fang [Phys. Plasmas 15, 122312 (2008)] formulated a second-order nonlinear kinetic theory that describes the turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field. Their theory also includes discrete-particle effects, or the effects due to spontaneously emitted thermal fluctuations. However, terms associated with the spontaneous fluctuations in the particle and wave kinetic equations in their theory have the proper dimensionality only for an artificial one-dimensional situation. The present paper extends the analysis and re-derives the dimensionally correct kinetic equations for the three-dimensional case. The new formalism properly describes the effects of spontaneous fluctuations emitted in three-dimensional space, while the collectively emitted turbulence propagates predominantly in directions parallel/anti-parallel to the ambient magnetic field. As a first step, the present investigation focuses on linear wave-particle interaction terms only. A subsequent paper will include the dimensionally correct nonlinear wave-particle interaction terms.

  11. Parallel numerical reservoir simulation: A feasibility study

    SciTech Connect

    Michielse, P.H.

    1994-12-31

    This paper discusses a feasibility study to implement a parallel reservoir simulator on parallel computers. The basis of this study is a reservoir simulator that models an injection-production mechanism. The simulator implements a multigrid solver for the elliptic part of the equations, and uses adaptive local grid refinement to track moving fronts in the reservoir. The parallelization method is based on a domain decomposition method, which assigns the subdomains to the processors. In order to obtain a correct solution, communication across the internal boundaries between the subdomains is required. The implementation of the multigrid method imposes restrictions on the domain decomposition. Furthermore, the adaptive local grid refinement may cause the work load distribution over the processors to be out of balance. Hence, some load balancing technique is required to ensure parallel efficiency. This parallel efficiency is illustrated by experiments on a Convex MetaSeries system.

  12. Implementation and performance of parallel Prolog interpreter

    SciTech Connect

    Wei, S.; Kale, L.V.; Balkrishna, R. (Dept. of Computer Science)

    1988-01-01

    In this paper, the authors discuss the implementation of a parallel Prolog interpreter on different parallel machines. The implementation is based on the REDUCE-OR process model, which exploits both AND and OR parallelism in logic programs. It is machine independent, as it runs on top of the chare-kernel, a machine-independent parallel programming system. The authors also give the performance of the interpreter running a diverse set of benchmark programs on parallel machines, including shared memory systems (an Alliant FX/8, a Sequent, and a MultiMax) and a non-shared memory system (an Intel iPSC/32 hypercube), in addition to its performance on a multiprocessor simulation system.

  13. Parallel tempering for the traveling salesman problem

    SciTech Connect

    Percus, Allon; Wang, Richard; Hyman, Jeffrey; Caflisch, Russel

    2008-01-01

    We explore the potential of parallel tempering as a combinatorial optimization method, applying it to the traveling salesman problem. We compare simulation results of parallel tempering with a benchmark implementation of simulated annealing, and study how different choices of parameters affect the relative performance of the two methods. We find that a straightforward implementation of parallel tempering can outperform simulated annealing in several crucial respects. When parameters are chosen appropriately, both methods yield close approximation to the actual minimum distance for an instance with 200 nodes. However, parallel tempering yields more consistently accurate results when a series of independent simulations are performed. Our results suggest that parallel tempering might offer a simple but powerful alternative to simulated annealing for combinatorial optimization problems.
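
    A minimal Python sketch of parallel tempering applied to a small traveling salesman instance follows. It is not the authors' implementation: the segment-reversal (2-opt style) move, the temperature ladder, the sweep count, and all function names are assumptions chosen for illustration. Each replica runs a Metropolis step at its own temperature, and neighboring replicas then attempt to exchange tours with probability min(1, exp((1/T_k - 1/T_{k+1})(E_k - E_{k+1}))).

```python
import math
import random

def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))

def reverse_segment(tour, rng):
    """Propose a neighbor tour by reversing a random segment."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def parallel_tempering(pts, temps, sweeps, seed=0):
    """temps must be positive and sorted from coldest to hottest."""
    rng = random.Random(seed)
    n = len(pts)
    replicas = [rng.sample(range(n), n) for _ in temps]
    energies = [tour_length(t, pts) for t in replicas]
    best_len, best_tour = min(zip(energies, replicas))
    for _ in range(sweeps):
        # Metropolis update of every replica at its own temperature.
        for k, temp in enumerate(temps):
            cand = reverse_segment(replicas[k], rng)
            e = tour_length(cand, pts)
            if e <= energies[k] or rng.random() < math.exp((energies[k] - e) / temp):
                replicas[k], energies[k] = cand, e
                if e < best_len:
                    best_len, best_tour = e, cand
        # Replica exchange between neighboring temperatures.
        for k in range(len(temps) - 1):
            delta = (1 / temps[k] - 1 / temps[k + 1]) * (energies[k] - energies[k + 1])
            if delta >= 0 or rng.random() < math.exp(delta):
                replicas[k], replicas[k + 1] = replicas[k + 1], replicas[k]
                energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return best_tour, best_len
```

    The replicas are independent between exchange steps, so in a parallel setting each could be advanced on its own processor, with communication needed only for the swaps.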

  14. Configuration space representation in parallel coordinates

    NASA Technical Reports Server (NTRS)

    Fiorini, Paolo; Inselberg, Alfred

    1989-01-01

    By means of a system of parallel coordinates, a nonprojective mapping from R^N to R^2 is obtained for any positive integer N. In this way multivariate data and relations can be represented in the Euclidean plane (embedded in the projective plane). Basically, R^2 with Cartesian coordinates is augmented by N parallel axes, one for each variable. The N joint variables of a robotic device can be represented graphically by using parallel coordinates. It is pointed out that some properties of the relation are better perceived visually from the parallel coordinate representation, and that new algorithms and data structures can be obtained from this representation. The main features of parallel coordinates are described, and an example is presented of their use for configuration space representation of a mechanical arm (where Cartesian coordinates cannot be used).
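
    The mapping can be illustrated with a short Python sketch. The function names and the uniform axis spacing are assumptions made for this example, not notation from the paper. An N-dimensional point becomes a polyline with one vertex per parallel axis. The sketch also checks a well-known property of the representation: the points of a line x2 = m*x1 + b (with m != 1) map to segments that all pass through one point in the plane, at (d/(1 - m), b/(1 - m)) for axis spacing d.

```python
def to_polyline(point, spacing=1.0):
    """Vertices of the polyline representing an N-dimensional point.

    Axis i is the vertical line x = i * spacing; the i-th coordinate of the
    point is plotted as the height on that axis.
    """
    return [(i * spacing, x) for i, x in enumerate(point)]

def line_dual_point(m, b, spacing=1.0):
    """Common intersection of the segments representing points on x2 = m*x1 + b."""
    if m == 1:
        # The segments are parallel; the dual is an ideal point of the projective plane.
        raise ValueError("slope 1 has no finite dual point")
    return (spacing / (1 - m), b / (1 - m))

def segment_height_at(p0, p1, x):
    """Height at abscissa x of the straight line through p0 and p1."""
    (x0, y0), (x1, y1) = p0, p1
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```

    For instance, `to_polyline([3.0, -1.0, 2.5])` gives the three vertices of the polyline for a point with three coordinates, such as the joint variables of a three-joint arm.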

  15. National Combustion Code: Parallel Implementation and Performance

    NASA Technical Reports Server (NTRS)

    Quealy, A.; Ryder, R.; Norris, A.; Liu, N.-S.

    2000-01-01

    The National Combustion Code (NCC) is being developed by an industry-government team for the design and analysis of combustion systems. CORSAIR-CCD is the current baseline reacting flow solver for NCC. This is a parallel, unstructured grid code which uses a distributed memory, message passing model for its parallel implementation. The focus of the present effort has been to improve the performance of the NCC flow solver to meet combustor designer requirements for model accuracy and analysis turnaround time. Improving the performance of this code contributes significantly to the overall reduction in time and cost of the combustor design cycle. This paper describes the parallel implementation of the NCC flow solver and summarizes its current parallel performance on an SGI Origin 2000. Earlier parallel performance results on an IBM SP-2 are also included. The performance improvements which have enabled a turnaround of less than 15 hours for a 1.3 million element fully reacting combustion simulation are described.

  16. Differences Between Distributed and Parallel Systems

    SciTech Connect

    Brightwell, R.; Maccabe, A.B.; Rissen, R.

    1998-10-01

    Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.

  17. Broadcasting a message in a parallel computer

    DOEpatents

    Berg, Jeremy E. (Rochester, MN); Faraj, Ahmad A. (Rochester, MN)

    2011-08-02

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point-to-point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
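
    The idea of broadcasting along a Hamiltonian path can be sketched in Python for one plane of a mesh network. This is a simplified illustration and not the patented method: it assumes a rectangular plane with the logical root at corner (0, 0), uses a row-by-row "snake" ordering as the Hamiltonian path, and all function names are hypothetical. Every hop in the path is between adjacent nodes, so each transfer is a point-to-point message between neighbors.

```python
def snake_path(rows, cols):
    """Hamiltonian path over a rows x cols plane, starting at node (0, 0).

    Even rows are traversed left to right and odd rows right to left, so
    consecutive nodes in the path are always mesh neighbors.
    """
    path = []
    for r in range(rows):
        columns = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in columns)
    return path

def broadcast_along_path(path, message):
    """Forward the root's message hop by hop; returns what each node received."""
    received = {path[0]: message}  # path[0] is the logical root
    for sender, receiver in zip(path, path[1:]):
        hop = abs(sender[0] - receiver[0]) + abs(sender[1] - receiver[1])
        if hop != 1:
            raise ValueError("path uses a link that does not exist in the mesh")
        received[receiver] = received[sender]
    return received
```

    In this sketch the message reaches the last node after rows * cols - 1 hops; a real implementation could pipeline packets along the path so that successive links are busy at the same time.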

  18. Parallel hypergraph partitioning for scientific computing.

    SciTech Connect

    Heaphy, Robert; Devine, Karen Dragon; Catalyurek, Umit; Bisseling, Robert; Hendrickson, Bruce Alan; Boman, Erik Gunnar

    2005-07-01

    Graph partitioning is often used for load balancing in parallel computing, but it is known that hypergraph partitioning has several advantages. First, hypergraphs more accurately model communication volume, and second, they are more expressive and can better represent nonsymmetric problems. Hypergraph partitioning is particularly suited to parallel sparse matrix-vector multiplication, a common kernel in scientific computing. We present a parallel software package for hypergraph (and sparse matrix) partitioning developed at Sandia National Labs. The algorithm is a variation on multilevel partitioning. Our parallel implementation is novel in that it uses a two-dimensional data distribution among processors. We present empirical results that show our parallel implementation achieves good speedup on several large problems (up to 33 million nonzeros) with up to 64 processors on a Linux cluster.
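
    The claim that hypergraphs model communication volume more accurately can be illustrated with a small Python sketch. It is not taken from the Sandia package: the function names and the toy partition are assumptions. It uses the standard "connectivity minus one" metric, in which a net (hyperedge) spanning lambda parts contributes lambda - 1 units of communication, and compares it with the edge cut of the graph obtained by replacing every net with a clique.

```python
from itertools import combinations

def connectivity_minus_one(nets, part_of):
    """Sum over nets of (number of parts the net touches) - 1."""
    return sum(len({part_of[v] for v in net}) - 1 for net in nets)

def clique_edge_cut(nets, part_of):
    """Edge cut of the graph in which each net is expanded into a clique."""
    edges = {frozenset(pair) for net in nets for pair in combinations(net, 2)}
    return sum(1 for edge in edges if len({part_of[v] for v in edge}) == 2)

# Toy example: four vertices, two nets, two parts.
nets = [[0, 1, 2], [2, 3]]
part_of = {0: 0, 1: 1, 2: 1, 3: 0}
```

    With this partition the hypergraph metric is 2, because each net spans two parts and its data must be sent across the boundary once, while the clique-expanded graph reports an edge cut of 3 because the three-vertex net is counted twice.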

  19. Conservation of writhe helicity under anti-parallel reconnection

    NASA Astrophysics Data System (ADS)

    Laing, Christian E.; Ricca, Renzo L.; Sumners, De Witt L.

    2015-03-01

    Reconnection is a fundamental event in many areas of science, from the interaction of vortices in classical and quantum fluids, and magnetic flux tubes in magnetohydrodynamics and plasma physics, to the recombination in polymer physics and DNA biology. By using fundamental results in topological fluid mechanics, the helicity of a flux tube can be calculated in terms of writhe and twist contributions. Here we show that the writhe is conserved under anti-parallel reconnection. Hence, for a pair of interacting flux tubes of equal flux, if the twist of the reconnected tube is the sum of the original twists of the interacting tubes, then helicity is conserved during reconnection. Thus, any deviation from helicity conservation is entirely due to the intrinsic twist inserted or deleted locally at the reconnection site. This result has important implications for helicity and energy considerations in various physical contexts.

  20. Virtual reality visualization of parallel molecular dynamics simulation

    SciTech Connect

    Disz, T.; Papka, M.; Stevens, R.; Pellegrino, M.; Taylor, V.

    1995-12-31

    When performing communications mapping experiments for massively parallel processors, it is important to be able to visualize the mappings and resulting communications. In a molecular dynamics model, visualization of the atom-to-atom interaction and the processor mappings provides insight into the effectiveness of the communications algorithms. The basic quantities available for visualization in a model of this type are the number of molecules per unit volume, the mass, and velocity of each molecule. The computational information available for visualization is the atom-to-atom interaction within each time step, the atom-to-processor mapping, and the energy rescaling events. We use the CAVE (CAVE Automatic Virtual Environment) to provide interactive, immersive visualization experiences.

  1. Conservation of writhe helicity under anti-parallel reconnection

    PubMed Central

    Laing, Christian E.; Ricca, Renzo L.; Sumners, De Witt L.

    2015-01-01

    Reconnection is a fundamental event in many areas of science, from the interaction of vortices in classical and quantum fluids, and magnetic flux tubes in magnetohydrodynamics and plasma physics, to the recombination in polymer physics and DNA biology. By using fundamental results in topological fluid mechanics, the helicity of a flux tube can be calculated in terms of writhe and twist contributions. Here we show that the writhe is conserved under anti-parallel reconnection. Hence, for a pair of interacting flux tubes of equal flux, if the twist of the reconnected tube is the sum of the original twists of the interacting tubes, then helicity is conserved during reconnection. Thus, any deviation from helicity conservation is entirely due to the intrinsic twist inserted or deleted locally at the reconnection site. This result has important implications for helicity and energy considerations in various physical contexts. PMID:25820408

  2. Codes for the Modelling of Stellar Structures: Parallel Implementations on a Workstation LAN

    NASA Astrophysics Data System (ADS)

    Pucillo, M.; Bono, G.; Mazzali, P. A.; Pasian, F.; Smareglia, R.

    In the past couple of years, modelling of stellar structures has become increasingly important at our Institute. The codes used for this kind of computations are very demanding in terms of CPU power, and most of them are suited to parallel processing. Purchasing parallel hardware can be very expensive for our medium-sized institute but, on the other hand, a LAN of fairly powerful workstations, normally used for interactive data reduction and analysis is available. In this paper, we discuss practical examples of porting stable and well-known codes for stellar structures modelling in a parallel environment, given by the workstation LAN and by a software for distributed processing.

  3. Parallelism extraction and program restructuring for parallel simulation of digital systems

    SciTech Connect

    Vellandi, B.L.

    1990-01-01

    Two topics currently of interest to the computer-aided design (CAD) community for very-large-scale integrated (VLSI) circuits are using the VHSIC Hardware Description Language (VHDL) effectively and decreasing simulation times of VLSI designs through parallel execution of the simulator. The goal of this research is to increase the degree of parallelism obtainable in VHDL simulation, and consequently to decrease simulation times. The research targets simulation on massively parallel architectures. Experimentation and instrumentation were done on the SIMD Connection Machine. The author discusses the method she used to extract parallelism and restructure a VHDL program, experimental results using this method, and requirements for a parallel architecture for fast simulation.

  4. Parallel systems in the control of speech.

    PubMed

    Simmonds, Anna J; Wise, Richard J S; Collins, Catherine; Redjep, Ozlem; Sharp, David J; Iverson, Paul; Leech, Robert

    2014-05-01

    Modern neuroimaging techniques have advanced our understanding of the distributed anatomy of speech production, beyond that inferred from clinico-pathological correlations. However, much remains unknown about functional interactions between anatomically distinct components of this speech production network. One reason for this is the need to separate spatially overlapping neural signals supporting diverse cortical functions. We took three separate human functional magnetic resonance imaging (fMRI) datasets (two speech production, one "rest"). In each we decomposed the neural activity within the left posterior perisylvian speech region into discrete components. This decomposition robustly identified two overlapping spatio-temporal components, one centered on the left posterior superior temporal gyrus (pSTG), the other on the adjacent ventral anterior parietal lobe (vAPL). The pSTG was functionally connected with bilateral superior temporal and inferior frontal regions, whereas the vAPL was connected with other parietal regions, lateral and medial. Surprisingly, the components displayed spatial anti-correlation, in which the negative functional connectivity of each component overlapped with the other component's positive functional connectivity, suggesting that these two systems operate separately and possibly in competition. The speech tasks reliably modulated activity in both pSTG and vAPL suggesting they are involved in speech production, but their activity patterns dissociate in response to different speech demands. These components were also identified in subjects at "rest" and not engaged in overt speech production. These findings indicate that the neural architecture underlying speech production involves parallel distinct components that converge within posterior peri-sylvian cortex, explaining, in part, why this region is so important for speech production. PMID:23723184

  5. Performance of the Galley Parallel File System

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    As the input/output (I/O) needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance I/O to applications that access data in patterns that have been observed to be common.

  6. Applications of Parallel Processing in Configuration Analyses

    NASA Technical Reports Server (NTRS)

    Sundaram, Ppchuraman; Hager, James O.; Biedron, Robert T.

    1999-01-01

    The paper presents the recent progress made towards developing an efficient and user-friendly parallel environment for routine analysis of large CFD problems. The coarse-grain parallel version of the CFL3D Euler/Navier-Stokes analysis code, CFL3Dhp, has been ported onto most available parallel platforms. The CFL3Dhp solution accuracy on these parallel platforms has been verified with the CFL3D sequential analyses. User-friendly pre- and post-processing tools that enable a seamless transfer from sequential to parallel processing have been written. A static load balancing tool for CFL3Dhp analysis has also been implemented to achieve good parallel efficiency. For large problems, load balancing efficiency as high as 95% can be achieved even when a large number of processors is used. Linear scalability of the CFL3Dhp code with increasing number of processors has also been shown using a large installed transonic nozzle boattail analysis. To highlight the fast turn-around time of parallel processing, the TCA full configuration in sideslip Navier-Stokes drag polar at supersonic cruise has been obtained in a day. CFL3Dhp is currently being used as a production analysis tool.

  7. Parallel computation of manipulator inverse dynamics

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Bejczy, Antal K.

    1991-01-01

    In this article, parallel computation of manipulator inverse dynamics is investigated. A hierarchical graph-based mapping approach is devised to analyze the inherent parallelism in the Newton-Euler formulation at several computational levels, and to derive the features of an abstract architecture for exploitation of parallelism. At each level, a parallel algorithm represents the application of a parallel model of computation that transforms the computation into a graph whose structure defines the features of an abstract architecture, i.e., number of processors, communication structure, etc. Data-flow analysis is employed to derive the time lower bound in the computation as well as the sequencing of the abstract architecture. The features of the target architecture are defined by optimization of the abstract architecture to exploit maximum parallelism while minimizing architectural complexity. An architecture is designed and implemented that is capable of efficient exploitation of parallelism at several computational levels. The computation time of the Newton-Euler formulation for a 6-degree-of-freedom (dof) general manipulator is measured as 187 microsec. The increase in computation time for each additional dof is 23 microsec, which leads to a computation time of less than 500 microsec, even for a 12-dof redundant arm.
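
    The timing figures quoted above can be checked with a few lines of Python. The linear extrapolation from the 6-dof measurement is the reading implied by the abstract; the function and constant names are hypothetical.

```python
BASE_DOF = 6          # degrees of freedom of the measured manipulator
BASE_TIME_US = 187    # measured Newton-Euler computation time, microseconds
PER_DOF_US = 23       # reported increase per additional degree of freedom

def newton_euler_time_us(dof):
    """Estimated computation time in microseconds for dof >= 6."""
    if dof < BASE_DOF:
        raise ValueError("figures are only reported for 6 or more dof")
    return BASE_TIME_US + PER_DOF_US * (dof - BASE_DOF)
```

    For a 12-dof arm this gives 187 + 6 * 23 = 325 microsec, consistent with the stated bound of less than 500 microsec.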

  8. Xyce parallel electronic simulator : users' guide.

    SciTech Connect

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

  9. High-Throughput parallel blind Virtual Screening using BINDSURF

    PubMed Central

    2012-01-01

    Background: Virtual Screening (VS) methods can considerably aid clinical research, predicting how ligands interact with drug targets. Most VS methods assume a unique binding site for the target, usually derived from the interpretation of the protein crystal structure. However, it has been demonstrated that in many cases, diverse ligands interact with unrelated parts of the target, and many VS methods do not take this relevant fact into account. Results: We present BINDSURF, a novel VS methodology that scans the whole protein surface in order to find new hotspots where ligands might interact, and which is implemented on latest-generation massively parallel GPU hardware, allowing fast processing of large ligand databases. Conclusions: BINDSURF is an efficient and fast blind methodology for the determination of protein binding sites depending on the ligand, that uses the massively parallel architecture of GPUs for fast pre-screening of large ligand databases. Its results can also guide subsequent application of more detailed VS methods to specific binding sites of proteins, and its utilization can aid in drug discovery, design, repurposing and therefore help considerably in clinical research. PMID:23095663

  10. Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

    2000-01-01

    The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.

  11. Parallel processing in power systems computation

    SciTech Connect

    Tylavsky, D.J.; Bose, A.; Alvarado, F.; Betancourt, R.; Clements, K.; Heydt, G.T.; Huang, G.; Ilic, M.; La Scala, M.; Pai, M.A.

    1992-05-01

    The availability of parallel processing hardware and software presents an opportunity and a challenge to apply this new computation technology to solve power system problems. The allure of parallel processing is that this technology has the potential to be cost effectively used on computationally intense problems. The objective of this paper is to define the state of the art and identify what the authors see to be the most fertile grounds for future research in parallel processing as applied to power system computation. As always, such projections are risky in a fast changing field, but the authors hope that this paper will be useful to the researchers and practitioners in this growing area.

  12. Structural mechanics computations on parallel computing platforms

    SciTech Connect

    Kulak, R.F.; Plaskacz, E.J.; Pfeiffer, P.A.

    1995-06-01

    With recent advances in parallel supercomputers and network-connected workstations, the solution to large scale structural engineering problems has now become tractable. High-performance computer architectures, which are usually available at large universities and national laboratories, now can solve large nonlinear problems. At the other end of the spectrum, network connected workstations can be configured to become a distributed-parallel computer. This approach is attractive to small, medium and large engineering firms. This paper describes the development of a parallelized finite element computer program for the solution of static, nonlinear structural mechanics problems.

  13. Parallel Climate Analysis Toolkit (ParCAT)

    Energy Science and Technology Software Center (ESTSC)

    2013-06-30

    The parallel analysis toolkit (ParCAT) provides parallel statistical processing of large climate model simulation datasets. ParCAT provides parallel point-wise average calculations, frequency distributions, sums/differences of two datasets, and difference-of-average and average-of-difference for two datasets for arbitrary subsets of simulation time. ParCAT is a command-line utility that can be easily integrated in scripts or embedded in other applications. ParCAT supports CMIP5 post-processed datasets as well as non-CMIP5 post-processed datasets. ParCAT reads and writes standard netCDF files.

  14. Parallel path aspects of transmission modeling

    SciTech Connect

    Kavicky, J.A.; Shahidehpour, S.M.

    1996-11-01

    This paper examines the present methods and modeling techniques available to address the effects of parallel flows resulting from various firm and short-term energy transactions. A survey of significant methodologies is conducted to determine the present status of parallel flow transaction modeling. The strengths and weaknesses of these approaches are identified to suggest areas of further modeling improvements. The motivating force behind this research is to improve transfer capability assessment accuracy by suggesting a real-time modeling environment that adequately represents the influences of parallel flows while recognizing operational constraints and objectives.

  15. Language constructs for modular parallel programs

    SciTech Connect

    Foster, I.

    1996-03-01

    We describe programming language constructs that facilitate the application of modular design techniques in parallel programming. These constructs allow us to isolate resource management and processor scheduling decisions from the specification of individual modules, which can themselves encapsulate design decisions concerned with concurrency, communication, process mapping, and data distribution. This approach permits development of libraries of reusable parallel program components and the reuse of these components in different contexts. In particular, alternative mapping strategies can be explored without modifying other aspects of program logic. We describe how these constructs are incorporated in two practical parallel programming languages, PCN and Fortran M. Compilers have been developed for both languages, allowing experimentation in substantial applications.

  16. Parallelization of the Implicit RPLUS Algorithm

    NASA Technical Reports Server (NTRS)

    Orkwis, Paul D.

    1994-01-01

    The multiblock reacting Navier-Stokes flow-solver RPLUS2D was modified for parallel implementation. Results for non-reacting flow calculations of this code indicate parallelization efficiencies greater than 84% are possible for a typical test problem. Results tend to improve as the size of the problem increases. The convergence rate of the scheme is degraded slightly when additional artificial block boundaries are included for the purpose of parallelization. However, this degradation virtually disappears if the solution is converged near to machine zero. Recommendations are made for further code improvements to increase efficiency, correct bugs in the original version, and study decomposition effectiveness.
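    The efficiency figure quoted above can be read with the usual definition of parallel efficiency, speedup divided by processor count. The sketch below assumes that definition; the abstract does not state which one the author used, and the timings are made-up illustrative values, not results from RPLUS2D.

```python
def parallel_efficiency(serial_time, n_procs, parallel_time):
    """Return speedup / n_procs, where speedup = serial_time / parallel_time."""
    if n_procs < 1 or serial_time <= 0 or parallel_time <= 0:
        raise ValueError("times must be positive and n_procs at least 1")
    speedup = serial_time / parallel_time
    return speedup / n_procs


# Hypothetical timings: 100 s on one processor, 29 s on four processors.
efficiency = parallel_efficiency(100.0, 4, 29.0)
```

    With these hypothetical numbers the speedup is about 3.45 on four processors, an efficiency of roughly 86%.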

  17. Parallelization of the Implicit RPLUS Algorithm

    NASA Technical Reports Server (NTRS)

    Orkwis, Paul D.

    1997-01-01

    The multiblock reacting Navier-Stokes flow solver RPLUS2D was modified for parallel implementation. Results for non-reacting flow calculations of this code indicate parallelization efficiencies greater than 84% are possible for a typical test problem. Results tend to improve as the size of the problem increases. The convergence rate of the scheme is degraded slightly when additional artificial block boundaries are included for the purpose of parallelization. However, this degradation virtually disappears if the solution is converged near to machine zero. Recommendations are made for further code improvements to increase efficiency, correct bugs in the original version, and study decomposition effectiveness.

  18. Distributed parallel messaging for multiprocessor systems

    DOEpatents

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  19. Knowledge representation into Ada parallel processing

    NASA Technical Reports Server (NTRS)

    Masotto, Tom; Babikyan, Carol; Harper, Richard

    1990-01-01

    The Knowledge Representation into Ada Parallel Processing project is a joint NASA and Air Force funded project to demonstrate the execution of intelligent systems in Ada on the Charles Stark Draper Laboratory fault-tolerant parallel processor (FTPP). Two applications were demonstrated: a portion of the adaptive tactical navigator and a real-time controller. Both systems are implemented as Activation Framework Objects on the Activation Framework intelligent scheduling mechanism developed by Worcester Polytechnic Institute. The implementations, the results of performance analyses showing speedup due to parallelism, and initial efficiency improvements are detailed, and further areas for performance improvement are suggested.

  20. Parallel processor programs in the Federal Government

    NASA Technical Reports Server (NTRS)

    Schneck, P. B.; Austin, D.; Squires, S. L.; Lehmann, J.; Mizell, D.; Wallgren, K.

    1985-01-01

    In 1982, a report dealing with the nation's research needs in high-speed computing called for increased access to supercomputing resources for the research community, research in computational mathematics, and increased research in the technology base needed for the next generation of supercomputers. Since that time a number of programs addressing future generations of computers, particularly parallel processors, have been started by U.S. government agencies. The present paper provides a description of the largest government programs in parallel processing. Established in fiscal year 1985 by the Institute for Defense Analyses for the National Security Agency, the Supercomputing Research Center will pursue research to advance the state of the art in supercomputing. Attention is also given to the DOE applied mathematical sciences research program, the NYU Ultracomputer project, the DARPA multiprocessor system architectures program, NSF research on multiprocessor systems, ONR activities in parallel computing, and NASA parallel processor projects.