Science.gov

Sample records for parallel blade-vortex interaction

  1. An experimental investigation of the parallel blade-vortex interaction

    NASA Technical Reports Server (NTRS)

    Caradonna, F. X.; Laub, G. H.; Tung, C.

    1984-01-01

    A scheme for investigating the parallel blade vortex interaction (BVI) has been designed and tested. The scheme involves setting a vortex generator upstream of a nonlifting rotor so that the vortex interacts with the blade at the forward azimuth. The method has revealed two propagation mechanisms: a type C shock propagation from the leading edge induced by the vortex at high tip speeds, and a rapid but continuous pressure pulse associated with the proximity of the vortex to the leading edge. The latter is thought to be the more important source. The effects of Mach number and vortex proximity are discussed.

  2. Rotor blade vortex interaction noise

    NASA Astrophysics Data System (ADS)

    Yu, Yung H.

    2000-02-01

    Blade-vortex interaction noise generated by helicopter main rotor blades is one of the most severe noise problems and is very important both in military applications and in community acceptance of rotorcraft. Research over the decades has substantially improved physical understanding of noise-generating mechanisms, and various design concepts have been investigated to control noise radiation using advanced blade planform shapes and active blade control techniques. The important parameters for controlling rotor blade-vortex interaction noise and vibration have been identified: blade tip vortex structure and trajectory, blade aeroelastic deformation, and airloads. Several blade tip design concepts have been investigated for diffusing tip vortices and also for reducing noise. However, these tip shapes have not been able to substantially reduce blade-vortex interaction noise without degradation of rotor performance. Meanwhile, blade root control techniques, such as higher-harmonic pitch control (HHC) and individual blade control (IBC) concepts, have been extensively investigated for noise and vibration reduction. The HHC technique has demonstrated substantial blade-vortex interaction noise reduction, up to 6 dB, although vibration and low-frequency noise increased. Tests with IBC techniques have shown simultaneous reduction of rotor noise and vibratory loads with 2/rev pitch control inputs. Recently, active blade control concepts with smart structures have been investigated, with emphasis on active blade twist and trailing edge flaps. Smart structures technologies are very promising, but further advancements are needed to meet all the requirements of rotorcraft applications in frequency, force, and displacement.
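
    To put the quoted "up to 6 dB" reduction in perspective, the sketch below converts a level change in decibels into the corresponding ratios of sound-pressure amplitude and of acoustic energy. It uses only the standard decibel definitions; the function names are illustrative and do not come from the record above.

```python
def pressure_ratio_from_db(delta_db: float) -> float:
    """Ratio of sound-pressure amplitudes for a level change of delta_db decibels."""
    return 10.0 ** (delta_db / 20.0)


def energy_ratio_from_db(delta_db: float) -> float:
    """Ratio of acoustic energy (mean-square pressure) for the same level change."""
    return 10.0 ** (delta_db / 10.0)


if __name__ == "__main__":
    change = -6.0  # a 6 dB reduction, the upper figure quoted for HHC
    print(f"pressure amplitude ratio: {pressure_ratio_from_db(change):.3f}")
    print(f"acoustic energy ratio:    {energy_ratio_from_db(change):.3f}")
```

    A 6 dB reduction therefore corresponds to roughly halving the pressure amplitude, or cutting the acoustic energy to about a quarter.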

  3. Rotorcraft Blade-Vortex Interaction Controller

    NASA Technical Reports Server (NTRS)

    Schmitz, Fredric H. (Inventor)

    1995-01-01

    Blade-vortex interaction noises, sometimes referred to as 'blade slap', are avoided by increasing the absolute value of inflow to the rotor system of a rotorcraft. This is accomplished by creating a drag force which causes the angle of the tip-path plane of the rotor system to become more negative or more positive.
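
    As a rough illustration of how the tip-path-plane angle changes rotor inflow, the sketch below solves the classical momentum-theory inflow equation for forward flight, lambda = mu*tan(alpha) + C_T / (2*sqrt(mu^2 + lambda^2)), with Newton's method. This is a textbook relation, not the method of the patent; the sign convention (alpha positive with the disk tilted forward, lambda positive for flow down through the disk) and all numerical values are assumptions chosen for illustration. Momentum theory is unreliable when the net inflow is close to zero in descent, so the results are indicative only.

```python
import math


def inflow_ratio(mu: float, alpha_tpp_deg: float, c_t: float, iterations: int = 50) -> float:
    """Momentum-theory inflow ratio for advance ratio mu, tip-path-plane
    angle alpha_tpp_deg (degrees, positive disk-forward) and thrust
    coefficient c_t, found by Newton iteration on the inflow equation."""
    tan_a = math.tan(math.radians(alpha_tpp_deg))
    lam = math.sqrt(c_t / 2.0)  # hover inflow ratio as the starting guess
    for _ in range(iterations):
        denom = math.sqrt(mu * mu + lam * lam)
        residual = lam - mu * tan_a - c_t / (2.0 * denom)
        slope = 1.0 + c_t * lam / (2.0 * denom ** 3)
        lam -= residual / slope
    return lam


if __name__ == "__main__":
    # Illustrative values only: mu = 0.15, C_T = 0.006.
    for alpha in (-6.0, -2.0, 2.0, 6.0):
        lam = inflow_ratio(0.15, alpha, 0.006)
        print(f"alpha = {alpha:+.1f} deg -> inflow ratio = {lam:.4f}")
```

    Under these assumptions, tilting the disk further forward raises the inflow ratio, while tilting it back drives the inflow towards zero and then negative, which is the regime in which wake vortices stay close to the rotor.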

  4. An analysis of blade vortex interaction aerodynamics and acoustics

    NASA Technical Reports Server (NTRS)

    Lee, D. J.

    1985-01-01

    The impulsive noise associated with helicopter flight due to blade-vortex interaction, sometimes called blade slap, is analyzed, especially for the case of a close encounter of the blade-tip vortex with a following blade. Three parts of the phenomenon are considered: the tip-vortex structure generated by the rotating blade, the unsteady pressure produced on the following blade during the interaction, and the acoustic radiation due to the unsteady pressure field. To simplify the problem, the analysis was confined to the situation where the vortex is aligned parallel to the blade span, in which case the maximum acoustic pressure results. Acoustic radiation due to the interaction is analyzed in space-fixed coordinates and in the time domain, with the unsteady pressure on the blade surface as the source of chordwise compact but spanwise non-compact radiation. Maximum acoustic pressure is related to the vortex core size and Reynolds number, which are in turn functions of the blade-tip aerodynamic parameters. Finally, noise reduction and performance are considered.

  5. Rotating hot-wire investigation of the vortex responsible for blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Fontana, Richard Remo

    1988-01-01

    The distribution of the circumferential velocity of the vortex responsible for blade-vortex interaction noise was measured using a rotating hot-wire rake synchronously meshed with a model helicopter rotor at the blade passage frequency. Simultaneous far-field acoustic data and blade differential pressure measurements were obtained. Results show that the shape of the measured far-field acoustic blade-vortex interaction signature depends on the blade-vortex interaction geometry. The experimental results are compared with the Widnall-Wolf model for blade-vortex interaction noise.

  6. Rotor blade system with reduced blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Leishman, John G. (Inventor); Han, Yong Oun (Inventor)

    2005-01-01

    A rotor blade system with reduced blade-vortex interaction noise includes a plurality of tube members embedded in proximity to a tip of each rotor blade. The inlets of the tube members are arrayed at the leading edge of the blade slightly above the chord plane, while the outlets are arrayed at the blade tip face. Such a design rapidly diffuses the vorticity contained within the concentrated tip vortex because of enhanced flow mixing in the inner core, which prevents the development of a laminar core region.

  7. A Novel Method for Reducing Rotor Blade-Vortex Interaction

    NASA Technical Reports Server (NTRS)

    Glinka, A. T.

    2000-01-01

    One of the major hindrances to expansion of the rotorcraft market is the high-amplitude noise they produce, especially during low-speed descent, where blade-vortex interactions frequently occur. In an attempt to reduce the noise levels caused by blade-vortex interactions, the flip-tip rotor blade concept was devised. The flip-tip rotor increases the miss distance between the shed vortices and the rotor blades, reducing BVI noise. The distance is increased by rotating an outboard portion of the rotor tip either up or down depending on the flight condition. The proposed plan for the grant consisted of a computational simulation of the rotor aerodynamics and its wake geometry to determine the effectiveness of the concept, coupled with a series of wind tunnel experiments exploring the value of the device and validating the computer model. The computational model did in fact show that the miss distance could be increased, giving a measure of the effectiveness of the flip-tip rotor. However, the wind tunnel experiments could not be conducted. Increased outside demand for the 7' x 10' wind tunnel at NASA Ames and low priority at Ames for this project forced numerous postponements of the tests, eventually pushing the tests beyond the life of the grant. A design for the rotor blades to be tested in the wind tunnel was completed, and an analysis of the strength of the model blades based on predicted loads, including dynamic forces, was done.

  8. A parametric study of transonic blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Lyrintzis, A. S.

    1991-01-01

    Several parameters of transonic blade-vortex interactions (BVI) are being studied, and some ideas for noise reduction are introduced and tested using numerical simulation. The model used is the two-dimensional high frequency transonic small disturbance equation with regions of distributed vorticity (VTRAN2 code). The far-field noise signals are obtained by using the Kirchhoff method, which extends the numerical 2-D near-field aerodynamic results to the linear acoustic 3-D far-field. The BVI noise mechanisms are explained, and the effects of vortex type and strength and of angle of attack are studied. In particular, airfoil shape modifications which lead to noise reduction are investigated. The results presented are expected to be helpful for better understanding of the nature of BVI noise and for better blade design.

  9. Helicopter tail rotor blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    George, Albert R.; Chou, S.-T.

    1987-01-01

    A study is made of helicopter tail rotor noise, particularly that due to the interactions with main rotor tip vortices. Summarized here are the present analysis, the computer codes, and the results of several test cases. Amiet's unsteady thin airfoil theory is used to calculate the acoustics of blade-vortex interaction. The noise source is modelled as a force dipole resulting from an airfoil of infinite span chopping through a skewed line vortex. To analyze the interactions between helicopter tail rotor and main rotor tip vortices, we developed a two-step approach: (1) the main rotor tip vortex system is obtained through a free wake geometry calculation of the main rotor using CAMRAD code; (2) acoustic analysis takes the results from the aerodynamic interaction analysis and calculates the farfield pressure signatures for the interactions. It is found that under a wide range of helicopter flight conditions, acoustic pressure fluctuations of significant magnitude can be generated by tail rotors due to a series of interactions with main rotor tip vortices. This noise mechanism depends strongly on the helicopter flight conditions and the relative location and phasing of the main and tail rotors.

  10. Transonic blade-vortex interactions noise: A parametric study

    NASA Technical Reports Server (NTRS)

    Lyrintzis, A. S.; Xue, Y.

    1990-01-01

    Transonic Blade-Vortex Interactions (BVI) are simulated numerically and the noise mechanisms are investigated. The 2-D high frequency transonic small disturbance equation is solved numerically (VTRAN2 code). An Alternating Direction Implicit (ADI) scheme with monotone switches is used; viscous effects are included on the boundary and the vortex is simulated by the cloud-in-cell method. The Kirchhoff method is used for the extension of the numerical 2-D near field aerodynamic results to the linear acoustic 3-D far field. The viscous effect (shock/boundary layer interaction) on BVI is investigated. The different types of shock motion are identified and compared. Two important disturbances with different directivity exist in the pressure signal and are believed to be related to the fluctuating lift and drag forces. Noise directivity for different cases is shown. The maximum radiation occurs at an angle between 60 and 90 deg below the horizontal for an airfoil fixed coordinate system and depends on the details of the airfoil shape. Different airfoil shapes are studied and classified according to the BVI noise produced.

  11. HART-II: Prediction of Blade-Vortex Interaction Loading

    NASA Technical Reports Server (NTRS)

    Lim, Joon W.; Tung, Chee; Yu, Yung H.; Burley, Casey L.; Brooks, Thomas; Boyd, Doug; vanderWall, Berend; Schneider, Oliver; Richard, Hugues; Raffel, Markus

    2003-01-01

    During the HART-I data analysis, a need was identified for comprehensive wake data, including vortex creation and aging and vortex re-development after blade-vortex interaction. In October 2001, US Army AFDD, NASA Langley, German DLR, French ONERA and Dutch DNW performed the HART-II test as an international joint effort. The main objective was to focus on rotor wake measurement using a PIV technique along with comprehensive data on blade deflections, airloads, and acoustics. Three prediction teams made preliminary correlation efforts with HART-II data: a joint US team of US Army AFDD and NASA Langley, German DLR, and French ONERA. The predicted results showed significant improvements over the HART-I predictions computed several years earlier, which indicated a better understanding of complicated wake modeling in comprehensive rotorcraft analysis. All three teams demonstrated generally satisfactory prediction capabilities, though prediction accuracy varied slightly across the disciplines.

  12. The effect of tip vortex structure on helicopter noise due to blade/vortex interaction

    NASA Technical Reports Server (NTRS)

    Wolf, T. L.; Widnall, S. E.

    1978-01-01

    A potential cause of helicopter impulsive noise, commonly called blade slap, is the unsteady lift fluctuation on a rotor blade due to interaction with the vortex trailed from another blade. The relationship between vortex structure and the intensity of the acoustic signal is investigated. The analysis is based on a theoretical model for blade/vortex interaction. Unsteady lift on the blades due to blade/vortex interaction is calculated using linear unsteady aerodynamic theory, and expressions are derived for the directivity, frequency spectrum, and transient signal of the radiated noise. An inviscid rollup model is used to calculate the velocity profile in the trailing vortex from the spanwise distribution of blade tip loading. A few cases of tip loading are investigated, and numerical results are presented for the unsteady lift and acoustic signal due to blade/vortex interaction. The intensity of the acoustic signal is shown to be quite sensitive to changes in tip vortex structure.
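
    The sensitivity to vortex structure can be illustrated with a simple two-dimensional sketch. The code below evaluates the vertical velocity that a line vortex with a finite core induces at a point on the blade chord line. It assumes the Scully (algebraic) swirl-velocity profile, which is a swapped-in model chosen for simplicity and not the inviscid rollup model of the record above. The names gamma, core_radius and miss_distance, and all numerical values, are illustrative.

```python
import math


def induced_upwash(x: float, miss_distance: float, gamma: float, core_radius: float) -> float:
    """Vertical velocity at horizontal offset x from a 2-D vortex of
    circulation gamma located a vertical distance miss_distance from the
    blade, using the Scully profile v_theta = gamma*r / (2*pi*(rc^2 + r^2))."""
    r_squared = x * x + miss_distance * miss_distance
    return gamma * x / (2.0 * math.pi * (core_radius ** 2 + r_squared))


def peak_upwash(miss_distance: float, gamma: float, core_radius: float) -> float:
    """Closed-form maximum of induced_upwash over x, reached at
    x = sqrt(core_radius^2 + miss_distance^2)."""
    return gamma / (4.0 * math.pi * math.sqrt(core_radius ** 2 + miss_distance ** 2))


if __name__ == "__main__":
    gamma = 1.0  # nondimensional circulation (illustrative)
    for core_radius in (0.05, 0.10, 0.20):
        for miss_distance in (0.0, 0.25):
            w = peak_upwash(miss_distance, gamma, core_radius)
            print(f"rc = {core_radius:.2f}, h = {miss_distance:.2f} -> peak upwash = {w:.3f}")
```

    In this simplified model the peak upwash scales as 1/sqrt(rc^2 + h^2), so the core radius dominates for very close encounters and the miss distance dominates once it exceeds the core radius.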

  13. Reduction of Helicopter Blade-Vortex Interaction Noise by Active Rotor Control Technology

    NASA Technical Reports Server (NTRS)

    Yu, Yung H.; Gmelin, Bernd; Splettstoesser, Wolf; Brooks, Thomas F.; Philippe, Jean J.; Prieur, Jean

    1997-01-01

    Helicopter blade-vortex interaction noise is one of the most severe noise sources and is very important both in community annoyance and military detection. Research over the decades has substantially improved basic physical understanding of the mechanisms generating rotor blade-vortex interaction noise and also of controlling techniques, particularly using active rotor control technology. This paper reviews active rotor control techniques currently available for rotor blade vortex interaction noise reduction, including higher harmonic pitch control, individual blade control, and on-blade control technologies. Basic physical mechanisms of each active control technique are reviewed in terms of noise reduction mechanism and controlling aerodynamic or structural parameters of a blade. Active rotor control techniques using smart structures/materials are discussed, including distributed smart actuators to induce local torsional or flapping deformations. Published by Elsevier Science Ltd.

  14. A comparison of model helicopter rotor Primary and Secondary blade/vortex interaction blade slap

    NASA Technical Reports Server (NTRS)

    Hubbard, J. E., Jr.; Leighton, K. P.

    1983-01-01

    A study of the relative importance of blade/vortex interactions which occur on the retreating side of a model helicopter rotor disk is described. Some of the salient characteristics of this phenomenon are presented and discussed. It is shown that the resulting Secondary blade slap may be of equal or greater intensity than the advancing side (Primary) blade slap. Instrumented model helicopter rotor data is presented which reveals the nature of the retreating blade/vortex interaction. The importance of Secondary blade slap as it applies to predictive techniques or approaches is discussed. When Secondary blade slap occurs it acts to enlarge the window of operating conditions for which blade slap exists.

  15. Helicopter Non-Unique Trim Strategies for Blade-Vortex Interaction (BVI) Noise Reduction

    DTIC Science & Technology

    2016-01-22

    Helicopter Non-Unique Trim Strategies for Blade-Vortex Interaction (BVI) Noise Reduction. Carlos Malpica, Aerospace Engineer, NASA Ames ... "global" effect of fuselage drag on BVI noise is illustrated in Figure 8 and Figure 9. Figure 8 highlights the effect of decreasing and increasing ...

  16. Helicopter Blade-Vortex Interaction Noise with Comparisons to CFD Calculations

    NASA Technical Reports Server (NTRS)

    McCluer, Megan S.

    1996-01-01

    A comparison of experimental acoustics data and computational predictions was performed for a helicopter rotor blade interacting with a parallel vortex. The experiment was designed to examine the aerodynamics and acoustics of parallel Blade-Vortex Interaction (BVI) and was performed in the Ames Research Center (ARC) 80- by 120-Foot Subsonic Wind Tunnel. An independently generated vortex interacted with a small-scale, nonlifting helicopter rotor at the 180 deg azimuth angle to create the interaction in a controlled environment. Computational Fluid Dynamics (CFD) was used to calculate near-field pressure time histories. The CFD code, called Transonic Unsteady Rotor Navier-Stokes (TURNS), was used to make comparisons with the acoustic pressure measurement at two microphone locations and several test conditions. The test conditions examined included hover tip Mach numbers of 0.6 and 0.7, advance ratio of 0.2, positive and negative vortex rotation, and the vortex passing above and below the rotor blade by 0.25 rotor chords. The results show that the CFD qualitatively predicts the acoustic characteristics very well, but quantitatively overpredicts the peak-to-peak sound pressure level by 15 percent in most cases. There also exists a discrepancy in the phasing (about 4 deg) of the BVI event in some cases. Additional calculations were performed to examine the effects of vortex strength, thickness, time accuracy, and directionality. This study validates the TURNS code for prediction of near-field acoustic pressures of controlled parallel BVI.

  17. On the Use of Vortex-Fitting in the Numerical Simulation of Blade-Vortex Interaction

    NASA Technical Reports Server (NTRS)

    Srinivasan, G. R.; VanDalsem, William (Technical Monitor)

    1997-01-01

    The usefulness of vortex-fitting in computational fluid dynamics (CFD) methods to preserve the vortex strength and structure while convecting in a uniform free stream is demonstrated through numerical simulations of two- and three-dimensional blade-vortex interactions. The fundamental premise of the formulation is that the velocity and pressure field of the interacting vortex are unaltered either by the presence of an airfoil or a rotor blade or by the resulting nonlinear interactional flowfield. Although the governing Euler and Navier-Stokes equations are nonlinear and independent solutions cannot be superposed, the interactional flowfield can be accurately captured by adding and subtracting the flowfield of the convecting vortex at each instant. The aerodynamics and aeroacoustics of two- and three-dimensional blade-vortex interactions have been calculated in Refs. 1-6 using this concept. Some of the results from these publications and other similar published material are summarized in this paper.

  18. Flow structure generated by perpendicular blade vortex interaction and implications for helicopter noise predictions

    NASA Technical Reports Server (NTRS)

    Devenport, William J.; Glegg, Stewart A. L.

    1995-01-01

    This report summarizes accomplishments and progress for the period ending April 1995. Much of the work during this period has concentrated on preparation for an analysis of data produced by an extensive wind tunnel test. Time has also been spent further developing an empirical theory to account for the effects of blade-vortex interaction upon the circulation distribution of the vortex and on preliminary measurements aimed at controlling the vortex core size.

  19. Full-Potential Modeling of Blade-Vortex Interactions. Degree awarded by George Washington Univ., Feb. 1987

    NASA Technical Reports Server (NTRS)

    Jones, Henry E.

    1997-01-01

    A study of the full-potential modeling of a blade-vortex interaction was made. A primary goal of this study was to investigate the effectiveness of the various methods of modeling the vortex. The model problem restricts the interaction to that of an infinite wing with an infinite line vortex moving parallel to its leading edge. This problem provides a convenient testing ground for the various methods of modeling the vortex while retaining the essential physics of the full three-dimensional interaction. A full-potential algorithm specifically tailored to solve the blade-vortex interaction (BVI) was developed to solve this problem. The basic algorithm was modified to include the effect of a vortex passing near the airfoil. Four different methods of modeling the vortex were used: (1) the angle-of-attack method, (2) the lifting-surface method, (3) the branch-cut method, and (4) the split-potential method. A side-by-side comparison of the four models was conducted. These comparisons included comparing generated velocity fields, a subcritical interaction, and a critical interaction. The subcritical and critical interactions are compared with experimentally generated results. The split-potential model was used to make a survey of some of the more critical parameters which affect the BVI.

  20. Experimental Study of Active Techniques for Blade/Vortex Interaction Noise Reduction

    NASA Astrophysics Data System (ADS)

    Kobiki, Noboru; Murashige, Atsushi; Tsuchihashi, Akihiko; Yamakawa, Eiichi

    This paper presents experimental results on the effect of Higher Harmonic Control (HHC) and Active Flap on Blade/Vortex Interaction (BVI) noise. Wind tunnel tests were performed with a 1-bladed rotor system to evaluate a simplified BVI phenomenon, avoiding the complicated aerodynamic interference that is characteristic of, and inevitable with, a multi-bladed rotor. Another merit of the 1-bladed rotor system is that the several active techniques under study can be evaluated under the same conditions in the same rotor system. The effects of the active techniques on BVI noise reduction were evaluated comprehensively using the sound pressure, the blade/vortex miss distance obtained by Laser Light Sheet (LLS), the blade surface pressure distribution, and the tip vortex structure obtained by Particle Image Velocimetry (PIV). These quantities correlate well in describing the effect of the active techniques on the BVI conditions. The experiments show that the blade/vortex miss distance is more dominant for BVI noise than the other two BVI governing factors, namely blade lift and vortex strength at the moment of BVI.

  1. Experimental blade vortex interaction noise characteristics of a utility helicopter at 1/4 scale

    NASA Technical Reports Server (NTRS)

    Conner, D. A.; Hoad, D. R.

    1984-01-01

    Models of both the advanced main rotor system and the standard or "baseline" UH-1 main rotor system were tested at one-quarter scale in the Langley 4- by 7-Meter (V/STOL) Tunnel using the general rotor model system. Tests were conducted over a range of descent angles which bracketed the blade-vortex interaction phenomenon for a range of simulated forward speeds. The tunnel was operated in the open-throat configuration with acoustic treatment to improve the semi-anechoic characteristics of the test chamber. Acoustical data obtained for these two rotor systems operating at similar flight conditions are presented without analysis or discussion.

  2. An Euler code calculation of blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Hardin, J. C.; Lamkin, S. L.

    1987-01-01

    An Euler code has been developed for calculation of noise radiation due to the interaction of a distributed vortex with a Joukowski airfoil. The time-dependent incompressible flow field is first determined and then integrated to yield the resulting sound production through use of the elegant low-frequency Green's function approach. This code has several interesting numerical features involved in the vortex motion and in continuous satisfaction of the Kutta condition. In addition, it removes the limitations on Reynolds number and is much more efficient than an earlier Navier-Stokes code. Results indicate that the noise production is due to the deceleration and subsequent acceleration of the vortex as it approaches and passes the airfoil. Predicted acoustic levels and frequencies agree with measured data, although a precise comparison would require the strength, size, and position of the incoming vortex to be known.

  3. Tip-path-plane angle effects on rotor blade-vortex interaction noise levels and directivity

    NASA Technical Reports Server (NTRS)

    Burley, Casey L.; Martin, Ruth M.

    1988-01-01

    Acoustic data of a scale model BO-105 main rotor acquired in a large aeroacoustic wind tunnel are presented to investigate the parametric effects of rotor operating conditions on blade-vortex interaction (BVI) impulsive noise. Contours of a BVI noise metric are employed to quantify the effects of rotor advance ratio and tip-path-plane angle on BVI noise directivity and amplitude. Acoustic time history data are presented to illustrate the variations in impulsive characteristics. The directionality, noise levels and impulsive content of both advancing and retreating side BVI are shown to vary significantly with tip-path-plane angle and advance ratio over the range of low and moderate flight speeds considered.

  4. Blade-Vortex Interaction (BVI) Noise and Airload Prediction Using Loose Aerodynamic/Structural Coupling

    NASA Technical Reports Server (NTRS)

    Sim, B. W.; Lim, J. W.

    2007-01-01

    Predictions of blade-vortex interaction (BVI) noise, using blade airloads obtained from a coupled aerodynamic and structural methodology, are presented. This methodology uses an iterative, loosely-coupled trim strategy to cycle information between the OVERFLOW-2 (CFD) and CAMRAD-II (CSD) codes. Results are compared to the HART-II baseline, minimum noise and minimum vibration conditions. It is shown that this CFD/CSD state-of-the-art approach is able to capture blade airload and noise radiation characteristics associated with BVI. With the exception of the HART-II minimum noise condition, predicted advancing and retreating side BVI for the baseline and minimum vibration conditions agrees favorably with measured data. Although the BVI airloads and noise amplitudes are generally under-predicted, this CFD/CSD methodology provides an overall noteworthy improvement over the lifting line aerodynamics and free-wake models typically used in CSD comprehensive analysis codes.

  5. Correlation of helicopter impulsive noise from blade-vortex interaction with rotor mean inflow

    NASA Technical Reports Server (NTRS)

    Connor, Andrew B.; Martin, R. M.

    1987-01-01

    Data from a test made in the Langley 4 x 7 Meter Tunnel were parametrically studied with respect to the occurrence of blade-vortex interaction (BVI) as a function of tunnel speed and rotor angle of attack. Three microphones on the tunnel centerline forward of the model and one microphone forward and 45 degrees to the right provided the data. The rotor model was tested with a set of high-twist blades (-10 degrees) and a set of low-twist blades (-5 degrees) over the midspeed range (50 to 80 knots) at angles of attack ranging from -6 degrees (shallow climb) to 10 degrees (steep descent). The data from all four microphones indicated that the most probable time of occurrence of BVI is when the rotor descent is approximately equal to the rotor mean inflow velocity. However, some of the data showed no conclusive relationship to the mean inflow velocity.

  6. Mach number scaling of helicopter rotor blade/vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Leighton, Kenneth P.; Harris, Wesley L.

    1985-01-01

    A parametric study of model helicopter rotor blade slap due to blade vortex interaction (BVI) was conducted in a 5 by 7.5-foot anechoic wind tunnel using model helicopter rotors with two, three, and four blades. The results were compared with a previously developed Mach number scaling theory. Three- and four-bladed rotor configurations were found to show very good agreement with the Mach number to the sixth power law for all conditions tested. A reduction of conditions for which BVI blade slap is detected was observed for three-bladed rotors when compared to the two-bladed baseline. The advance ratio boundaries of the four-bladed rotor exhibited an angular dependence not present for the two-bladed configuration. The upper limits for the advance ratio boundaries of the four-bladed rotors increased with increasing rotational speed.
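
    The "Mach number to the sixth power" law can be turned into a quick estimate of how much the sound level changes with tip Mach number. The sketch below assumes that the law applies to acoustic intensity (mean-square pressure), so the level change is 10 * 6 * log10(M_new / M_old) decibels. That interpretation, the function name, and the example Mach numbers are assumptions for illustration and are not taken from the record above.

```python
import math


def spl_change_db(mach_old: float, mach_new: float, exponent: float = 6.0) -> float:
    """Change in sound level (dB) when intensity scales as Mach**exponent."""
    return 10.0 * exponent * math.log10(mach_new / mach_old)


if __name__ == "__main__":
    # Illustrative tip Mach numbers only.
    delta = spl_change_db(0.6, 0.7)
    print(f"0.6 -> 0.7 changes the level by {delta:+.2f} dB")
```

    Because the exponent is large, modest reductions in tip speed give a noticeable level reduction under this scaling, which is one reason tip Mach number is treated as a primary parameter in BVI noise studies.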

  7. Studies of blade-vortex interaction noise reduction by rotor blade modification

    NASA Astrophysics Data System (ADS)

    Brooks, Thomas F.

    Blade-vortex interaction (BVI) noise is one of the most objectionable types of helicopter noise. This impulsive blade-slap noise can be particularly intense during low-speed landing approach and maneuvers. Over the years, a number of flight and model rotor tests have examined blade tip modification and other blade design changes to reduce this noise. Many times these tests have produced conflicting results. In the present paper, a number of these studies are reviewed in light of the current understanding of the BVI noise problem. Results from one study in particular are used to help establish the noise reduction potential and to shed light on the role of blade design. Current blade studies and some new concepts under development are also described.

  8. Lift distribution and velocity field measurements for a three-dimensional, steady blade/vortex interaction

    NASA Technical Reports Server (NTRS)

    Dunagan, Stephen E.; Norman, Thomas R.

    1987-01-01

    A wind tunnel experiment simulating a steady three-dimensional helicopter rotor blade/vortex interaction is reported. The experimental configuration consisted of a vertical semispan vortex-generating wing, mounted upstream of a horizontal semispan rotor blade airfoil. A three-dimensional laser velocimeter was used to measure the velocity field in the region of the blade. Sectional lift coefficients were calculated by integrating the velocity field to obtain the bound vorticity. Total lift values, obtained by using an internal strain-gauge balance, verified the laser velocimeter data. Parametric variations of vortex strength, rotor blade angle of attack, and vortex position relative to the rotor blade were explored. These data are reported (with attention to experimental limitations) to provide a dataset for the validation of analytical work.

  9. Effect of wake structure on blade-vortex interaction phenomena: Acoustic prediction and validation

    NASA Technical Reports Server (NTRS)

    Gallman, Judith M.; Tung, Chee; Schultz, Klaus J.; Splettstoesser, Wolf; Buchholz, Heino

    1995-01-01

    During the Higher Harmonic Control Aeroacoustic Rotor Test, extensive measurements of the rotor aerodynamics, the far-field acoustics, the wake geometry, and the blade motion were made for powered descent flight conditions. These measurements have been used to validate and improve the prediction of blade-vortex interaction (BVI) noise. The improvements made to the BVI modeling after the evaluation of the test data are discussed. The effects of these improvements on the acoustic-pressure predictions are shown. These improvements include restructuring the wake, modifying the core size, incorporating the measured blade motion into the calculations, and attempting to improve the dynamic blade response. A comparison of four different implementations of the Ffowcs Williams and Hawkings equation is presented. A common set of aerodynamic input has been used for this comparison.

  10. Wake Geometry Effects on Rotor Blade-Vortex Interaction Noise Directivity

    NASA Technical Reports Server (NTRS)

    Martin, R. M.; Marcolini, Michael A.; Splettstoesser, W. R.; Schultz, K.-J.

    1990-01-01

    Acoustic measurements from a model rotor wind tunnel test are presented which show that the directionality of rotor blade vortex interaction (BVI) noise is strongly dependent on the rotor advance ratio and disk attitude. A rotor free wake analysis is used to show that the general locus of interactions on the rotor disk is also strongly dependent on advance ratio and disk attitude. A comparison of the changing directionality of the BVI noise with changes in the interaction locations shows that the strongest noise radiation occurs in the direction of motion normal to the blade span at the time of interaction, for both advancing and retreating side BVI. For advancing side interactions, the BVI radiation angle down from the tip-path plane appears relatively insensitive to rotor operating condition and is typically between 40 and 55 deg below the disk. However, the azimuthal radiation direction shows a clear trend with descent speed, moving towards the right of the flight path with increasing descent speed. The movement of the strongest radiation direction is attributed to the movement of the interaction locations on the rotor disk with increasing descent speed.

  11. Analysis of helicopter blade-vortex interaction noise with application to adaptive-passive and active alleviation methods

    NASA Astrophysics Data System (ADS)

    Tauszig, Lionel Christian

    This study focuses on detection and analysis methods for helicopter blade-vortex interactions (BVI) and applies these methods to two different BVI noise alleviation schemes: an adaptive-passive and an active scheme. A standard free-wake analysis based on relaxation methods is extended in this study to compute high-resolution blade loading, to account for blade-to-blade dissimilarities, and dual vortices when there is negative loading at the blade tips. The free-wake geometry is still calculated on a coarse azimuthal grid and then interpolated to a high-resolution grid to calculate the BVI-induced impulsive loading. Blade-to-blade dissimilarities are accounted for by allowing the different blades to release their own vortices. A number of BVI detection criteria, including the spherical method (a geometric criterion developed in this thesis), are critically examined. It was determined that high-resolution azimuthal discretization is required in virtually all detection methods except the spherical method, which detected the occurrence of parallel BVI even while using a low-resolution azimuthal mesh. Detection methods based on inflow and blade loads were, in addition, found to be sensitive to vortex core size. While most BVI studies use the high-resolution airloads to compute BVI noise, the total noise can often be due to multiple dominant interactions on the advancing and retreating sides. A methodology is developed to evaluate the contribution of an individual interaction to the total BVI noise, based on using the loading due to an individual vortex as an input to the acoustic code WOPWOP. The adaptive-passive BVI alleviation method considered in this study consists of reducing the length of one set of opposite blades (of a 4-bladed rotor) in low-speed descent. Results showed that differential coning resulting from the blade dissimilarity increases the blade-vortex miss-distances and reduces the BVI noise by 4 dB.
The Higher Harmonic Control Aeroacoustic Rotor Test (HART
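
    The coarse-to-fine azimuthal interpolation mentioned in this abstract can be sketched as below. The abstract does not state which interpolation scheme was used; linear interpolation on a uniform, periodic azimuth grid is an assumption made here for illustration, and `interpolate_periodic` is a hypothetical helper name.

```python
def interpolate_periodic(coarse_values, fine_count):
    """Linearly interpolate one rotor revolution onto a finer azimuth grid.

    `coarse_values[i]` is assumed to be sampled at azimuth 360 * i / n
    degrees; the last sample wraps around to the first one.
    """
    n = len(coarse_values)
    fine = []
    for j in range(fine_count):
        pos = j * n / fine_count      # position in units of coarse steps
        i = int(pos) % n              # coarse sample just below pos
        frac = pos - int(pos)         # fractional distance to the next sample
        nxt = coarse_values[(i + 1) % n]
        fine.append((1.0 - frac) * coarse_values[i] + frac * nxt)
    return fine
```

    For example, four samples taken every 90 deg can be refined to eight samples every 45 deg with `interpolate_periodic(values, 8)`.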

  12. Acoustic measurements from a rotor blade-vortex interaction noise experiment in the German-Dutch Wind Tunnel (DNW)

    NASA Technical Reports Server (NTRS)

    Martin, Ruth M.; Splettstoesser, W. R.; Elliott, J. W.; Schultz, K.-J.

    1988-01-01

    Acoustic data are presented from a 40 percent scale model of the 4-bladed BO-105 helicopter main rotor, measured in the large European aeroacoustic wind tunnel, the DNW. Rotor blade-vortex interaction (BVI) noise data in the low speed flight range were acquired using a traversing in-flow microphone array. The experimental apparatus, testing procedures, calibration results, and experimental objectives are fully described. A large representative set of averaged acoustic signals is presented.

  13. A study of the noise mechanisms of transonic blade-vortex interactions

    NASA Technical Reports Server (NTRS)

    Lyrintzis, Anastasios S.; Xue, Y.

    1990-01-01

    Transonic blade-vortex interactions (BVI) are simulated numerically and the noise mechanisms are investigated. The two-dimensional high frequency transonic small disturbance equation is solved numerically (VTRAN2 code). An ADI scheme with monotone switches is used; viscous effects are included on the boundary, and the vortex is simulated by the cloud in cell method. The Kirchhoff method is used for the extension of the numerical two-dimensional near-field aerodynamic results to the linear acoustic three dimensional far field. The viscous effects (shock/boundary layer interactions) on BVI are investigated. The different types of shock motion are identified and compared. Two important disturbances with different directivity exist in the pressure signal and are believed to be related to the fluctuating lift and drag forces. Noise directivity for different cases is shown. The maximum radiation occurs at an angle between 60 and 90 degrees below the horizontal for an airfoil-fixed coordinate system and depends on the details of the airfoil shape. Different airfoil shapes are studied and classified according to the BVI noise produced.

  14. Helicopter model rotor-blade vortex interaction impulsive noise: Scalability and parametric variations

    NASA Technical Reports Server (NTRS)

    Splettstoesser, W. R.; Schultz, K. J.; Boxwell, D. A.; Schmitz, F. H.

    1984-01-01

    Acoustic data taken in the anechoic Deutsch-Niederlaendischer Windkanal (DNW) have documented the blade vortex interaction (BVI) impulsive noise radiated from a 1/7-scale model main rotor of the AH-1 series helicopter. Averaged model scale data were compared with averaged full scale, inflight acoustic data under similar nondimensional test conditions. At low advance ratios (mu = 0.164 to 0.194), the data scale remarkably well in level and waveform shape, and also duplicate the directivity pattern of BVI impulsive noise. At moderate advance ratios (mu = 0.224 to 0.270), the scaling deteriorates, suggesting that the model scale rotor is not adequately simulating the full scale BVI noise; presently, no proven explanation of this discrepancy exists. Carefully performed parametric variations over a complete matrix of testing conditions have shown that BVI noise radiation is highly sensitive to all four governing nondimensional parameters: tip Mach number at hover, advance ratio, local inflow ratio, and thrust coefficient.
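
    Three of the four nondimensional parameters named above can be computed from basic rotor quantities, as in the sketch below. The definitions are common rotorcraft conventions assumed here (advance ratio as forward speed over tip speed, thrust coefficient as T / (rho A (Omega R)^2) with no factor of one half), not definitions taken from the report. The inflow ratio is omitted because it also depends on induced velocity and disk attitude. The function name and default air properties are hypothetical.

```python
import math


def rotor_parameters(v_forward, omega, radius, thrust,
                     rho=1.225, speed_of_sound=340.3):
    """Return hover tip Mach number, advance ratio, and thrust coefficient.

    Units are SI: m/s, rad/s, m, N, kg/m^3. The default air properties
    are sea-level standard values, assumed for illustration only.
    """
    tip_speed = omega * radius
    disk_area = math.pi * radius ** 2
    return {
        "hover_tip_mach": tip_speed / speed_of_sound,
        "advance_ratio": v_forward / tip_speed,
        "thrust_coefficient": thrust / (rho * disk_area * tip_speed ** 2),
    }
```

    Matching these values between model scale and full scale is what "similar nondimensional test conditions" refers to in the abstract.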

  15. The spectral characteristics of rotor blade-vortex interaction noise - Experimental and mathematical results

    NASA Technical Reports Server (NTRS)

    Martin, Ruth M.; Hardin, Jay C.

    1987-01-01

    The BVI impulsive content of a rotor acoustic signal is shown to appear in the mid-frequency range of the power spectrum, between the fifth and thirtieth harmonics of the blade passage frequency, concentrated at the harmonics of the blade passage frequency. These harmonics exhibit a humped or scalloped shape in this mid-frequency spectral region. Increased energy at the harmonics of the shaft frequency appears when the BVI impulsive content demonstrates unsteadiness and blade-to-blade differences in the time domain. A mathematical model of a generalized BVI acoustic signal and its power spectrum shows that the power spectrum is scalloped and filtered by a comb function. The spectrum amplitude is defined by the impulse amplitude and emission time. The scalloping of the spectrum is related to the emission time of the impulse itself, and the spacing of the comb function is related to the repetition time (period) of the impulse. The decay rate of the spectral humps is governed by the inverse of frequency squared. The mathematical model validates the characteristics observed in the data and verifies that these characteristics are due to blade-vortex interaction activity.
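
    The comb-and-scallop structure described above can be reproduced with a toy signal. The sketch below is not the paper's model: it assumes an idealized rectangular pulse repeated with a fixed period and uses a direct discrete Fourier transform. The names and the values of `N`, `PERIOD`, and `WIDTH` are illustrative assumptions.

```python
import cmath
import math


def dft_magnitude(signal):
    """Magnitude of the DFT for bins 0 .. n/2 - 1 (direct O(n^2) sum)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)))
        for k in range(n // 2)
    ]


def pulse_train(n_samples, period, width):
    """Rectangular pulse of `width` samples repeated every `period` samples."""
    return [1.0 if (t % period) < width else 0.0 for t in range(n_samples)]


N, PERIOD, WIDTH = 256, 32, 4
spectrum = dft_magnitude(pulse_train(N, PERIOD, WIDTH))

# Comb: energy appears only at multiples of the repetition frequency.
comb_spacing = N // PERIOD
# Scallop: the hump envelope is set by the pulse duration, with its
# first null at the bin equal to N / WIDTH.
first_null = N // WIDTH
```

    Lengthening the period narrows the comb spacing, while lengthening the pulse moves the envelope nulls to lower frequencies, mirroring the roles of repetition time and emission time in the abstract.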

  16. Reduction of blade-vortex interaction noise using higher harmonic pitch control

    NASA Technical Reports Server (NTRS)

    Brooks, Thomas F.; Booth, Earl R., Jr.; Jolly, J. Ralph, Jr.; Yeager, William T., Jr.; Wilbur, Matthew L.

    1989-01-01

    An acoustics test using an aeroelastically scaled rotor was conducted to examine the effectiveness of higher harmonic blade pitch control for the reduction of impulsive blade-vortex interaction (BVI) noise. A four-bladed, 110 in. diameter, articulated rotor model was tested in a heavy gas (Freon-12) medium in Langley's Transonic Dynamics Tunnel. Noise and vibration measurements were made for a range of matched flight conditions, where prescribed (open-loop) higher harmonic pitch was superimposed on the normal (baseline) collective and cyclic trim pitch. For the inflow-microphone noise measurements, advantage was taken of the reverberance in the hard walled tunnel by using a sound power determination approach. Initial findings from on-line data processing for three of the test microphones are reported for a 4/rev (4P) collective pitch control for a range of input amplitudes and phases. By comparing these results to corresponding baseline (no control) conditions, significant noise reductions (4 to 5 dB) were found for low-speed descent conditions, where helicopter BVI noise is most intense. For other rotor flight conditions, the overall noise was found to increase. All cases show increased vibration levels.
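
    The superposition of an open-loop 4/rev input on the baseline collective and cyclic pitch can be written out as a short sketch. The first-harmonic form of the baseline pitch, the sign conventions, and the parameter names are assumptions made for illustration; the report's own control definitions may differ.

```python
import math


def blade_pitch_deg(psi_deg, collective, cyclic_cos, cyclic_sin,
                    hhc_amplitude=0.0, hhc_phase_deg=0.0, harmonic=4):
    """Blade pitch (deg) at azimuth psi_deg with an added higher harmonic.

    Baseline: collective + cyclic_cos * cos(psi) + cyclic_sin * sin(psi).
    Higher harmonic: hhc_amplitude * cos(harmonic * psi - hhc_phase).
    """
    psi = math.radians(psi_deg)
    baseline = (collective
                + cyclic_cos * math.cos(psi)
                + cyclic_sin * math.sin(psi))
    hhc = hhc_amplitude * math.cos(harmonic * psi - math.radians(hhc_phase_deg))
    return baseline + hhc
```

    On a four-bladed rotor the blades are 90 deg apart, so a 4/rev term takes the same value on every blade at any instant, which is consistent with describing the input as a collective one.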

  17. Reduction of blade-vortex interaction noise through higher harmonic pitch control

    NASA Technical Reports Server (NTRS)

    Brooks, Thomas F.; Booth, Earl R., Jr.; Jolly, J. Ralph, Jr.; Yeager, William T., Jr.; Wilbur, Matthew L.

    1990-01-01

    An acoustics test using an aeroelastically scaled rotor was conducted to examine the effectiveness of higher harmonic blade pitch control for the reduction of impulsive blade-vortex interaction (BVI) noise. A four-bladed, 110 in. diameter, articulated rotor model was tested in a heavy gas (Freon-12) medium in Langley's Transonic Dynamics Tunnel. Noise and vibration measurements were made for a range of matched flight conditions, where prescribed (open-loop) higher harmonic pitch was superimposed on the normal (baseline) collective and cyclic trim pitch. For the inflow-microphone noise measurements, advantage was taken of the reverberance in the hard walled tunnel by using a sound power determination approach. Initial findings from on-line data processing for three of the test microphones are reported for a 4/rev (4P) collective pitch control for a range of input amplitudes and phases. By comparing these results to corresponding baseline (no control) conditions, significant noise reductions (4 to 5 dB) were found for low-speed descent conditions, where helicopter BVI noise is most intense. For other rotor flight conditions, the overall noise was found to increase. All cases show increased vibration levels.

  18. Advancing-side directivity and retreating-side interactions of model rotor blade-vortex interaction noise

    NASA Technical Reports Server (NTRS)

    Martin, R. M.; Splettstoesser, W. R.; Elliott, J. W.; Schultz, K.-J.

    1988-01-01

    Acoustic data are presented from a 40 percent scale model of the four-bladed BO-105 helicopter main rotor, tested in a large aerodynamic wind tunnel. Rotor blade-vortex interaction (BVI) noise data in the low-speed flight range were acquired using a traversing in-flow microphone array. Acoustic results presented are used to assess the acoustic far field of BVI noise, to map the directivity and temporal characteristics of BVI impulsive noise, and to show the existence of retreating-side BVI signals. The characteristics of the acoustic radiation patterns, which can often be strongly focused, are found to be very dependent on rotor operating condition. The acoustic signals exhibit multiple blade-vortex interactions per blade with broad impulsive content at lower speeds, while at higher speeds, they exhibit fewer interactions per blade, with much sharper, higher amplitude acoustic signals. Moderate-amplitude BVI acoustic signals measured under the aft retreating quadrant of the rotor are shown to originate from the retreating side of the rotor.

  19. Signal Analysis of Helicopter Blade-Vortex-Interaction Acoustic Noise Data

    NASA Technical Reports Server (NTRS)

    Rogers, James C.; Dai, Renshou

    1998-01-01

    Blade-Vortex-Interaction (BVI) produces annoying high-intensity impulsive noise. NASA Ames collected several sets of BVI noise data during in-flight and wind tunnel tests. The goal of this work is to extract the essential features of the BVI signals from the in-flight data and examine the feasibility of extracting those features from BVI noise recorded inside a large wind tunnel. BVI noise generating mechanisms and BVI radiation patterns are considered and a simple mathematical-physical model is presented. It allows the construction of simple synthetic BVI events that are comparable to free flight data. The boundary effects of the wind tunnel floor and ceiling are identified and more complex synthetic BVI events are constructed to account for features observed in the wind tunnel data. It is demonstrated that improved recording of BVI events can be attained by changing the geometry of the rotor hub, floor, ceiling and microphone. The Euclidean distance measure is used to align BVI events from each blade and improved BVI signals are obtained by time-domain averaging the aligned data. The differences between BVI events for individual blades are then apparent. Removal of wind tunnel background noise by optimal Wiener-filtering is shown to be effective provided representative noise-only data have been recorded. Elimination of wind tunnel reflections by cepstral and optimal filtering deconvolution is examined. It is seen that the cepstral method is not applicable but that a pragmatic optimal filtering approach gives encouraging results. Recommendations for further work include: altering measurement geometry, real-time data observation and evaluation, examining reflection signals (particularly those from the ceiling) and performing further analysis of expected BVI signals for flight conditions of interest so that microphone placement can be optimized for each condition.
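
    The align-then-average step can be illustrated with a small sketch. It assumes each BVI event is a list of samples of equal length and that alignment is done by trying every circular shift and keeping the one with the smallest Euclidean distance to a reference event. The report does not give these details, so the shift search and all function names here are assumptions.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length sample sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def circular_shift(signal, shift):
    """Delay `signal` by `shift` samples, wrapping around at the ends."""
    n = len(signal)
    return [signal[(i - shift) % n] for i in range(n)]


def align_and_average(events, reference):
    """Shift each event to best match `reference`, then average sample-wise."""
    aligned = []
    for event in events:
        best_shift = min(
            range(len(event)),
            key=lambda s: euclidean_distance(circular_shift(event, s), reference),
        )
        aligned.append(circular_shift(event, best_shift))
    return [sum(ev[i] for ev in aligned) / len(aligned)
            for i in range(len(reference))]
```

    Averaging without the alignment step would smear the impulsive peak whenever the events arrive with slightly different delays.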

  20. New techniques for experimental generation of two-dimensional blade-vortex interaction at low Reynolds numbers

    NASA Technical Reports Server (NTRS)

    Booth, E., Jr.; Yu, J. C.

    1986-01-01

    An experimental investigation of two-dimensional blade vortex interaction was conducted at NASA Langley Research Center. The first phase was a flow visualization study to document the approach process of a two-dimensional vortex as it encountered a loaded blade model. To accomplish the flow visualization study, a method for generating two-dimensional vortex filaments was required. The numerical study used to define a new vortex generation process and the use of this process in the flow visualization study were documented. Additionally, photographic techniques and data analysis methods used in the flow visualization study are examined.

  1. Parametric Investigation of the Effect of Hub Pitching Moment on Blade Vortex Interaction (BVI) Noise of an Isolated Rotor

    NASA Technical Reports Server (NTRS)

    Malpica, Carlos; Greenwood, Eric; Sim, Ben

    2016-01-01

    At the most fundamental level, main rotor loading noise is caused by the harmonically-varying aerodynamic loads (acoustic pressures) exerted by the rotating blades on the air. Rotorcraft main rotor noise is therefore, in principle, a function of rotor control inputs, and thus the forces and moments required to achieve steady, or "trim", flight equilibrium. In certain flight conditions, the ensuing aerodynamic loading on the rotor(s) can result in highly obtrusive harmonic noise. The effect of the propulsive force, or X-force, on Blade-Vortex Interaction (BVI) noise is well documented. This paper presents an acoustics parametric sensitivity analysis of the effect of varying rotor aerodynamic pitch hub trim moments on BVI noise radiated by an S-70 helicopter main rotor. Results show that changing the hub pitching moment for an isolated rotor, trimmed in nominal 80 knot, 6 and 12 deg descent, flight conditions, alters the miss distance between the blades and the vortex in ways that have varied and noticeable effects on the BVI radiated-noise directionality. Peak BVI noise level is however not significantly altered. The application of hub pitching moment allows the attitude of the fuselage to be controlled; for example, to compensate for the uncomfortable change in fuselage pitch attitude introduced by a fuselage-mounted X-force controller.

  2. Effects of a trailing edge flap on the aerodynamics and acoustics of rotor blade-vortex interactions

    NASA Technical Reports Server (NTRS)

    Charles, B. D.; Tadghighi, H.; Hassan, A. A.

    1992-01-01

    The use of a trailing edge flap on a helicopter rotor has been numerically simulated to determine if such a device can mitigate the acoustics of blade vortex interactions (BVI). The numerical procedure employs CAMRAD/JA, a lifting-line helicopter rotor trim code, in conjunction with RFS2, an unsteady transonic full-potential flow solver, and WOPWOP, an acoustic model based on Farassat's formulation 1A. The codes were modified to simulate trailing edge flap effects. The CAMRAD/JA code was used to compute the far wake inflow effects and the vortex wake trajectories and strengths which are utilized by RFS2 to predict the blade surface pressure variations. These pressures were then analyzed using WOPWOP to determine the high frequency acoustic response at several fixed observer locations below the rotor disk. Comparisons were made with different flap deflection amplitudes and rates to assess flap effects on BVI. Numerical experiments were carried out using a one-seventh scale AH-1G rotor system for flight conditions simulating BVI encountered during low speed descending flight with and without flaps. Predicted blade surface pressures and acoustic sound pressure levels obtained have shown good agreement with the baseline no-flap test data obtained in the DNW wind tunnel. Numerical results indicate that the use of flaps is beneficial in reducing BVI noise.

  3. Comparison of Full-Scale XV-15 Wind Tunnel and In-Flight Blade-Vortex Interaction Noise

    NASA Technical Reports Server (NTRS)

    Kitaplioglu, Cahit; McCluer, M.; Acree, C. W., Jr.; Warmbrodt, William (Technical Monitor)

    1997-01-01

    An isolated full-scale XV-15 rotor was tested in helicopter mode in the NASA Ames 80 by 120-Foot Wind Tunnel. Extensive acoustic data were obtained to define the rotor operating condition for maximum blade-vortex interaction (BVI) noise. Additional data were obtained at operating conditions simulating flight up to 80 knots. An XV-15 aircraft was also tested under operating conditions corresponding to landing approaches for which BVI is expected to be a maximum. In-flight acoustic data were obtained using the YO-3A acoustic research aircraft. An attempt was made to closely match wind tunnel and flight test operating conditions. Details of the two tests are described and some representative acoustic results are presented. Comparisons are shown between the wind tunnel data and corresponding flight test data. Preliminary results indicate very good correlation of the BVI-related features. However, some differences between flight test and wind tunnel results exist away from the BVI event, thought to arise from differences in the two flow environments.

  4. Flow structure generated by perpendicular blade-vortex interaction and implications for helicopter noise prediction. Volume 1: Measurements

    NASA Technical Reports Server (NTRS)

    Wittmer, Kenneth S.; Devenport, William J.

    1996-01-01

    The perpendicular interaction of a streamwise vortex with an infinite span helicopter blade was modeled experimentally in incompressible flow. Three-component velocity and turbulence measurements were made using a sub-miniature four sensor hot-wire probe. Vortex core parameters (radius, peak tangential velocity, circulation, and centerline axial velocity deficit) were determined as functions of blade-vortex separation, streamwise position, blade angle of attack, vortex strength, and vortex size. The downstream development of the flow shows that the interaction of the vortex with the blade wake is the primary cause of the changes in the core parameters. The blade sheds negative vorticity into its wake as a result of the induced angle of attack generated by the passing vortex. Instability in the vortex core due to its interaction with this negative vorticity region appears to be the catalyst for the magnification of the size and intensity of the turbulent flowfield downstream of the interaction. In general, the core radius increases while peak tangential velocity decreases with the effect being greater for smaller separations. These effects are largely independent of blade angle of attack; and if these parameters are normalized on their undisturbed values, then the effects of the vortex strength appear much weaker. Two theoretical models were developed to aid in extending the results to other flow conditions. An empirical model was developed for core parameter prediction which has some rudimentary physical basis, implying usefulness beyond a simple curve fit. An inviscid flow model was also created to estimate the vorticity shed by the interaction blade, and to predict the early stages of its incorporation into the interacting vortex.

  5. Evaluation of helicopter noise due to blade-vortex interaction for five tip configurations. [conducted in the Langley V/STOL tunnel]

    NASA Technical Reports Server (NTRS)

    Hoad, D. R.

    1979-01-01

    The effect of tip shape modification on blade vortex interaction induced helicopter blade slap noise was investigated. Simulated flight and descent velocities which have been shown to produce blade slap were tested. Aerodynamic performance parameters of the rotor system were monitored to ensure properly matched flight conditions among the tip shapes. The tunnel was operated in the open throat configuration with treatment to improve the acoustic characteristics of the test chamber. Four promising tips were used along with a standard square tip as a baseline configuration. A detailed acoustic evaluation on the same rotor system of the relative applicability of the various tip configurations for blade slap noise reduction is provided.

  6. Prediction of blade-vortex interaction noise from measured blade pressure

    NASA Technical Reports Server (NTRS)

    Nakamura, Y.

    1981-01-01

    The impulsive nature of noise due to the interaction of a rotor blade with a tip vortex is studied. The time signature of this noise is calculated theoretically based on the measured blade surface pressure fluctuation of an operational load survey rotor in slow descending flight and is compared with the simultaneous microphone measurement. Particularly, the physical understanding of the characteristic features of a waveform is extensively studied in order to understand the generating mechanism and to identify the important parameters. The interaction trajectory of a tip vortex on an acoustic planform is shown to be a very important parameter for the impulsive shape of the noise. The unsteady nature of the pressure distribution at the very leading edge is also important to the pulse shape. The theoretical model using noncompact linear acoustics predicts the general shape of interaction impulse pretty well except for peak amplitude which requires more continuous information along the span at the leading edge.

  7. Non-harmonic root-pitch individual-blade control for the reduction of blade-vortex interaction noise in rotorcraft

    NASA Astrophysics Data System (ADS)

    Malovrh, Brendon D.

    One of the greatest obstacles to public acceptance of rotorcraft is the high levels of noise they produce, particularly in low-speed descent. In this flight condition, the trailing edge vortex of one blade often passes in close proximity to other blades resulting in impulsive changes in lift. This Blade-Vortex Interaction (BVI) creates high levels of both noise and vibration. The objective of this dissertation is to evaluate the effectiveness of using physically motivated pulse-type Individual Blade Control for reducing the noise associated with the BVI. First, the major parameters that affect the severity of the interaction, such as vortex strength and blade-vortex miss-distance, are analyzed. Second, inputs designed specifically to alter the parameters previously identified as key are explored, resulting in elimination of advancing side noise and overall peak BVI Sound Pressure Level (BVISPL) reductions of up to 4.6 dB. Lastly, different feedback mechanisms for closed-loop control of IBC are examined to allow implementation of the developed inputs.

  8. Perpendicular blade vortex interaction and its implications for helicopter noise prediction: Wave-number frequency spectra in a trailing vortex for BWI noise prediction

    NASA Technical Reports Server (NTRS)

    Devenport, William J.; Glegg, Stewart A. L.

    1993-01-01

    Perpendicular blade vortex interactions are a common occurrence in helicopter rotor flows. Under certain conditions they produce a substantial proportion of the acoustic noise. However, the mechanism of noise generation is not well understood. Specifically, turbulence associated with the trailing vortices shed from the blade tips appears insufficient to account for the noise generated. The hypothesis that the first perpendicular interaction experienced by a trailing vortex alters its turbulence structure in such a way as to increase the acoustic noise generated by subsequent interactions is examined. To investigate this hypothesis a two-part investigation was carried out. In the first part, experiments were performed to examine the behavior of a streamwise vortex as it passed over and downstream of a spanwise blade in incompressible flow. Blade vortex separations between +/- one eighth chord were studied at a chord Reynolds number of 200,000. Three-component velocity and turbulence measurements were made in the flow from 4 chord lengths upstream to 15 chord lengths downstream of the blade using miniature 4-sensor hot wire probes. These measurements show that the interaction of the vortex with the blade and its wake causes the vortex core to lose circulation and diffuse much more rapidly than it otherwise would. Core radius increases and peak tangential velocity decreases with distance downstream of the blade. True turbulence levels within the core are much larger downstream than upstream of the blade. The net result is a much larger and more intense region of turbulent flow than that presented by the original vortex and thus, by implication, a greater potential for generating acoustic noise. In the second part, the turbulence measurements described above were used to derive the necessary inputs to a Blade Wake Interaction (BWI) noise prediction scheme. This resulted in significantly improved agreement between measurements and calculations of the BWI noise

  9. A parametric study of blade vortex interaction noise for two, three, and four-bladed model rotors at moderate tip speeds Theory and experiment

    NASA Technical Reports Server (NTRS)

    Leighton, K. P.; Harris, W. L.

    1984-01-01

    An investigation of blade slap due to blade vortex interaction (BVI) has been conducted. This investigation consisted of an examination of BVI blade slap for two, three, and four-bladed model rotors at tip Mach numbers ranging from 0.20 to 0.50. Blade slap contours have been obtained for each configuration tested. Differences in blade slap contours, peak sound pressure level, and directivity for each configuration tested are noted. Additional fundamental differences, such as multiple interaction BVI, are observed and occur for only specific rotor blade configurations. The effect of increasing the Mach number on the BVI blade slap for various rotor blade combinations has been quantified. A peak blade slap Mach number scaling law is proposed. Comparison of measured BVI blade slap with theory is made.

  10. Numerical simulation and validation of helicopter blade-vortex interaction using coupled CFD/CSD and three levels of aerodynamic modeling

    NASA Astrophysics Data System (ADS)

    Amiraux, Mathieu

    Rotorcraft Blade-Vortex Interaction (BVI) remains one of the most challenging flow phenomena to simulate numerically. Over the past decade, the HART-II rotor test and its extensive experimental dataset have been a major database for validation of CFD codes. Its strong BVI signature, with high levels of intrusive noise and vibrations, makes it a difficult test for computational methods. The main challenge is to accurately capture and preserve the vortices which interact with the rotor, while predicting correct blade deformations and loading. This doctoral dissertation presents the application of a coupled CFD/CSD methodology to the problem of helicopter BVI and compares three levels of fidelity for aerodynamic modeling: a hybrid lifting-line/free-wake (wake coupling) method, with modified compressible unsteady model; a hybrid URANS/free-wake method; and a URANS-based wake capturing method, using multiple overset meshes to capture the entire flow field. To further increase numerical correlation, three helicopter fuselage models are implemented in the framework. The first is a high resolution 3D GPU panel code; the second is an immersed boundary based method, with 3D elliptic grid adaption; the last one uses a body-fitted, curvilinear fuselage mesh. The main contribution of this work is the implementation and systematic comparison of multiple numerical methods to perform BVI modeling. The trade-offs between solution accuracy and computational cost are highlighted for the different approaches. Various improvements have been made to each code to enhance physical fidelity, while advanced technologies, such as GPU computing, have been employed to increase efficiency. The resulting numerical setup covers all aspects of the simulation creating a truly multi-fidelity and multi-physics framework. Overall, the wake capturing approach showed the best BVI phasing correlation and good blade deflection predictions, with slightly under-predicted aerodynamic loading magnitudes

  11. Effect of higher harmonic control on helicopter rotor blade-vortex interaction noise: Prediction and initial validation

    NASA Technical Reports Server (NTRS)

    Beaumier, P.; Prieur, J.; Rahier, G.; Spiegel, P.; Demargne, A.; Tung, C.; Gallman, J. M.; Yu, Y. H.; Kube, R.; Vanderwall, B. G.

    1995-01-01

    The paper presents the status of the theoretical tools of AFDD, DLR, NASA and ONERA for prediction of the effect of HHC on helicopter main rotor BVI noise. Aeroacoustic predictions from the four research centers, concerning a wind tunnel simulation of a typical descent flight case without and with HHC, are presented and compared. The results include blade deformation, geometry of interacting vortices, sectional loads and noise. Acoustic predictions are compared to experimental data. An analysis of the results provides a first insight into the mechanisms by which HHC may affect BVI noise.

  12. Blade-Vortex Interaction Noise Characteristics of a Full-Scale Active Flap Rotor

    DTIC Science & Technology

    2009-01-01

    vibration control, a more inboard flap position was more efficient. Smaller flap chord ratios (15%) were preferred as a compromise between control... ratio . The constant chord section of the blade is 10 inches long. Nominal rotation speed of the rotor is 392 RPM producing a tip speed of 695 ft/sec...minimum actuator power. The flap system selected [25] has a flap/chord ratio of 25% with an overhang of 40% (total flap length of 35% chord) and flap

  13. Foundations of the Blade-Vortex Interaction Problem: Structure and Behavior of Travelling Compressible Vortices

    DTIC Science & Technology

    1988-04-01

    but a systematic study is now in order, it would constitute the logical subject for any follow-on work. Included at the end of the report is a special...ORIENTATION 3 a quasi-uniform 2-D compressible flow toward an airfoil model (Mandella and Bershader [4],[5]). The flow was energized by a traveling shock...is two-dimensional and axisymmetric (all physical quantities depend only on time t and the distance r from the axis), then the Navier-Stokes equations

  14. Rotor system having alternating length rotor blades for reducing blade-vortex interaction (BVI) noise

    NASA Technical Reports Server (NTRS)

    Moffitt, Robert C. (Inventor); Visintainer, Joseph A. (Inventor)

    1997-01-01

    A rotor system (4) having odd and even blade assemblies (O.sub.b, E.sub.b) mounting to and rotating with a rotor hub assembly (6) wherein the odd blade assemblies (O.sub.b) define a radial length R.sub.O, and the even blade assemblies (E.sub.b) define a radial length R.sub.E and wherein the radial length R.sub.E is between about 70% to about 95% of the radial length R.sub.O. Other embodiments of the invention are directed to a Variable Diameter Rotor system (4) which may be configured for operating in various operating modes for optimizing aerodynamic and acoustic performance. The Variable Diameter Rotor system (4) includes odd and even blade assemblies (O.sub.b, E.sub.b) having inboard and outboard blade sections (10, 12) wherein the outboard blade sections (12) telescopically mount to the inboard blade sections (10). The outboard blade sections (12) are positioned with respect to the inboard blade sections (10) such that the radial length R.sub.E of the even blade assemblies (E.sub.b) is equal to the radial length R.sub.O of the odd blade assemblies (O.sub.b) in a first operating mode, and such that the radial length R.sub.E is between about 70% to about 95% of the length R.sub.O in a second operating mode.

  15. Parallel spin-orbit coupled configuration interaction

    NASA Astrophysics Data System (ADS)

    Tilson, J. L.; Ermler, W. C.; Pitzer, R. M.

    2000-06-01

    A parallel spin-orbit configuration interaction (SOCI) code has been developed. This code, named P-SOCI, is an extension of an existing sequential SOCI program and permits solution to heavy-element systems requiring both explicit spin-orbit (SO) effects and significant electron correlation. The relativistic procedure adopted here is an ab initio conventional configuration interaction (CI) method that constructs a Hamiltonian matrix in a double-group-adapted basis. P-SOCI enables solutions to problems far larger than possible with the original code by exploiting the resources of large massively parallel processing computers (MPP). This increase in capability permits not only the continued inclusion of explicit spin-orbit effects but now also a significant amount of non-dynamic and dynamic correlation as is necessary for a good description of heavy-element systems.

  16. Parallel Vegetation Stripe Formation Through Hydrologic Interactions

    NASA Astrophysics Data System (ADS)

    Cheng, Yiwei; Stieglitz, Marc; Turk, Greg; Engel, Victor

    2010-05-01

    It has long been a challenge to theoretical ecologists to describe vegetation pattern formations such as the "tiger bush" stripes and "leopard bush" spots in Niger, and the regular maze patterns often observed in bogs in North America and Eurasia. To date, most simulation models focus on reproducing the spot and labyrinthine patterns, and on the vegetation bands which form perpendicular to surface and groundwater flow directions. Various hypotheses have been invoked to explain the formation of vegetation patterns: selective grazing by herbivores, fire, and anisotropic environmental conditions such as slope. Recently, short distance facilitation and long distance competition between vegetation (a.k.a. scale-dependent feedback) has been proposed as a generic mechanism for vegetation pattern formation. In this paper, we test the generality of this mechanism by employing an existing, spatially explicit, advection-reaction-diffusion type model to describe the formation of regularly spaced vegetation bands, including those that are parallel to flow direction. Such vegetation patterns are, for example, characteristic of the ridge and slough habitat in the Florida Everglades, which are thought to have formed parallel to the prevailing surface water flow direction. To our knowledge, this is the first time that a simple model encompassing a nutrient accumulation mechanism along with biomass development and flow is used to demonstrate the formation of parallel stripes. We also explore the interactive effects of plant transpiration, slope and anisotropic hydraulic conductivity on the resulting vegetation pattern. Our results highlight the ability of the short distance facilitation and long distance competition mechanism to explain the formation of the different vegetation patterns beyond semi-arid regions. Therefore, we propose that the parallel stripes, like the other periodic patterns observed in both isotropic and anisotropic environments, are self-organized and form

  17. Interactions of evapotranspiration between two parallel columns

    NASA Astrophysics Data System (ADS)

    Sun, D.; Zhu, J.

    2010-12-01

    Moisture flux across the land-atmosphere boundary (through soil evaporation and plant transpiration) is an important component of many large-scale hydrological processes, which are often quantified through simulation of multiple realizations (stream tubes) of independent one-dimensional local scale flow. A major problem of this approach is that it ignores the interactions among different stream tubes. Lateral flows might be prominent for long and narrow tubes and for heterogeneous hydraulic properties and plant covers. This study investigates whether stream tube modeling produces unacceptable errors in large scale evapotranspiration simulations. Instead of using convenient parallel column models of independent hydrologic processes, this study simulates two-dimensional transpiration and evaporation in two parallel columns which allow lateral interactions. The impact of both plant characteristics and soil hydraulic properties on evapotranspiration is addressed and discussed in comparison to those of independent stream tube models. The results provide guidance for applying stream tube models to simulate large scale evapotranspiration in a heterogeneous landscape.

  18. Parametric Investigation of the Effect of Hub Pitching Moment on Blade Vortex Interaction (BVI) Noise of an Isolated Rotor

    DTIC Science & Technology

    2016-05-19

    Aerospace Engineer NASA Ames Research Center Moffett Field, CA Eric Greenwood Aerospace Engineer NASA Langley Research Center Hampton, VA Ben...Sim Research Engineer US Army Aviation Development Directorate Moffett Field, CA ABSTRACT At the most fundamental level, main rotor loading noise...and torsion. Specifically, the elastic deformation of the blade is characterized by the spatial displacements of any arbitrary point on the elastic

  19. Acoustic Measurements from a Rotor Blade-Vortex Interaction Noise Experiment in the German-Dutch Wind Tunnel (DNW)

    DTIC Science & Technology

    1988-03-01

    traversing Microphone Array Traverse System system and two in-flow microphones mounted on the rotor fuselage as shown in figure 3(b). The mi - The microphone...array traverse system consisted crophones were ’/2-in. pressure-type condenser mi - of a horizontal wing with its span normal to the flow crophones...figure 9, which shows the upstream positions was also investigated. These data data for X. = -0.7 and 3 m at 60 m/sec from mi - are seen in figure 10

  20. Parallelized Stochastic Cutoff Method for Long-Range Interacting Systems

    NASA Astrophysics Data System (ADS)

    Endo, Eishin; Toga, Yuta; Sasaki, Munetaka

    2015-07-01

    We present a method of parallelizing the stochastic cutoff (SCO) method, which is a Monte-Carlo method for long-range interacting systems. After interactions are eliminated by the SCO method, we subdivide a lattice into noninteracting interpenetrating sublattices. This subdivision enables us to parallelize the Monte-Carlo calculation in the SCO method. Such subdivision is found by numerically solving the vertex coloring of a graph created by the SCO method. We use an algorithm proposed by Kuhn and Wattenhofer to solve the vertex coloring by parallel computation. This method was applied to a two-dimensional magnetic dipolar system on an L × L square lattice to examine its parallelization efficiency. The result showed that, in the case of L = 2304, the speed of computation increased about 102 times by parallel computation with 288 processors.
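
    The sublattice decomposition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it swaps in a plain sequential greedy coloring for the parallel Kuhn and Wattenhofer algorithm, and the `offsets` argument is a hypothetical stand-in for the set of interactions that survive the stochastic cutoff. Sites that share a color have no remaining interaction with each other, so each color class could be updated concurrently.

```python
from collections import defaultdict


def build_interaction_graph(L, offsets):
    """Build an undirected graph on an L x L periodic lattice.

    `offsets` is a hypothetical list of (dx, dy) displacements standing in
    for the interactions that remain after the cutoff step.
    """
    graph = defaultdict(set)
    for x in range(L):
        for y in range(L):
            site = (x, y)
            graph[site]  # make sure isolated sites also appear in the graph
            for dx, dy in offsets:
                nbr = ((x + dx) % L, (y + dy) % L)
                if nbr != site:
                    graph[site].add(nbr)
                    graph[nbr].add(site)
    return graph


def greedy_coloring(graph):
    """Give each site the smallest color not used by an already-colored neighbor."""
    colors = {}
    for site in sorted(graph):
        used = {colors[n] for n in graph[site] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[site] = color
    return colors


def sublattices(colors):
    """Group sites by color; each group is a set of mutually noninteracting sites."""
    groups = defaultdict(list)
    for site, color in colors.items():
        groups[color].append(site)
    return [groups[c] for c in sorted(groups)]


if __name__ == "__main__":
    graph = build_interaction_graph(4, [(1, 0), (0, 1)])
    groups = sublattices(greedy_coloring(graph))
    print(len(groups), "sublattices with sizes", [len(g) for g in groups])
```

    Within one Monte Carlo sweep, the color classes would be visited one after another, with the sites inside a class distributed over the available processors.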

  1. Protein interaction discovery using parallel analysis of translated ORFs (PLATO).

    PubMed

    Zhu, Jian; Larman, H Benjamin; Gao, Geng; Somwar, Romel; Zhang, Zijuan; Laserson, Uri; Ciccia, Alberto; Pavlova, Natalya; Church, George; Zhang, Wei; Kesari, Santosh; Elledge, Stephen J

    2013-04-01

    Identifying physical interactions between proteins and other molecules is a critical aspect of biological analysis. Here we describe PLATO, an in vitro method for mapping such interactions by affinity enrichment of a library of full-length open reading frames displayed on ribosomes, followed by massively parallel analysis using DNA sequencing. We demonstrate the broad utility of the method for human proteins by identifying known and previously unidentified interacting partners of LYN kinase, patient autoantibodies, and the small-molecules gefitinib and dasatinib.

  2. An interactive parallel programming environment applied in atmospheric science

    SciTech Connect

    Laszewski, G. von

    1996-12-31

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  3. An interactive parallel programming environment applied in atmospheric science

    NASA Technical Reports Server (NTRS)

    vonLaszewski, G.

    1996-01-01

    This article introduces an interactive parallel programming environment (IPPE) that simplifies the generation and execution of parallel programs. One of the tasks of the environment is to generate message-passing parallel programs for homogeneous and heterogeneous computing platforms. The parallel programs are represented by using visual objects. This is accomplished with the help of a graphical programming editor that is implemented in Java and enables portability to a wide variety of computer platforms. In contrast to other graphical programming systems, reusable parts of the programs can be stored in a program library to support rapid prototyping. In addition, runtime performance data on different computing platforms is collected in a database. A selection process determines dynamically the software and the hardware platform to be used to solve the problem in minimal wall-clock time. The environment is currently being tested on a Grand Challenge problem, the NASA four-dimensional data assimilation system.

  4. Investigation of helicopter rotor blade/wake interactive impulsive noise

    NASA Technical Reports Server (NTRS)

    Miley, S. J.; Hall, G. F.; Vonlavante, E.

    1987-01-01

    An analysis of the Tip Aerodynamic/Aeroacoustic Test (TAAT) data was performed to identify possible aerodynamic sources of blade/vortex interaction (BVI) impulsive noise. The identification is based on correlation of measured blade pressure time histories with predicted blade/vortex intersections for the flight condition(s) where impulsive noise was detected. Due to the location of the recording microphones, only noise signatures associated with the advancing blade were available, and the analysis was accordingly restricted to the first and second azimuthal quadrants. The results show that the blade tip region is operating transonically in the azimuthal range where previous BVI experiments indicated the impulsive noise to be. No individual blade/vortex encounter is identifiable in the pressure data; however, there is indication of multiple intersections in the roll-up region which could be the origin of the noise. Discrete blade/vortex encounters are indicated in the second quadrant; however, if impulsive noise were produced here, the directivity pattern would be such that it was not recorded by the microphones. It is demonstrated that the TAAT data base is a valuable resource in the investigation of rotor aerodynamic/aeroacoustic behavior.

  5. IPython: components for interactive and parallel computing across disciplines. (Invited)

    NASA Astrophysics Data System (ADS)

    Perez, F.; Bussonnier, M.; Frederic, J. D.; Froehle, B. M.; Granger, B. E.; Ivanov, P.; Kluyver, T.; Patterson, E.; Ragan-Kelley, B.; Sailer, Z.

    2013-12-01

    Scientific computing is an inherently exploratory activity that requires constantly cycling between code, data and results, each time adjusting the computations as new insights and questions arise. To support such a workflow, good interactive environments are critical. The IPython project (http://ipython.org) provides a rich architecture for interactive computing with: 1. Terminal-based and graphical interactive consoles. 2. A web-based Notebook system with support for code, text, mathematical expressions, inline plots and other rich media. 3. Easy to use, high performance tools for parallel computing. Despite its roots in Python, the IPython architecture is designed in a language-agnostic way to facilitate interactive computing in any language. This allows users to mix Python with Julia, R, Octave, Ruby, Perl, Bash and more, as well as to develop native clients in other languages that reuse the IPython clients. In this talk, I will show how IPython supports all stages in the lifecycle of a scientific idea: 1. Individual exploration. 2. Collaborative development. 3. Production runs with parallel resources. 4. Publication. 5. Education. In particular, the IPython Notebook provides an environment for "literate computing" with a tight integration of narrative and computation (including parallel computing). These Notebooks are stored in a JSON-based document format that provides an "executable paper": notebooks can be version controlled, exported to HTML or PDF for publication, and used for teaching.

  6. The interaction of turbulence with parallel and perpendicular shocks

    NASA Astrophysics Data System (ADS)

    Adhikari, L.; Zank, G. P.; Hunana, P.; Hu, Q.

    2016-11-01

    Interplanetary shocks exist in most astrophysical flows, and modify the properties of the background flow. We apply the six coupled turbulence transport model equations of Zank et al. (2012) to study the interaction of turbulence with parallel and perpendicular shock waves in the solar wind. We model the 1D structure of a stationary perpendicular or parallel shock wave using a hyperbolic tangent function and the Rankine-Hugoniot conditions. A reduced turbulence transport model (the 4-equation model) is applied to parallel and perpendicular shock waves, and solved using a 4th-order Runge-Kutta method. We compare the model results with ACE spacecraft observations. We identify one quasi-parallel and one quasi-perpendicular event in the ACE spacecraft data sets, and compute various observed turbulence quantities such as the fluctuating magnetic and kinetic energy, the energy in forward and backward propagating modes, and the total turbulent energy upstream and downstream of the shock. We also calculate the error associated with each observed turbulence quantity, and fit the observed values by a least-squares method using a Fourier series fitting function. We find that the theoretical results are in reasonable agreement with observations. The energy in turbulent fluctuations is enhanced and the correlation length is approximately constant at the shock. Similarly, the normalized cross helicity increases across a perpendicular shock, and decreases across a parallel shock.
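
    A hyperbolic tangent shock profile and a 4th-order Runge-Kutta integration can be sketched as below. Several pieces are assumptions made only for illustration: the jump uses the purely hydrodynamic Rankine-Hugoniot compression ratio, the function names are hypothetical, and the integrated equation dE/dx = -(E / 2U) dU/dx is a toy placeholder, not one of the turbulence transport equations of the abstract. The toy equation has the exact solution E proportional to U^(-1/2), which makes the integrator easy to check.

```python
import math


def compression_ratio(mach, gamma=5.0 / 3.0):
    """Hydrodynamic Rankine-Hugoniot density (and inverse velocity) jump."""
    return (gamma + 1.0) * mach**2 / ((gamma - 1.0) * mach**2 + 2.0)


def velocity(x, u_up, u_down, width):
    """Smooth tanh transition from u_up (x << 0) to u_down (x >> 0)."""
    return 0.5 * (u_up + u_down) - 0.5 * (u_up - u_down) * math.tanh(x / width)


def velocity_gradient(x, u_up, u_down, width):
    """Analytic derivative dU/dx of the tanh profile."""
    return -0.5 * (u_up - u_down) / (width * math.cosh(x / width) ** 2)


def rk4_step(f, x, y, h):
    """One classical 4th-order Runge-Kutta step for dy/dx = f(x, y)."""
    k1 = f(x, y)
    k2 = f(x + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(x + 0.5 * h, y + 0.5 * h * k2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate_toy_energy(u_up, mach, width, x_start, x_end, n_steps, e_start=1.0):
    """Integrate the toy equation dE/dx = -(E / 2U) dU/dx across the shock."""
    u_down = u_up / compression_ratio(mach)

    def rhs(x, e):
        u = velocity(x, u_up, u_down, width)
        return -0.5 * e * velocity_gradient(x, u_up, u_down, width) / u

    h = (x_end - x_start) / n_steps
    x, e = x_start, e_start
    for _ in range(n_steps):
        e = rk4_step(rhs, x, e, h)
        x += h
    return e


if __name__ == "__main__":
    # Illustrative numbers only: 400 km/s upstream speed, Mach 3 shock.
    print(integrate_toy_energy(400.0, 3.0, 1.0, -10.0, 10.0, 2000))
```

    For the toy equation the downstream value should approach the square root of the compression ratio times the upstream value, which gives a simple consistency check on the step size.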

  7. Analysis of the interaction of two parallel surface cracks

    NASA Astrophysics Data System (ADS)

    Hahn, Jeeyeon

    The objective of this research is to analyze and predict the interaction of surface cracks that occur in parallel planes. In aging aircraft, multiple cracks may form at stress concentrations such as fastener holes and notched components through stress corrosion and fatigue cracking. The lifetime of the structures is significantly affected by the interaction between these cracks. Depending on relative positions and orientations of neighboring cracks, local stress fields and crack driving forces can be affected by the presence of adjacent cracks. Even small subcritical cracks may rapidly grow to a size that will cause failure in service due to interaction and coalescence with other cracks. The interaction behavior and crack propagation direction of two parallel surface cracks are studied using three-dimensional finite element analysis (FEA). FEA models with a wide range of crack configurations in a finite plate under tension are evaluated to investigate the correlation between the crack shapes and the separation distance between two cracks. The relative distance (vertical and horizontal) between two cracks and the size and shape of these cracks are varied to create different stress interaction fields. Stress intensity factors (SIF) along the crack fronts are obtained from FEA, and then the cracking behaviors of the cracks are predicted by considering the influence of the interaction on the SIF and the coalescence of two cracks. The results obtained are then compared with existing experimental and analytical data for validation. All of the data analyses are presented in tables and figures.

  8. Framework for Interactive Parallel Dataset Analysis on the Grid

    SciTech Connect

    Alexander, David A.; Ananthan, Balamurali; Johnson, Tony; Serbo, Victor; /SLAC

    2007-01-10

    We present a framework for use at a typical Grid site to facilitate custom interactive parallel dataset analysis targeting terabyte-scale datasets of the type typically produced by large multi-institutional science experiments. We summarize the needs for interactive analysis and show a prototype solution that satisfies those needs. The solution consists of a desktop client tool and a set of Web Services that allow scientists to sign onto a Grid site, compose analysis script code to carry out physics analysis on datasets, distribute the code and datasets to worker nodes, collect the results back to the client, and construct professional-quality visualizations of the results.

  9. Nanoparticle-target interactions parallel antibody-protein interactions.

    PubMed

    Koh, Isaac; Hong, Rui; Weissleder, Ralph; Josephson, Lee

    2009-05-01

    Magnetic particles can act as magnetic relaxation switches (MRSw's) when they bind to target analytes, and switch between their dispersed and aggregated states resulting in changes in the spin-spin relaxation time (T(2)) of their surrounding water protons. Both nanoparticles (NPs, 10-100 nm) and micrometer-sized particles (MPs) have been employed as MRSw's, to sense drugs, metabolites, oligonucleotides, proteins, bacteria, and mammalian cells. To better understand how NPs or MPs interact with targets, we employed as a molecular recognition system the reaction between the Tag peptide of the influenza virus hemagglutinin and a monoclonal antibody to that peptide (anti-Tag). To obtain targets of different size and valency, we attached the Tag peptide to BSA (M(w) = 65000 Daltons, diameter = 8 nm) and to Latex spheres (diameter = 900 nm). To obtain magnetic probes of very different sizes, anti-Tag was conjugated to 40 nm NPs and 1 µm MPs. MP and NP probes reacted with Tag peptide targets in a manner similar to antibody/antigen reactions in solution, exhibiting so-called Prozone effects. MPs detected all types of targets with higher sensitivity than NPs, with targets of higher valency being better detected than those of lower valency. The Tag/anti-Tag recognition system can be used to synthesize combinations of molecular targets and magnetic probes, to more fully understand the aggregation reaction that occurs when probes bind targets in solution and the ensuing changes in water relaxation times that result.

  10. Parallel algorithms for interactive manipulation of digital terrain models

    NASA Technical Reports Server (NTRS)

    Davis, E. W.; Mcallister, D. F.; Nagaraj, V.

    1988-01-01

    Interactive three-dimensional graphics applications, such as terrain data representation and manipulation, require extensive arithmetic processing. Massively parallel machines are attractive for this application since they offer high computational rates, and grid connected architectures provide a natural mapping for grid based terrain models. Presented here are algorithms for data movement on the massive parallel processor (MPP) in support of pan and zoom functions over large data grids. It is an extension of earlier work that demonstrated real-time performance of graphics functions on grids that were equal in size to the physical dimensions of the MPP. When the dimensions of a data grid exceed the processing array size, data is packed in the array memory. Windows of the total data grid are interactively selected for processing. Movement of packed data is needed to distribute items across the array for efficient parallel processing. Execution time for data movement was found to exceed that for arithmetic aspects of graphics functions. Performance figures are given for routines written in MPP Pascal.

  11. Long-range interactions and parallel scalability in molecular simulations

    NASA Astrophysics Data System (ADS)

    Patra, Michael; Hyvönen, Marja T.; Falck, Emma; Sabouri-Ghomi, Mohsen; Vattulainen, Ilpo; Karttunen, Mikko

    2007-01-01

    Typical biomolecular systems such as cellular membranes, DNA, and protein complexes are highly charged. Thus, efficient and accurate treatment of electrostatic interactions is of great importance in computational modeling of such systems. We have employed the GROMACS simulation package to perform extensive benchmarking of different commonly used electrostatic schemes on a range of computer architectures (Pentium-4, IBM Power 4, and Apple/IBM G5) for single processor and parallel performance up to 8 nodes—we have also tested the scalability on four different networks, namely Infiniband, GigaBit Ethernet, Fast Ethernet, and nearly uniform memory architecture, i.e. communication between CPUs is possible by directly reading from or writing to other CPUs' local memory. It turns out that the particle-mesh Ewald method (PME) performs surprisingly well and offers competitive performance unless parallel runs on PC hardware with older network infrastructure are needed. Lipid bilayers of sizes 128, 512 and 2048 lipid molecules were used as the test systems representing typical cases encountered in biomolecular simulations. Our results enable an accurate prediction of computational speed on most current computing systems, both for serial and parallel runs. These results should be helpful in, for example, choosing the most suitable configuration for a small departmental computer cluster.

  12. Parallel Force Assay for Protein-Protein Interactions

    PubMed Central

    Aschenbrenner, Daniela; Pippig, Diana A.; Klamecka, Kamila; Limmer, Katja; Leonhardt, Heinrich; Gaub, Hermann E.

    2014-01-01

    Quantitative proteome research is greatly promoted by high-resolution parallel format assays. A characterization of protein complexes based on binding forces offers an unparalleled dynamic range and allows for the effective discrimination of non-specific interactions. Here we present a DNA-based Molecular Force Assay to quantify protein-protein interactions, namely the bond between different variants of GFP and GFP-binding nanobodies. We present different strategies to adjust the maximum sensitivity window of the assay by influencing the binding strength of the DNA reference duplexes. The binding of the nanobody Enhancer to the different GFP constructs is compared at high sensitivity of the assay. Whereas the binding strength to wild type and enhanced GFP are equal within experimental error, stronger binding to superfolder GFP is observed. This difference in binding strength is attributed to alterations in the amino acids that form contacts according to the crystal structure of the initial wild type GFP-Enhancer complex. Moreover, we outline the potential for large-scale parallelization of the assay. PMID:25546146

  13. Highly parallel characterization of IgG Fc binding interactions

    PubMed Central

    Boesch, Austin W; Brown, Eric P; Cheng, Hao D; Ofori, Maame Ofua; Normandin, Erica; Nigrovic, Peter A; Alter, Galit; Ackerman, Margaret E

    2014-01-01

    Because the variable ability of the antibody constant (Fc) domain to recruit innate immune effector cells and complement is a major factor in antibody activity in vivo, convenient means of assessing these binding interactions is of high relevance to the development of enhanced antibody therapeutics, and to understanding the protective or pathogenic antibody response to infection, vaccination, and self. Here, we describe a highly parallel microsphere assay to rapidly assess the ability of antibodies to bind to a suite of antibody receptors. Fc and glycan binding proteins such as FcγR and lectins were conjugated to coded microspheres and the ability of antibodies to interact with these receptors was quantified. We demonstrate qualitative and quantitative assessment of binding preferences and affinities across IgG subclasses, Fc domain point mutants, and antibodies with variant glycosylation. This method can serve as a rapid proxy for biophysical methods that require substantial sample quantities, high-end instrumentation, and serial analysis across multiple binding interactions, thereby offering a useful means to characterize monoclonal antibodies, clinical antibody samples, and antibody mimics, or alternatively, to investigate the binding preferences of candidate Fc receptors. PMID:24927273

  14. Simulation of Parallel Interacting Faults and Earthquake Predictability

    NASA Astrophysics Data System (ADS)

    Mora, P.; Weatherley, D.; Klein, B.

    2003-04-01

    Numerical shear experiments of a granular region using the lattice solid model often exhibit accelerating energy release in the lead-up to large events (Mora et al, 2000) and a growth in correlation lengths in the stress field (Mora and Place, 2002). While these results provide evidence for a Critical Point-like mechanism in elasto-dynamic systems and the possibility of earthquake forecasting, they do not prove such a mechanism occurs in the crust. Cellular automaton (CA) model simulations exhibit either accelerating energy release prior to large events or unpredictable behaviour in which large events may occur at any time, depending on tuning parameters such as dissipation ratio and stress transfer ratio (Weatherley and Mora, 2003). The mean stress plots from the particle simulations are most similar to the CA mean stress plots near the boundary of the predictable and unpredictable regimes, suggesting that elasto-dynamic systems may be close to the borderline of predictable and unpredictable. To progress in resolving the question of whether more realistic fault system models exhibit predictable behaviour and to determine whether they also have an unpredictable and predictable regime depending on tuning parameters like that seen in CA simulations, we developed a 2D elasto-dynamic model of parallel interacting faults. The friction is slip weakening until a critical slip distance. Thereafter, the friction is at the dynamic value until the slip rate drops below the value it attained when the critical slip distance was exceeded. As the slip rate continues to drop, the friction increases back to the static value as a function of slip rate. Numerical shear experiments are conducted in a model with 41 parallel interacting faults. Calculations of the inverse metric defined in Klein et al (2000) indicate that the system is non-ergodic. Furthermore, by calculating the correlation between the stress fields at different times we determine that the system exhibits so called ``glassy
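
    The three-stage friction law described in this abstract can be sketched as a single function. The abstract does not give functional forms, so the linear slip weakening, the linear recovery in slip rate, the parameter values and the name `friction_coefficient` are all assumptions chosen for illustration.

```python
def friction_coefficient(slip, slip_rate, rate_at_dc, mu_s=0.6, mu_d=0.4, d_c=1.0):
    """Friction coefficient for one fault point (illustrative sketch).

    slip        -- accumulated slip in the current event
    slip_rate   -- current slip rate
    rate_at_dc  -- slip rate recorded when slip first exceeded d_c
    mu_s, mu_d  -- static and dynamic friction (assumed values)
    d_c         -- critical slip distance (assumed value)
    """
    # Stage 1: slip weakening up to the critical slip distance (assumed linear).
    if slip < d_c:
        return mu_s - (mu_s - mu_d) * slip / d_c
    # Stage 2: dynamic friction while the slip rate stays at or above the
    # value it had when the critical slip distance was exceeded.
    if rate_at_dc <= 0.0 or slip_rate >= rate_at_dc:
        return mu_d
    # Stage 3: as the slip rate drops further, friction heals back toward
    # the static value (assumed linear in slip rate).
    fraction = max(slip_rate, 0.0) / rate_at_dc
    return mu_s - (mu_s - mu_d) * fraction


if __name__ == "__main__":
    for slip, rate in [(0.0, 0.0), (0.5, 1.0), (2.0, 3.0), (2.0, 1.0), (2.0, 0.0)]:
        print(slip, rate, friction_coefficient(slip, rate, rate_at_dc=2.0))
```

    The function is continuous at both stage boundaries: it equals the dynamic value at slip = d_c and at slip_rate = rate_at_dc, and returns to the static value when the slip rate reaches zero.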

  15. Social interaction shapes babbling: Testing parallels between birdsong and speech

    PubMed Central

    Goldstein, Michael H.; King, Andrew P.; West, Meredith J.

    2003-01-01

    Birdsong is considered a model of human speech development at behavioral and neural levels. Few direct tests of the proposed analogs exist, however. Here we test a mechanism of phonological development in human infants that is based on social shaping, a selective learning process first documented in songbirds. By manipulating mothers' reactions to their 8-month-old infants' vocalizations, we demonstrate that phonological features of babbling are sensitive to nonimitative social stimulation. Contingent, but not noncontingent, maternal behavior facilitates more complex and mature vocal behavior. Changes in vocalizations persist after the manipulation. The data show that human infants use social feedback, facilitating immediate transitions in vocal behavior. Social interaction creates rapid shifts to developmentally more advanced sounds. These transitions mirror the normal development of speech, supporting the predictions of the avian social shaping model. These data provide strong support for a parallel in function between vocal precursors of songbirds and infants. Because imitation is usually considered the mechanism for vocal learning in both taxa, the findings introduce social shaping as a general process underlying the development of speech and song. PMID:12808137

  16. Parasites and biological invasions: parallels, interactions, and control.

    PubMed

    Dunn, Alison M; Hatcher, Melanie J

    2015-05-01

    Species distributions are changing at an unprecedented rate owing to human activity. We examine how two key processes of redistribution - biological invasion and disease emergence - are interlinked. There are many parallels between invasion and emergence processes, and invasions can drive the spread of new diseases to wildlife. We examine the potential impacts of invasion and disease emergence, and discuss how these threats can be countered, focusing on biosecurity. In contrast with international policy on emerging diseases of humans and managed species, policy on invasive species and parasites of wildlife is fragmented, and the lack of international cooperation encourages individual parties to minimize their input into control. We call for international policy that acknowledges the strong links between emerging diseases and invasion risk.

  17. Parallel quantification of lectin-glycan interaction using ultrafiltration.

    PubMed

    Takeda, Yoichi; Seko, Akira; Sakono, Masafumi; Hachisu, Masakazu; Koizumi, Akihiko; Fujikawa, Kohki; Ito, Yukishige

    2013-06-28

    Using an ultrafiltration membrane, a simple method for screening protein-ligand interactions was developed. The procedure comprises three steps: mixing ligand with protein, ultrafiltration of the solution, and quantification of unbound ligands by HPLC. By conducting the analysis at variable protein concentrations, affinity constants were easily obtained. Multiple ligands can be analyzed simultaneously as a mixture when the concentration of ligands is controlled. The feasibility of this method for lectin-glycan interaction analysis was examined using fluorescently labeled high-mannose-type glycans and recombinant intracellular lectins or endo-α-mannosidase mutants. Estimated Ka values of malectin and VIP36 were in good agreement with those evaluated by conventional methods such as isothermal titration calorimetry (ITC) or frontal affinity chromatography (FAC). Finally, several mutants of endo-α-mannosidase were produced and their affinities to monoglucosylated glycans were evaluated.
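
    As a rough illustration of how an affinity constant could be extracted from unbound-ligand measurements at several protein concentrations, the sketch below assumes simple 1:1 binding with protein in large excess over ligand, so that the unbound fraction is f = 1 / (1 + Ka [P]). That model, the function name `estimate_ka`, and the numbers are assumptions for illustration; they are not taken from the paper.

```python
def estimate_ka(protein_conc, unbound_fraction):
    """Estimate Ka (1/M) by least squares through the origin.

    Assumes 1:1 binding with protein in excess, so that
    1/f - 1 = Ka * [P], where f is the unbound ligand fraction.
    """
    ys = [1.0 / f - 1.0 for f in unbound_fraction]
    num = sum(p * y for p, y in zip(protein_conc, ys))
    den = sum(p * p for p in protein_conc)
    return num / den


# Synthetic (made-up) data generated from a known Ka, to check the fit.
true_ka = 2.0e5                                # 1/M, hypothetical value
protein = [1e-6, 2e-6, 5e-6, 1e-5]             # M, hypothetical concentrations
unbound = [1.0 / (1.0 + true_ka * p) for p in protein]
ka_hat = estimate_ka(protein, unbound)
```

    With real HPLC data the unbound fractions would carry noise, and a fit that does not assume protein excess may be more appropriate.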

  18. Bayesian seismic tomography by parallel interacting Markov chains

    NASA Astrophysics Data System (ADS)

    Gesret, Alexandrine; Bottero, Alexis; Romary, Thomas; Noble, Mark; Desassis, Nicolas

    2014-05-01

    The velocity field estimated by first arrival traveltime tomography is commonly used as a starting point for further seismological, mineralogical, tectonic or similar analysis. In order to interpret the results quantitatively, the tomography uncertainty values as well as their spatial distribution are required. The estimated velocity model is obtained through inverse modeling by minimizing an objective function that compares observed and computed traveltimes. This step is often performed by gradient-based optimization algorithms. The major drawback of such local optimization schemes, beyond the possibility of being trapped in a local minimum, is that they do not account for the multiple possible solutions of the inverse problem. They are therefore unable to assess the uncertainties linked to the solution. Within a Bayesian (probabilistic) framework, solving the tomography inverse problem aims at estimating the posterior probability density function of the velocity model using a global sampling algorithm. Markov chain Monte Carlo (MCMC) methods are known to produce samples of virtually any distribution. In such a Bayesian inversion, the total number of simulations we can afford is highly related to the computational cost of the forward model. Although fast algorithms have recently been developed for computing first arrival traveltimes of seismic waves, a complete exploration of the posterior distribution of the velocity model is rarely achieved, especially when it is high dimensional and/or multimodal. In the latter case, the chain may even stay stuck in one of the modes. In order to improve the mixing properties of classical single MCMC, we propose to make several Markov chains at different temperatures interact. This method can make efficient use of large CPU clusters, without increasing the global computational cost with respect to classical MCMC, and is therefore particularly suited for Bayesian inversion. The exchanges between the chains allow a precise sampling of the
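
    The abstract does not give the authors' exchange scheme, so the following is only a generic replica-exchange (parallel tempering) sketch, one common way of letting chains at different temperatures interact. The toy bimodal 1D target stands in for a multimodal posterior; `log_target`, `parallel_tempering`, the temperature ladder and the step sizes are all illustrative assumptions.

```python
import math
import random


def log_target(x, sigma=0.5, mode=4.0):
    """Log of an unnormalized bimodal density with peaks at +mode and -mode."""
    a = -((x - mode) ** 2) / (2 * sigma ** 2)
    b = -((x + mode) ** 2) / (2 * sigma ** 2)
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(rng, log_ratio):
    """Metropolis acceptance test for a given log acceptance ratio."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def parallel_tempering(n_sweeps, temps, seed=0):
    """Return the samples of the coldest chain (temps[0], assumed to be 1)."""
    rng = random.Random(seed)
    states = [4.0 for _ in temps]          # all chains start in the same mode
    logps = [log_target(x) for x in states]
    cold_samples = []
    for _ in range(n_sweeps):
        # Random-walk Metropolis step in each chain, targeting p(x)^(1/T).
        for i, temp in enumerate(temps):
            prop = states[i] + rng.gauss(0.0, 0.5 * math.sqrt(temp))
            lp = log_target(prop)
            if accept(rng, (lp - logps[i]) / temp):
                states[i], logps[i] = prop, lp
        # Propose exchanging the states of one random pair of adjacent chains.
        i = rng.randrange(len(temps) - 1)
        j = i + 1
        log_ratio = (1.0 / temps[i] - 1.0 / temps[j]) * (logps[j] - logps[i])
        if accept(rng, log_ratio):
            states[i], states[j] = states[j], states[i]
            logps[i], logps[j] = logps[j], logps[i]
        cold_samples.append(states[0])
    return cold_samples


samples = parallel_tempering(20000, temps=[1.0, 4.0, 16.0, 64.0])
```

    A single chain at temperature 1 started at x = 4 would be very unlikely to cross to the mode at x = -4 in this toy problem; the exchanges let the cold chain inherit states that the hot chains found on the other side. In a cluster setting each chain's within-temperature step could run on a separate processor.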

  19. A fast, scalable method for the parallel evaluation of distance-limited pairwise particle interactions.

    PubMed

    Shaw, David E

    2005-10-01

    Classical molecular dynamics simulations of biological macromolecules in explicitly modeled solvent typically require the evaluation of interactions between all pairs of atoms separated by no more than some distance R, with more distant interactions handled using some less expensive method. Performing such simulations for periods on the order of a millisecond is likely to require the use of massive parallelism. The extent to which such simulations can be efficiently parallelized, however, has historically been limited by the time required for interprocessor communication. This article introduces a new method for the parallel evaluation of distance-limited pairwise particle interactions that significantly reduces the amount of data transferred between processors by comparison with traditional methods. Specifically, the amount of data transferred into and out of a given processor scales as $O(R^{3/2} p^{-1/2})$, where p is the number of processors, and with constant factors that should yield a substantial performance advantage in practice.
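
    As a rough reading of this scaling, ignoring the constant factors (which the abstract does not give), write the per-processor transfer volume as V(R, p), proportional to R^{3/2} p^{-1/2}. The symbol V is introduced here only for illustration.

```latex
\frac{V(R, 4p)}{V(R, p)} = 4^{-1/2} = \frac{1}{2},
\qquad
\frac{V(2R, p)}{V(R, p)} = 2^{3/2} \approx 2.83
```

    Under this asymptotic form, quadrupling the processor count halves the data moved per processor, while doubling the cutoff distance R increases it by a factor of about 2.8.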

  20. A parallel graded-mesh FDTD algorithm for human-antenna interaction problems.

    PubMed

    Catarinucci, Luca; Tarricone, Luciano

    2009-01-01

    The finite difference time domain method (FDTD) is frequently used for the numerical solution of a wide variety of electromagnetic (EM) problems and, among them, those concerning human exposure to EM fields. In many practical cases related to the assessment of occupational EM exposure, large simulation domains are modeled and high space resolution adopted, so that strong memory and central processing unit power requirements have to be satisfied. To better afford the computational effort, the use of parallel computing is a winning approach; alternatively, subgridding techniques are often implemented. However, the simultaneous use of subgridding schemes and parallel algorithms is very new. In this paper, an easy-to-implement and highly-efficient parallel graded-mesh (GM) FDTD scheme is proposed and applied to human-antenna interaction problems, demonstrating its appropriateness in dealing with complex occupational tasks and showing its capability to guarantee the advantages of a traditional subgridding technique without affecting the parallel FDTD performance.

  1. Hippocampal-prefrontal dynamics in spatial working memory: interactions and independent parallel processing.

    PubMed

    Churchwell, John C; Kesner, Raymond P

    2011-12-01

    Memory processes may be independent, compete, operate in parallel, or interact. In accordance with this view, behavioral studies suggest that the hippocampus (HPC) and prefrontal cortex (PFC) may act as an integrated circuit during performance of tasks that require working memory over longer delays, whereas during short delays the HPC and PFC may operate in parallel or have completely dissociable functions. In the present investigation we tested rats in a spatial delayed non-match to sample working memory task using short and long time delays to evaluate the hypothesis that intermediate CA1 region of the HPC (iCA1) and medial PFC (mPFC) interact and operate in parallel under different temporal working memory constraints. In order to assess the functional role of these structures, we used an inactivation strategy in which each subject received bilateral chronic cannula implantation of the iCA1 and mPFC, allowing us to perform bilateral, contralateral, ipsilateral, and combined bilateral inactivation of structures and structure pairs within each subject. This novel approach allowed us to test for circuit-level systems interactions, as well as independent parallel processing, while we simultaneously parametrically manipulated the temporal dimension of the task. The current results suggest that, at longer delays, iCA1 and mPFC interact to coordinate retrospective and prospective memory processes in anticipation of obtaining a remote goal, whereas at short delays either structure may independently represent spatial information sufficient to successfully complete the task.

  2. Interactive Parallel Data Analysis within Data-Centric Cluster Facilities using the IPython Notebook

    NASA Astrophysics Data System (ADS)

    Pascoe, S.; Lansdowne, J.; Iwi, A.; Stephens, A.; Kershaw, P.

    2012-12-01

    The data deluge is making traditional analysis workflows for many researchers obsolete. Support for parallelism within popular tools such as MATLAB, IDL and NCO is not well developed and rarely used. However, parallelism is necessary for processing modern data volumes on a timescale conducive to curiosity-driven analysis. Furthermore, for peta-scale datasets such as the CMIP5 archive, it is no longer practical to bring an entire dataset to a researcher's workstation for analysis, or even to their institutional cluster. Therefore, there is an increasing need to develop new analysis platforms that both enable processing at the point of data storage and provide parallelism. Such an environment should, where possible, maintain the convenience and familiarity of our current analysis environments to encourage curiosity-driven research. We describe how we are combining the interactive Python shell (IPython) with our JASMIN data-cluster infrastructure. IPython has been specifically designed to bridge the gap between the HPC-style parallel workflows and the opportunistic curiosity-driven analysis usually carried out using domain specific languages and scriptable tools. IPython offers a web-based interactive environment, the IPython notebook, and a cluster engine for parallelism, all underpinned by the well-respected Python/Scipy scientific programming stack. JASMIN is designed to support the data analysis requirements of the UK and European climate and earth system modeling community. JASMIN, with its sister facility CEMS focusing on the earth observation community, has 4.5 PB of fast parallel disk storage alongside over 370 computing cores to provide local computation. Through the IPython interface to JASMIN, users can make efficient use of JASMIN's multi-core virtual machines to perform interactive analysis on all cores simultaneously or can configure IPython clusters across multiple VMs. Larger-scale clusters can be provisioned through JASMIN's batch scheduling system

  3. A Theory of Interactive Parallel Processing: New Capacity Measures and Predictions for a Response Time Inequality Series

    ERIC Educational Resources Information Center

    Townsend, James T.; Wenger, Michael J.

    2004-01-01

    The authors present a theory of stochastic interactive parallel processing with special emphasis on channel interactions and their relation to system capacity. The approach is based both on linear systems theory augmented with stochastic elements and decisional operators and on a metatheory of parallel channels' dependencies that incorporates…

  4. Collisionless Interaction of a Magnetized Ambient Plasma and a Field-Parallel Laser Produced Plasma

    NASA Astrophysics Data System (ADS)

    Heuer, P. V.; Bondarenko, A. S.; Schaeffer, D. B.; Constantin, C. G.; Vincena, S.; Tripathi, S.; Gekelman, W.; Weidl, M.; Winske, D.; Niemann, C.

    2016-10-01

    We present measurements of the collisionless coupling between an exploding laser-produced plasma (LPP) and a large, magnetized ambient plasma. The LPP was created by focusing the Raptor laser (400 J, 40 ns) on a planar plastic target embedded in the ambient Large Plasma Device (LAPD) plasma at the University of California, Los Angeles. The resulting ablated material moved parallel to the background magnetic field, interacting with the ambient plasma along the full 17 m length of the LAPD. The amplitude and polarization of waves driven by the interaction were measured by an array of 3-axis magnetic flux probes. Emissive Doppler spectroscopy and a high temporal resolution monochromator were used to observe the velocity and charge state distributions of both ambient and debris ions. Measurements are compared to hybrid simulations of quasi-parallel shocks.

  5. Modeling of fatigue crack induced nonlinear ultrasonics using a highly parallelized explicit local interaction simulation approach

    NASA Astrophysics Data System (ADS)

    Shen, Yanfeng; Cesnik, Carlos E. S.

    2016-04-01

    This paper presents a parallelized modeling technique for the efficient simulation of nonlinear ultrasonics introduced by the wave interaction with fatigue cracks. The elastodynamic wave equations with contact effects are formulated using an explicit Local Interaction Simulation Approach (LISA). The LISA formulation is extended to capture the contact-impact phenomena during the wave damage interaction based on the penalty method. A Coulomb friction model is integrated into the computation procedure to capture the stick-slip contact shear motion. The LISA procedure is coded using the Compute Unified Device Architecture (CUDA), which enables highly parallelized supercomputing on powerful graphics cards. Both the explicit contact formulation and the parallel feature facilitate LISA's superb computational efficiency over the conventional finite element method (FEM). The theoretical formulations based on the penalty method are introduced and a guideline for the proper choice of the contact stiffness is given. The convergence behavior of the solution under various contact stiffness values is examined. A numerical benchmark problem is used to investigate the new LISA formulation and results are compared with a conventional contact finite element solution. Various nonlinear ultrasonic phenomena are successfully captured using this contact LISA formulation, including the generation of nonlinear higher harmonic responses. Nonlinear mode conversion of guided waves at fatigue cracks is also studied.

  6. String interactions in a plane-fronted parallel-wave spacetime.

    PubMed

    Gopakumar, Rajesh

    2002-10-21

    We argue that string interactions in a plane-fronted parallel-wave spacetime are governed by an effective coupling $g_{\mathrm{eff}} = g_s (\mu p^+ \alpha') f(\mu p^+ \alpha')$ where $f(\mu p^+ \alpha')$ is proportional to the light-cone energy of the string states involved in the interaction. This simply follows from generalities of a matrix string description of this background. $g_{\mathrm{eff}}$ nicely interpolates between the expected result ($g_s$) for flat space (small $\mu p^+ \alpha'$) and a recently conjectured expression from the perturbative gauge theory side (large $\mu p^+ \alpha'$).

  7. Parallel implementation of three-dimensional molecular dynamic simulation for laser-cluster interaction

    SciTech Connect

    Holkundkar, Amol R.

    2013-11-15

    The objective of this article is to report the parallel implementation of the 3D molecular dynamic simulation code for laser-cluster interactions. The benchmarking of the code has been done by comparing the simulation results with some of the experiments reported in the literature. Scaling laws for the computational time are established by varying the number of processor cores and number of macroparticles used. The capabilities of the code are highlighted by implementing various diagnostic tools. To study the dynamics of the laser-cluster interactions, the executable version of the code is available from the author.

  8. Parallel Exploration of Interaction Space by BioID and Affinity Purification Coupled to Mass Spectrometry.

    PubMed

    Hesketh, Geoffrey G; Youn, Ji-Young; Samavarchi-Tehrani, Payman; Raught, Brian; Gingras, Anne-Claude

    2017-01-01

    Complete understanding of cellular function requires knowledge of the composition and dynamics of protein interaction networks, the importance of which spans all molecular cell biology fields. Mass spectrometry-based proteomics approaches are instrumental in this process, with affinity purification coupled to mass spectrometry (AP-MS) now widely used for defining interaction landscapes. Traditional AP-MS methods are well suited to providing information regarding the temporal aspects of soluble protein-protein interactions, but the requirement to maintain protein-protein interactions during cell lysis and AP means that both weak-affinity interactions and spatial information are lost. A more recently developed method called BioID employs the expression of bait proteins fused to a nonspecific biotin ligase, BirA*, that induces in vivo biotinylation of proximal proteins. Coupling this method to biotin affinity enrichment and mass spectrometry negates many of the solubility and interaction strength issues inherent in traditional AP-MS methods, and provides unparalleled spatial context for protein interactions. Here we describe the parallel implementation of both BioID and FLAG AP-MS allowing simultaneous exploration of both spatial and temporal aspects of protein interaction networks.

  9. Influence of supramolecular structures in crystals on parallel stacking interactions between pyridine molecules.

    PubMed

    Janjić, Goran V; Ninković, Dragan B; Zarić, Snezana D

    2013-08-01

    Parallel stacking interactions between pyridines in crystal structures and the influence of hydrogen bonding and supramolecular structures in crystals on the geometries of interactions were studied by analyzing data from the Cambridge Structural Database (CSD). In the CSD 66 contacts of pyridines have a parallel orientation of molecules and most of these pyridines simultaneously form hydrogen bonds (44 contacts). The geometries of stacked pyridines observed in crystal structures were compared with the geometries obtained by calculations and explained by supramolecular structures in crystals. The results show that the mean perpendicular distance (R) between pyridine rings with (3.48 Å) and without hydrogen bonds (3.62 Å) is larger than that calculated, because of the influence of supramolecular structures in crystals. The pyridines with hydrogen bonds show a pronounced preference for offsets of 1.25-1.75 Å, close to the position of the calculated minimum (1.80 Å). However, stacking interactions of pyridines without hydrogen bonds do not adopt values at or close to that of the calculated offset. This is because stacking interactions of pyridines without hydrogen bonds are less strong, and they are more susceptible to the influence of supramolecular structures in crystals. These results show that hydrogen bonding and supramolecular structures have an important influence on the geometries of stacked pyridines in crystals.

  10. Formation of electron kappa distributions due to interactions with parallel propagating whistler waves

    SciTech Connect

    Tao, X.; Lu, Q.

    2014-02-15

    In space plasmas, charged particles are frequently observed to possess a high-energy tail, which is often modeled by a kappa-type distribution function. In this work, the formation of the electron kappa distribution in generation of parallel propagating whistler waves is investigated using fully nonlinear particle-in-cell (PIC) simulations. Previous research concluded that the bi-Maxwellian character of electron distributions is preserved in PIC simulations. We now demonstrate that for interactions between electrons and parallel propagating whistler waves, a non-Maxwellian high-energy tail can be formed, and a kappa distribution can be used to fit the electron distribution in the time-asymptotic limit. The κ-parameter is found to decrease with increasing initial temperature anisotropy or decreasing ratio of electron plasma frequency to cyclotron frequency. The results might be helpful for understanding the origin of electron kappa distributions observed in space plasmas.

  11. Orbital-based insights into parallel-displaced and twisted conformations in π-π interactions.

    PubMed

    Lutz, Patricia B; Bayse, Craig A

    2013-06-21

    Dispersion and electrostatics are known to stabilize π-π interactions, but the preference for parallel-displaced (PD) and/or twisted (TW) over sandwiched (S) conformations is not well understood. Orbital interactions are generally believed to play little to no role in π-stacking. However, orbital analysis of the dimers of benzene, pyridine, cytosine and several polyaromatic hydrocarbons demonstrates that PD and/or TW structures convert one or more π-type dimer MOs with out-of-phase or antibonding inter-ring character at the S stack to in-phase or bonding in the PD/TW stack. This change in dimer MO character can be described in terms of a qualitative stack bond order (SBO) defined as the difference between the number of occupied in-phase/bonding and out-of-phase/antibonding inter-ring π-type MOs. The concept of an SBO is introduced here in analogy to the bond order in molecular orbital theory. Thus, whereas the SBO of the S structure is zero, parallel displacement or twisting of the stack results in a non-zero SBO and overall bonding character. The shift in bonding/antibonding character found at optimal PD/TW structures maximizes the inter-ring density, as measured by intermolecular Wiberg bond indices (WBIs). Values of WBIs calculated as a function of the parallel displacement are found to correlate with the dispersion and other contributions to the π-π interaction energy determined by the highly accurate density-fitting DFT symmetry adapted perturbation theory (DF-DFT-SAPT) method. These DF-DFT-SAPT calculations also suggest that the dispersion and other contributions are maximized at the PD conformation rather than the S when conducted on a potential energy curve where the inter-ring distance is optimized at fixed slip distances. From the results of this study, we conclude that descriptions of the qualitative manner in which orbitals interact within π-stacking interactions can supplement high-level calculations of the interaction energy and provide an

  12. Precision control of charge coherence in parallel double dot systems through spin-orbit interaction

    NASA Astrophysics Data System (ADS)

    Jin, Jinshuang; Tu, Matisse Wei-Yuan; Wang, Nien-En; Zhang, Wei-Min

    2013-08-01

    In terms of the exact quantum master equation solution for open electronic systems, the coherent dynamics of two charge states described by two parallel quantum dots with one fully polarized electron on either dot is investigated in the presence of spin-orbit interaction. We demonstrate that the double dot system can stay in a dynamically decoherence free space. The coherence between two double dot charge states can be precisely manipulated through a spin-orbit coupling. The effects of the temperature, the finite bandwidth of lead, and the energy deviations during the coherence manipulation are also explored.

  13. Dynamical interaction effects on an electric dipole moving parallel to a flat solid surface

    SciTech Connect

    Villo-Perez, Isidro; Abril, Isabel; Garcia-Molina, Rafael; Arista, Nestor R.

    2005-05-15

    The interaction experienced by a fast electric dipole moving parallel and close to a flat solid surface is studied using the dielectric formalism. Analytical expressions for the force acting on the dipole, for random and for particular orientations, are obtained. Several features related to the dynamical effects on the induced forces are discussed, and numerical values are obtained for the different cases. The calculated energy loss of the electric dipole provides useful estimations which could be of interest for small-angle scattering experiments using polar molecules.

  14. Interaction of Aspergillus fumigatus conidia with Acanthamoeba castellanii parallels macrophage-fungus interactions.

    PubMed

    Van Waeyenberghe, Lieven; Baré, Julie; Pasmans, Frank; Claeys, Myriam; Bert, Wim; Haesebrouck, Freddy; Houf, Kurt; Martel, An

    2013-12-01

    Aspergillus fumigatus and free-living amoebae are common inhabitants of soil. Mechanisms of A. fumigatus to circumvent the amoeba's digestion may facilitate overcoming the vertebrate macrophage defence mechanisms. We performed co-culture experiments using A. fumigatus conidia and the amoeba Acanthamoeba castellanii. Approximately 25% of the amoebae ingested A. fumigatus conidia after 1 h of contact. During intra-amoebal passage, part of the ingested conidia was able to escape the food vacuole and to germinate inside the cytoplasm of A. castellanii. Fungal release into the extra-protozoan environment by exocytosis of conidia or by germination was observed with light and transmission electron microscopy. These processes resulted in structural changes in A. castellanii, leading to amoebal permeabilization without cell lysis. In conclusion, A. castellanii internalizes A. fumigatus conidia, resulting in fungal intracellular germination and subsequent amoebal death. As such, this interaction highly resembles that of A. fumigatus with mammalian and avian macrophages. This suggests that A. fumigatus virulence mechanisms to evade macrophage killing may be acquired by co-evolutionary interactions among A. fumigatus and environmental amoebae.

  15. Interaction of elastocapillary flows in parallel microchannels across a thin membrane

    NASA Astrophysics Data System (ADS)

    Reddy, S. P.; Samy, R. A.; Sen, A. K.

    2016-10-01

    We report the interaction of counter elastocapillary flows in parallel microchannels across a thin membrane. At the crossing point, the interaction between the capillary flows via the thin membrane leads to significant retardation of capillary flow. The drop in velocity at the crossing point and velocity variation after the crossing point are predicted using the analytical model and measured from experiments. A non-dimensional parameter J, which is the ratio of the capillary force to the mechanical restoring force, governs the drop in velocity at the crossing point with the maximum drop of about 60% for J = 1. The meniscus velocity after the crossing point decreases (J < 0.5), remains constant (0.5 < J < 0.6), or increases (J > 0.6) depending on the value of J. The proposed technique can be applied for the manipulation of capillary flows in microchannels.

  16. A Parallel Monolithic Approach for Fluid-Structure Interaction in a Cerebral Aneurysm

    NASA Astrophysics Data System (ADS)

    Sahin, Mehmet; Eken, Ali

    2014-11-01

    A parallel fully-coupled approach has been developed for the fluid-structure interaction problem in a cerebral artery with aneurysm. An Arbitrary Lagrangian-Eulerian formulation based on the side-centered unstructured finite volume method is employed for the governing incompressible Navier-Stokes equations and the classical Galerkin finite element formulation is used to discretize the constitutive law for the Saint Venant-Kirchhoff material in a Lagrangian frame for the solid domain. The time integration method for the structure domain is based on the energy conserving mid-point method while the second-order backward difference is used within the fluid domain. The resulting large-scale algebraic linear equations are solved using a one-level restricted additive Schwarz preconditioner with a block-incomplete factorization within each partitioned sub-domain. The parallel implementation of the present fully coupled unstructured fluid-structure solver is based on the PETSc library. The proposed numerical algorithm is initially validated for several classical benchmark problems and then applied to a more complicated problem involving unsteady pulsatile blood flow in a cerebral artery with aneurysm as a realistic fluid-structure interaction problem encountered in biomechanics. The authors acknowledge financial support from Turkish National Scientific and Technical Research Council through Project Number 112M107.

  17. Parallel changes of taxonomic interaction networks in lacustrine bacterial communities induced by a polymetallic perturbation

    PubMed Central

    Laplante, Karine; Sébastien, Boutin; Derome, Nicolas

    2013-01-01

    Heavy metals released by anthropogenic activities such as mining trigger profound changes to bacterial communities. In this study we used 16S SSU rRNA gene high-throughput sequencing to characterize the impact of a polymetallic perturbation and other environmental parameters on taxonomic networks within five lacustrine bacterial communities from sites located near Rouyn-Noranda, Quebec, Canada. The results showed that community equilibrium was disturbed in terms of both diversity and structure. Moreover, heavy metals, especially cadmium combined with water acidity, induced parallel changes among sites via the selection of resistant OTUs (Operational Taxonomic Units) and taxonomic dominance perturbations favoring the Alphaproteobacteria. Furthermore, under a similar selective pressure, covariation trends between phyla revealed conservation and parallelism within interphylum interactions. Our study sheds light on the importance of analyzing communities not only from a phylogenetic perspective but also including a quantitative approach to provide significant insights into the evolutionary forces that shape the dynamics of the taxonomic interaction networks in bacterial communities. PMID:23789031

  18. FaCSI: A block parallel preconditioner for fluid-structure interaction in hemodynamics

    NASA Astrophysics Data System (ADS)

    Deparis, Simone; Forti, Davide; Grandperrin, Gwenol; Quarteroni, Alfio

    2016-12-01

    Modeling Fluid-Structure Interaction (FSI) in the vascular system is mandatory to reliably compute mechanical indicators in vessels undergoing large deformations. In order to cope with the computational complexity of the coupled 3D FSI problem after discretizations in space and time, a parallel solution is often mandatory. In this paper we propose a new block parallel preconditioner for the coupled linearized FSI system obtained after space and time discretization. We name it FaCSI to indicate that it exploits the Factorized form of the linearized FSI matrix, the use of static Condensation to formally eliminate the interface degrees of freedom of the fluid equations, and the use of a SIMPLE preconditioner for saddle-point problems. FaCSI is built upon a block Gauss-Seidel factorization of the FSI Jacobian matrix and it uses ad-hoc preconditioners for each physical component of the coupled problem, namely the fluid, the structure and the geometry. In the fluid subproblem, after operating static condensation of the interface fluid variables, we use a SIMPLE preconditioner on the reduced fluid matrix. Moreover, to efficiently deal with a large number of processes, FaCSI exploits efficient single field preconditioners, e.g., based on domain decomposition or the multigrid method. We measure the parallel performances of FaCSI on a benchmark cylindrical geometry and on a problem of physiological interest, namely the blood flow through a patient-specific femoropopliteal bypass. We analyze the dependence of the number of linear solver iterations on the cores count (scalability of the preconditioner) and on the mesh size (optimality).

  19. Use of Hilbert Curves in Parallelized CUDA code: Interaction of Interstellar Atoms with the Heliosphere

    NASA Astrophysics Data System (ADS)

    Destefano, Anthony; Heerikhuisen, Jacob

    2015-04-01

    Fully 3D particle simulations can be a computationally and memory expensive task, especially when high resolution grid cells are required. The problem becomes further complicated when parallelization is needed. In this work we focus on computational methods to solve these difficulties. Hilbert curves are used to map the 3D particle space to the 1D contiguous memory space. This method of organization allows for minimized cache misses on the GPU as well as a sorted structure that is equivalent to an octal tree data structure. This type of sorted structure is attractive for uses in adaptive mesh implementations due to the logarithmic search time. Implementations using the Message Passing Interface (MPI) library and NVIDIA's parallel computing platform CUDA will be compared, as MPI is commonly used on server nodes with many CPUs. We will also compare static grid structures with those of adaptive mesh structures. The physical test bed will be simulating heavy interstellar atoms interacting with a background plasma, the heliosphere, simulated from fully consistent coupled MHD/kinetic particle code. It is known that charge exchange is an important factor in space plasmas; specifically, it modifies the structure of the heliosphere itself. We would like to thank the Alabama Supercomputer Authority for the use of their computational resources.
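
    To make the idea of mapping grid cells onto a 1D memory order concrete, here is a minimal sketch of the standard iterative Hilbert-index computation. It is written in 2D and in Python for brevity, whereas the abstract describes a 3D mapping in CUDA; the function name `hilbert_index` and the sample particle list are illustrative assumptions, not the authors' code.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along a 2D Hilbert curve, in the range 0 .. n*n - 1."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Sort hypothetical particles (given by integer cell coordinates) so that
# particles in neighboring cells tend to sit close together in memory.
particles = [(5, 2), (0, 0), (7, 7), (1, 0), (4, 3)]
particles.sort(key=lambda c: hilbert_index(8, c[0], c[1]))
```

    Consecutive indices along the curve always correspond to adjacent cells, which is the locality property that helps reduce cache misses when the sorted particles are processed in order.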

  20. Determination of interaction forces between parallel dislocations by the evaluation of J integrals of plane elasticity

    NASA Astrophysics Data System (ADS)

    Lubarda, Vlado A.

    2016-03-01

    The Peach-Koehler expressions for the glide and climb components of the force exerted on a straight dislocation in an infinite isotropic medium by another straight dislocation are derived by evaluating the plane and antiplane strain versions of J integrals around the center of the dislocation. After expressing the elastic fields as the sums of elastic fields of each dislocation, the energy momentum tensor is decomposed into three parts. It is shown that only one part, involving mixed products from the two dislocation fields, makes a nonvanishing contribution to J integrals and the corresponding dislocation forces. Three examples are considered, with dislocations on parallel or intersecting slip planes. For two edge dislocations on orthogonal slip planes, there are two equilibrium configurations in which the glide and climb components of the dislocation force simultaneously vanish. The interactions between two different types of screw dislocations and a nearby circular void, as well as between parallel line forces in an infinite or semi-infinite medium, are then evaluated.
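
    For orientation, the standard textbook forms of the two quantities named above are given below. The notation is an assumption and is not taken from the paper: stress tensor σ of the other dislocation, Burgers vector b, unit line direction ξ, strain energy density W, displacement u, and outward normal n on the contour C.

```latex
% Peach-Koehler force per unit dislocation length
\mathbf{F} = (\boldsymbol{\sigma}\cdot\mathbf{b}) \times \boldsymbol{\xi}

% J integral around the dislocation centre (energy momentum tensor form)
J_k = \oint_C \left( W\, n_k - \sigma_{ij}\, n_j\, u_{i,k} \right) \mathrm{d}s

% Edge dislocation with line along z and b = b\,\mathbf{e}_x:
F_x = b\,\sigma_{xy} \quad \text{(glide)}, \qquad F_y = -\,b\,\sigma_{xx} \quad \text{(climb)}
```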

  1. Discovery of protein interactions using parallel analysis of translated ORFs (PLATO).

    PubMed

    Larman, H Benjamin; Liang, Anthony C; Elledge, Stephen J; Zhu, Jian

    2014-01-01

    Parallel analysis of translated open reading frames (ORFs) (PLATO) can be used for the unbiased discovery of interactions between full-length proteins encoded by a library of 'prey' ORFs and surface-immobilized 'bait' antibodies, polypeptides or small-molecular-weight compounds. PLATO uses ribosome display (RD) to link ORF-derived mRNA molecules to the proteins they encode, and recovered mRNA from affinity enrichment is subjected to analysis using massively parallel DNA sequencing. Compared with alternative in vitro methods, PLATO provides several advantages including library size and cost. A unique advantage of PLATO is that an alternative reverse transcription-quantitative PCR (RT-qPCR) protocol can be used to test binding of specific, individual proteins. To illustrate a typical experimental workflow, we demonstrate PLATO for the identification of the immune target of serum antibodies from patients with inclusion body myositis (IBM). Beginning with an ORFeome library in an RD vector, the protocol can produce samples for deep sequencing or RT-qPCR within 4 d.

  2. Extensions of parallel coordinates for interactive exploration of large multi-timepoint data sets.

    PubMed

    Blaas, Jorik; Botha, Charl P; Post, Frits H

    2008-01-01

    Parallel coordinate plots (PCPs) are commonly used in information visualization to provide insight into multi-variate data. These plots help to spot correlations between variables. PCPs have been successfully applied to unstructured datasets of up to a few million points. In this paper, we present techniques to enhance the usability of PCPs for the exploration of large, multi-timepoint volumetric data sets, containing tens of millions of points per timestep. The main difficulties that arise when applying PCPs to large numbers of data points are visual clutter and slow performance, making interactive exploration infeasible. Moreover, the spatial context of the volumetric data is usually lost. We describe techniques for preprocessing using data quantization and compression, and for fast GPU-based rendering of PCPs using joint density distributions for each pair of consecutive variables, resulting in a smooth, continuous visualization. Also, fast brushing techniques are proposed for interactive data selection in multiple linked views, including a 3D spatial volume view. These techniques have been successfully applied to three large data sets: Hurricane Isabel (Vis'04 contest), the ionization front instability data set (Vis'08 design contest), and data from a large-eddy simulation of cumulus clouds. With these data, we show how PCPs can be extended to successfully visualize and interactively explore multi-timepoint volumetric datasets with an order of magnitude more data points.
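
    The preprocessing idea of quantizing each variable and accumulating a joint density for each pair of consecutive axes can be sketched as follows. This is a plain CPU illustration under assumed names (`quantize`, `joint_densities`) with uniform binning; the paper's actual quantization, compression, and GPU rendering are not reproduced here.

```python
def quantize(values, bins):
    """Map each value to an integer bin in [0, bins - 1] using uniform bins."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [min(int((v - lo) / span * bins), bins - 1) for v in values]


def joint_densities(columns, bins):
    """columns: one equal-length list of values per PCP axis.
    Returns one bins-by-bins count table per pair of consecutive axes."""
    q = [quantize(c, bins) for c in columns]
    tables = []
    for a, b in zip(q, q[1:]):
        table = [[0] * bins for _ in range(bins)]
        for i, j in zip(a, b):
            table[i][j] += 1  # one data point crossing from bin i to bin j
        tables.append(table)
    return tables
```

    Each table has a fixed size that is independent of the number of data points, so the cost of drawing one axis pair no longer grows with the size of the data set.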

  3. Modelling packing interactions in parallel helix bundles: pentameric bundles of nicotinic receptor M2 helices.

    PubMed

    Sankararamakrishnan, R; Sansom, M S

    1995-11-01

    The transbilayer pore of the nicotinic acetylcholine receptor (nAChR) is formed by a pentameric bundle of M2 helices. Models of pentameric bundles of M2 helices have been generated using simulated annealing via restrained molecular dynamics. The influence of: (a) the initial C alpha template; and (b) screening of sidechain electrostatic interactions on the geometry of the resultant M2 helix bundles is explored. Parallel M2 helices, in the absence of sidechain electrostatic interactions, pack in accordance with simple ridges-in-grooves considerations. This results in a helix crossing angle of ca. +12 degrees, corresponding to a left-handed coiled coil structure for the bundle as a whole. Tilting of M2 helices away from the central pore axis at their C-termini and/or inclusion of sidechain electrostatic interactions may perturb such ridges-in-grooves packing. In the most extreme cases right-handed coiled coils are formed. An interplay between inter-helix H-bonding and helix bundle geometry is revealed. The effects of changes in electrostatic screening on the dimensions of the pore mouth are described and the significance of these changes in the context of models for the nAChR pore domain is discussed.

  4. A new cascaded control strategy for paralleled line-interactive UPS with LCL filter

    NASA Astrophysics Data System (ADS)

    Zhang, X. Y.; Zhang, X. H.; Li, L.; Luo, F.; Zhang, Y. S.

    2016-08-01

    A traditional uninterruptible power supply (UPS) struggles to meet output voltage quality and grid-side power quality requirements at the same time, and usually has disadvantages such as multi-stage conversion, complex structure, or harmonic current pollution of the utility grid. A three-phase three-level paralleled line-interactive UPS with LCL filter is presented in this paper. It can achieve output voltage quality and grid-side power quality control simultaneously with only a single-conversion power stage, but the multi-objective control strategy design is difficult. Based on a detailed analysis of the circuit structure and operation mechanism, a new cascaded control strategy for the power, voltage, and current is proposed. An outer current control loop based on resonant control theory is designed to ensure the grid-side power quality. An inner voltage control loop based on capacitance voltage and capacitance current feedback is designed to ensure the output voltage quality and avoid the resonance peak of the LCL filter. An improved repetitive controller is added to reduce the distortion of the output voltage. The setting of the controller parameters is discussed in detail. A 100 kVA UPS prototype is built, and experiments under unbalanced resistive load and nonlinear load are carried out. Theoretical analysis and experimental results show the effectiveness of the control strategy. The paralleled line-interactive UPS not only maintains a constant three-phase balanced output voltage, but also provides comprehensive power quality management functions, with three-phase balanced grid active power input, low THD of output voltage and grid current, and reactive power compensation. The UPS is therefore a "green", utility-friendly load.

  5. Structure and drug interactions of parallel-stranded DNA studied by infrared spectroscopy and fluorescence.

    PubMed Central

    Fritzsche, H; Akhebat, A; Taillandier, E; Rippe, K; Jovin, T M

    1993-01-01

    The infrared spectra of three different 25-mer parallel-stranded DNAs (ps-DNA) have been studied. We have used ps-DNAs containing either exclusively dA x dT base pairs or substitution with four dG x dC base pairs and have compared them with their antiparallel-stranded (aps) reference duplexes in a conventional B-DNA conformation. Significant differences have been found in the region of the thymine C=O stretching vibrations. The parallel-stranded duplexes showed characteristic marker bands for the C2=O2 and C4=O4 carbonyl stretching vibrations of thymine at 1685 cm-1 and 1668 cm-1, respectively, as compared to values of 1696 cm-1 and 1663 cm-1 for the antiparallel-stranded reference duplexes. The results confirm previous studies indicating that the secondary structure in parallel-stranded DNA is established by reversed Watson-Crick base pairing of dA x dT with hydrogen bonds between N6H...O2 and N1...HN3. The duplex structure of the ps-DNA is much more sensitive to dehydration than that of the aps-DNA. Interaction with three drugs known to bind in the minor groove of aps-DNA (netropsin, distamycin A and Hoechst 33258) induces shifts of the C=O stretching vibrations of ps-DNA even at a low ratio of drug per DNA base pair. These results suggest a conformational change of the ps-DNA to optimize the DNA-drug interaction. As demonstrated by excimer fluorescence of strands labeled with pyrene at the 5'-end, the drugs induce dissociation of the ps-DNA duplex with subsequent formation of imperfectly matched aps-DNA to allow the more favorable drug binding to aps-DNA. Similarly, attempts to form a triple helix of the type d(T)n.d(A)n.d(T)n with ps-DNA failed and resulted in the dissociation of the ps-DNA duplex and reformation of a triple helix based upon an aps-DNA duplex core d(T)10.d(A)10. PMID:7504812

  6. 3D magnetospheric parallel hybrid multi-grid method applied to planet–plasma interactions

    SciTech Connect

    Leclercq, L.; Mancini, M.

    2016-03-15

    We present a new method to exploit multiple refinement levels within a 3D parallel hybrid model, developed to study planet–plasma interactions. This model is based on the hybrid formalism: ions are kinetically treated whereas electrons are considered as an inertia-less fluid. Generally, ions are represented by numerical particles whose size equals the volume of the cells. Particles that leave a coarse grid and subsequently enter a refined region are split into particles whose volume corresponds to the volume of the refined cells. The number of refined particles created from a coarse particle depends on the grid refinement rate. In order to conserve velocity distribution functions and to avoid calculations of average velocities, particles are not coalesced. Moreover, to ensure the constancy of particles' shape function sizes, the hybrid method is adapted to allow refined particles to move within a coarse region. Another innovation of this approach is the method developed to compute grid moments at interfaces between two refinement levels. Indeed, the hybrid method is adapted to accurately account for the special grid structure at the interfaces, avoiding any overlapping grid considerations. Some fundamental test runs were performed to validate our approach (e.g. quiet plasma flow, Alfven wave propagation). Lastly, we also show a planetary application of the model, simulating the interaction between Jupiter's moon Ganymede and the Jovian plasma.
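
    The particle-splitting step described above can be sketched as follows. The abstract does not give the exact rule, so this sketch assumes that a coarse particle is split into ratio**3 children, each keeping the parent velocity and carrying an equal share of its weight, placed at the centres of the sub-cubes of the parent volume. The function name, argument names, and child placement are hypothetical.

```python
def split_particle(pos, vel, weight, size, ratio=2):
    """Split one coarse particle into ratio**3 refined particles.

    pos, vel: 3-tuples; weight: statistical weight; size: edge length of the
    parent particle's cubic shape. Returns (pos, vel, weight, size) tuples.
    """
    n_children = ratio ** 3
    child_size = size / ratio
    children = []
    for i in range(ratio):
        for j in range(ratio):
            for k in range(ratio):
                # Offset of the sub-cube centre from the parent centre.
                offset = [(idx + 0.5) * child_size - size / 2 for idx in (i, j, k)]
                child_pos = tuple(p + o for p, o in zip(pos, offset))
                # The velocity is copied unchanged, so the velocity
                # distribution is preserved; the weight is shared equally.
                children.append((child_pos, vel, weight / n_children, child_size))
    return children
```

    Because every child inherits the parent velocity, total weight and momentum are unchanged by the split, which is consistent with the abstract's aim of conserving velocity distribution functions.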

  7. Distinct cerebellar lobules process arousal, valence and their interaction in parallel following a temporal hierarchy.

    PubMed

    Styliadis, Charis; Ioannides, Andreas A; Bamidis, Panagiotis D; Papadelis, Christos

    2015-04-15

    The cerebellum participates in emotion-related neural circuits formed by different cortical and subcortical areas, which sub-serve arousal and valence. Recent neuroimaging studies have shown a functional specificity of cerebellar lobules in the processing of emotional stimuli. However, little is known about the temporal component of this process. The goal of the current study is to assess the spatiotemporal profile of neural responses within the cerebellum during the processing of arousal and valence. We hypothesized that the excitation and timing of distinct cerebellar lobules is influenced by the emotional content of the stimuli. By using magnetoencephalography, we recorded magnetic fields from twelve healthy human individuals while they passively viewed affective pictures rated along arousal and valence. By using a beamformer, we localized gamma-band activity in the cerebellum across time and we related the foci of activity to the anatomical organization of the cerebellum. Successive cerebellar activations were observed within distinct lobules starting ~160 ms after the stimulus onset. Arousal was processed within both vermal (VI and VIIIa) and hemispheric (left Crus II) lobules. Valence (left VI) and its interaction (left V and left Crus I) with arousal were processed only within hemispheric lobules. Arousal processing was identified first at early latencies (160 ms) and was long-lived (until 980 ms). In contrast, the processing of valence and its interaction with arousal was short-lived at later stages (420-530 ms and 570-640 ms respectively). Our findings provide the first evidence that distinct cerebellar lobules process arousal, valence, and their interaction in a parallel yet temporally hierarchical manner determined by the emotional content of the stimuli.

  8. Experimental Studies of the Interaction Between a Parallel Shear Flow and a Directionally-Solidifying Front

    NASA Technical Reports Server (NTRS)

    Zhang, Meng; Maxworthy, Tony

    1999-01-01

    sample cell, driven by an outside rotating magnet, in order to generate the flow. However, it appears that this was not a well-controlled flow and may also have been unsteady. In the present experimental study, we want to study how a forced parallel shear flow in a Hele-Shaw cell interacts with the directionally solidifying crystal interface. The comparison of experimental data shows that the parallel shear flow in a Hele-Shaw cell has a strong stabilizing effect on the planar interface by damping the existing initial perturbations. The flow also shows a stabilizing effect on the cellular interface by slightly reducing the exponential growth rate of cells. The left-right symmetry of cells is broken by the flow, with cells tilting toward the incoming flow direction. The tilting angle increases with the velocity ratio. The experimental results are explained through the parallel flow effect on lateral solute transport. The phenomenon of cells tilting against the flow is consistent with the numerical result of Dantzig and Chao.

  9. Interaction of a Rectangular Jet with a Flat-Plate Placed Parallel to the Flow

    NASA Technical Reports Server (NTRS)

    Zaman, K. B. M. Q.; Brown, C. A.; Bridges, J. A.

    2013-01-01

    An experimental study is carried out addressing the flowfield and radiated noise from the interaction of a large aspect ratio rectangular jet with a flat plate placed parallel to but away from the direct path of the jet. Sound pressure level spectra exhibit an increase in the noise levels for both the 'reflected' and 'shielded' sides of the plate relative to the free-jet case. Detailed cross-sectional distributions of flowfield properties obtained by hot-wire anemometry are documented for a low subsonic condition. Corresponding mean Mach number distributions obtained by Pitot-probe surveys are presented for high subsonic conditions. In the latter flow regime and for certain relative locations of the plate, a flow resonance accompanied by audible tones is encountered. Under the resonant condition the jet cross-section experiences an 'axis-switching' and flow visualization indicates the presence of an organized 'vortex street'. The trends of the resonant frequency variation with flow parameters exhibit some similarities to, but also marked differences with, corresponding trends of the well-known edgetone phenomenon.

  10. Software tools for developing parallel applications. Part 2: Interactive control and performance tuning

    SciTech Connect

    Brown, J.; Geist, A.; Pancake, C.; Rover, D.

    1997-04-01

    This paper continues the discussion of parallel tool support with an overview of the current state of tools for runtime control and performance tuning. Each is discussed in terms of the programmer needs addressed, the extent to which representative current tools meet those needs, and what new levels of tool support are important if parallel computing is to become more widespread.

  11. Parallelization of the Flow Field Dependent Variation Scheme for Solving the Triple Shock/Boundary Layer Interaction Problem

    NASA Technical Reports Server (NTRS)

    Schunk, Greg; Chung, T. J.

    1999-01-01

    A parallelized version of the Flowfield Dependent Variation (FDV) Method is developed to analyze a problem of current research interest, the flowfield resulting from a triple shock/boundary layer interaction. Such flowfields are often encountered in the inlets of high speed air-breathing vehicles including NASA's Hyper-X. In order to resolve the complex shock structure and to provide adequate resolution for boundary layer computations of the convective heat transfer from surfaces inside the inlet, models containing over 500,000 nodes are needed. Efficient parallelization of the computation is essential to obtaining the results in a timely manner. Results from different parallelization schemes, based upon multi-threading and message passing, as implemented on multiple processor supercomputers and on distributed workstations are compared.

  12. Parallelization of the Flow Field Dependent Variation Scheme for Solving the Triple Shock/Boundary Layer Interaction Problem

    NASA Technical Reports Server (NTRS)

    Schunk, Richard Gregory; Chung, T. J.

    2001-01-01

    A parallelized version of the Flowfield Dependent Variation (FDV) Method is developed to analyze a problem of current research interest, the flowfield resulting from a triple shock/boundary layer interaction. Such flowfields are often encountered in the inlets of high speed air-breathing vehicles including the NASA Hyper-X research vehicle. In order to resolve the complex shock structure and to provide adequate resolution for boundary layer computations of the convective heat transfer from surfaces inside the inlet, models containing over 500,000 nodes are needed. Efficient parallelization of the computation is essential to achieving results in a timely manner. Results from a parallelization scheme, based upon multi-threading, as implemented on multiple processor supercomputers and workstations, are presented.

  13. Mechanical Behavior of Collagen-Fibrin Co-Gels Reflects Transition From Series to Parallel Interactions With Increasing Collagen Content

    PubMed Central

    Lai, Victor K.; Lake, Spencer P.; Frey, Christina R.; Tranquillo, Robert T.; Barocas, Victor H.

    2013-01-01

    Fibrin and collagen, biopolymers occurring naturally in the body, are commonly used biomaterials as scaffolds for tissue engineering. How collagen and fibrin interact to confer macroscopic mechanical properties in collagen-fibrin composite systems remains poorly understood. In this study, we formulated collagen-fibrin co-gels at different collagen-to-fibrin ratios to observe changes in overall mechanical behavior and microstructure. A modeling framework of a two-network system was developed by modifying our micro-scale model, considering two forms of interaction between the networks: (a) two interpenetrating but non-interacting networks (“parallel”), and (b) a single network consisting of randomly alternating collagen and fibrin fibrils (“series”). Mechanical testing of our gels shows that collagen-fibrin co-gels exhibit intermediate properties (UTS, strain at failure, tangent modulus) compared to those of pure collagen and fibrin. Comparison with model predictions shows that the parallel and series model cases provide upper and lower bounds respectively for the experimental data, suggesting that a combination of such interactions exists between collagen and fibrin in co-gels. A transition from the series model to the parallel model occurs with increasing collagen content, with the series model best describing predominantly fibrin co-gels, and the parallel model best describing predominantly collagen co-gels. PMID:22482659

  14. Parallel electric field generation in the ionosphere over thunderstorms and the interaction with ionospheric electrons

    NASA Astrophysics Data System (ADS)

    Rowland, D.; Wygant, J.; Pfaff, R.; Farrell, W.; Goetz, K.; Monson, S.

    Sounding rockets launched by Mike Kelley and his group at Cornell demonstrated the existence of transient (1 ms) electric fields associated with lightning strikes at high altitudes above active thunderstorms. These electric fields had a component parallel to the Earth's magnetic field, and were unipolar and large in amplitude. They were thought to be strong enough to energize electrons and generate strong turbulence as the beams thermalized. The parallel electric fields were observed on multiple flights, but high time resolution measurements were not made within 100 km horizontal distance of lightning strokes, where the electric fields are largest. In 2000 the "Lightning Bolt" sounding rocket (NASA 27.143) was launched directly over an active thunderstorm to an apogee near 300 km. The sounding rocket was equipped with sensitive electric and magnetic field instruments as well as a photometer and electrostatic analyser for measuring accelerated electrons. The electric and magnetic fields were sampled at 10 million samples per second, letting us fully resolve the structure of the parallel electric field pulse up to and beyond the plasma frequency. We will present results from the Lightning Bolt mission, concentrating on the parallel electric field pulses that arrive before the lower-frequency whistler wave modes. We observe pulses with peak electric fields of a few mV/m lasting for a substantial fraction of a millisecond. Superimposed on this is high-frequency turbulence, comparable in amplitude to the pulse itself. This is the first direct observation of this structure in the parallel electric field, within 100 km horizontal distance of the lightning stroke. We will present evidence for the method of generation of these parallel fields, and discuss their probable effect on ionospheric electrons.

  15. Achieving High Performance in Parallel Applications via Kernel-Application Interaction

    DTIC Science & Technology

    1996-04-01

    empirical numbers. Tom LeBlanc’s past experience in real-time systems and insightful criticisms proved valuable, and Professor Abraham Seidmann’s...Becker, P. Das, J. Karlsson, and C. Quiroz. Operating system support for animate vision. Journal of Parallel and Distributed Computing, 15(2):103

  16. Request queues for interactive clients in a shared file system of a parallel computing system

    DOEpatents

    Bent, John M.; Faibish, Sorin

    2015-08-18

    Interactive requests are processed from users of log-in nodes. A metadata server node is provided for use in a file system shared by one or more interactive nodes and one or more batch nodes. The interactive nodes comprise interactive clients to execute interactive tasks and the batch nodes execute batch jobs for one or more batch clients. The metadata server node comprises a virtual machine monitor; an interactive client proxy to store metadata requests from the interactive clients in an interactive client queue; a batch client proxy to store metadata requests from the batch clients in a batch client queue; and a metadata server to store the metadata requests from the interactive client queue and the batch client queue in a metadata queue based on an allocation of resources by the virtual machine monitor. The metadata requests can be prioritized, for example, based on one or more of a predefined policy and predefined rules.
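
    The two-proxy arrangement described above can be sketched as follows. This is an illustrative model only: the class name `MetadataServer`, its methods, and the use of fixed integer shares to stand in for the virtual machine monitor's resource allocation are all assumptions, not details taken from the patent.

```python
from collections import deque


class MetadataServer:
    """Toy model: two client queues drained into one metadata queue."""

    def __init__(self, interactive_share=3, batch_share=1):
        if interactive_share < 1 or batch_share < 1:
            raise ValueError("shares must be positive integers")
        self.interactive = deque()  # filled by the interactive client proxy
        self.batch = deque()        # filled by the batch client proxy
        self.metadata = deque()     # consumed by the metadata server
        self.shares = (interactive_share, batch_share)

    def submit(self, request, interactive):
        """Store a request in the queue that matches its client type."""
        (self.interactive if interactive else self.batch).append(request)

    def schedule(self):
        """Move all pending requests into the metadata queue, taking up to
        `share` requests from each client queue per round."""
        while self.interactive or self.batch:
            for queue, share in zip((self.interactive, self.batch), self.shares):
                for _ in range(share):
                    if queue:
                        self.metadata.append(queue.popleft())
        return list(self.metadata)
```

    With the default shares, up to three interactive requests are served for every batch request, so log-in users stay responsive without starving the batch jobs. A policy- or rule-based prioritization, as mentioned in the abstract, would replace the fixed shares.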

  17. Selective visual attention to emotional words: Early parallel frontal and visual activations followed by interactive effects in visual cortex.

    PubMed

    Schindler, Sebastian; Kissler, Johanna

    2016-10-01

    Human brains spontaneously differentiate between various emotional and neutral stimuli, including written words whose emotional quality is symbolic. In the electroencephalogram (EEG), emotional-neutral processing differences are typically reflected in the early posterior negativity (EPN, 200-300 ms) and the late positive potential (LPP, 400-700 ms). These components are also enlarged by task-driven visual attention, supporting the assumption that emotional content naturally drives attention. Still, the spatio-temporal dynamics of interactions between emotional stimulus content and task-driven attention remain to be specified. Here, we examine this issue in visual word processing. Participants attended to negative, neutral, or positive nouns while high-density EEG was recorded. Emotional content and top-down attention both amplified the EPN component in parallel. On the LPP, by contrast, emotion and attention interacted: Explicit attention to emotional words led to a substantially larger amplitude increase than did explicit attention to neutral words. Source analysis revealed early parallel effects of emotion and attention in bilateral visual cortex and a later interaction of both in right visual cortex. Distinct effects of attention were found in inferior, middle and superior frontal, paracentral, and parietal areas, as well as in the anterior cingulate cortex (ACC). Results specify separate and shared mechanisms of emotion and attention at distinct processing stages. Hum Brain Mapp 37:3575-3587, 2016. © 2016 Wiley Periodicals, Inc.

  18. Interacting parallel pathways associate sounds with visual identity in auditory cortices.

    PubMed

    Ahveninen, Jyrki; Huang, Samantha; Ahlfors, Seppo P; Hämäläinen, Matti; Rossi, Stephanie; Sams, Mikko; Jääskeläinen, Iiro P

    2016-01-01

    Spatial and non-spatial information of sound events is presumably processed in parallel auditory cortex (AC) "what" and "where" streams, which are modulated by inputs from the respective visual-cortex subsystems. How these parallel processes are integrated to perceptual objects that remain stable across time and the source agent's movements is unknown. We recorded magneto- and electroencephalography (MEG/EEG) data while subjects viewed animated video clips featuring two audiovisual objects, a black cat and a gray cat. Adaptor-probe events were either linked to the same object (the black cat meowed twice in a row in the same location) or included a visually conveyed identity change (the black and then the gray cat meowed with identical voices in the same location). In addition to effects in visual (including fusiform, middle temporal or MT areas) and frontoparietal association areas, the visually conveyed object-identity change was associated with a release from adaptation of early (50-150ms) activity in posterior ACs, spreading to left anterior ACs at 250-450ms in our combined MEG/EEG source estimates. Repetition of events belonging to the same object resulted in increased theta-band (4-8Hz) synchronization within the "what" and "where" pathways (e.g., between anterior AC and fusiform areas). In contrast, the visually conveyed identity changes resulted in distributed synchronization at higher frequencies (alpha and beta bands, 8-32Hz) across different auditory, visual, and association areas. The results suggest that sound events become initially linked to perceptual objects in posterior AC, followed by modulations of representations in anterior AC. Hierarchical what and where pathways seem to operate in parallel after repeating audiovisual associations, whereas the resetting of such associations engages a distributed network across auditory, visual, and multisensory areas.

  19. Wave-particle interaction in parallel transport of long mean-free-path plasmas along open field magnetic field lines

    NASA Astrophysics Data System (ADS)

    Guo, Zehua; Tang, Xianzhu

    2012-03-01

    A tokamak fusion reactor dumps a large amount of heat and particle flux to the divertor through the scrape-off layer (SOL) plasma. Situations exist, either by necessity or through deliberate design, in which the SOL plasma attains a long mean-free-path along large segments of the open field lines. The rapid parallel streaming of electrons requires a large parallel electric field to maintain ambipolarity. The confining effect of the parallel electric field on electrons leads to a trap/passing boundary in the velocity space for electrons. In the normal situation where the upstream electron source populates both the trapped and passing regions, a mechanism must exist to produce a flux across the electron trap/passing boundary. In a short mean-free-path plasma, this is provided by collisions. For long mean-free-path plasmas, wave-particle interaction is the primary candidate for detrapping the electrons. Here we present simulation results and a theoretical analysis using a model distribution function of trapped electrons. The dominant electromagnetic plasma instability and the associated collisionless scattering, which produces both particle and energy fluxes across the electron trap/passing boundary in velocity space, are discussed.

  20. MPI parallelization of Vlasov codes for the simulation of nonlinear laser-plasma interactions

    NASA Astrophysics Data System (ADS)

    Savchenko, V.; Won, K.; Afeyan, B.; Decyk, V.; Albrecht-Marc, M.; Ghizzo, A.; Bertrand, P.

    2003-10-01

    The simulation of optical mixing driven KEEN waves [1] and electron plasma waves [1] in laser-produced plasmas requires nonlinear kinetic models and massive parallelization. We use Message Passing Interface (MPI) libraries and Appleseed [2] to solve the Vlasov-Poisson system of equations on an 8 node dual processor MAC G4 cluster. We use the semi-Lagrangian time splitting method [3]. It requires only row-column exchanges in the global data redistribution, minimizing the total number of communications between processors. Recurrent communication patterns for 2D FFTs involve global transposition. In the Vlasov-Maxwell case, we use splitting into two 1D spatial advections and a 2D momentum advection [4]. Discretized momentum advection equations have a double loop structure with the outer index being assigned to different processors. We adhere to a code structure with separate routines for calculations and data management for parallel computations. [1] B. Afeyan et al., IFSA 2003 Conference Proceedings, Monterey, CA [2] V. K. Decyk, Computers in Physics, 7, 418 (1993) [3] Sonnendrucker et al., JCP 149, 201 (1998) [4] Begue et al., JCP 151, 458 (1999)
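
    The basic building block of a semi-Lagrangian split scheme, a 1D advection that traces each grid point back along its characteristic and interpolates, can be sketched as follows. This is a serial illustration under stated assumptions: linear interpolation is used for brevity where production codes typically use higher-order interpolation, the grid is periodic, and the names `advect_periodic` and `advect_x` are hypothetical.

```python
import math


def advect_periodic(f, shift):
    """One semi-Lagrangian step on a periodic grid.

    f: list of grid values; shift: displacement in cell units (u * dt / dx).
    Returns f_new with f_new[i] = f(x_i - shift), linearly interpolated.
    """
    n = len(f)
    out = []
    for i in range(n):
        src = i - shift       # foot of the characteristic through x_i
        j = math.floor(src)   # grid point to the left of the foot
        t = src - j           # fractional distance from that point
        out.append((1.0 - t) * f[j % n] + t * f[(j + 1) % n])
    return out


def advect_x(f, velocities, dt, dx):
    """Spatial advection of f[iv][ix]. Each velocity row is independent, so
    the outer loop is the one that could be distributed across processes."""
    return [advect_periodic(row, v * dt / dx) for row, v in zip(f, velocities)]
```

    A full split step would alternate calls like `advect_x` with the corresponding advection in velocity (or momentum), redistributing the data between the two so that the advected direction is local to each process.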

  1. Parallel adaptive fluid-structure interaction simulation of explosions impacting on building structures

    SciTech Connect

    Deiterding, Ralf; Wood, Stephen L

    2013-01-01

    We pursue a level set approach to couple an Eulerian shock-capturing fluid solver with space-time refinement to an explicit solid dynamics solver for large deformations and fracture. The coupling algorithms considering recursively finer fluid time steps as well as overlapping solver updates are discussed in detail. Our ideas are implemented in the AMROC adaptive fluid solver framework and are used for effective fluid-structure coupling to the general purpose solid dynamics code DYNA3D. Besides simulations verifying the coupled fluid-structure solver and assessing its parallel scalability, the detailed structural analysis of a reinforced concrete column under blast loading and the simulation of a prototypical blast explosion in a realistic multistory building are presented.

  2. Gamma ray bursts from comet neutron star magnetosphere interaction, field twisting and E sub parallel formation

    SciTech Connect

    Colgate, S.A.

    1990-01-01

    Consider the problem of a comet in a collision trajectory with a magnetized neutron star. The question addressed in this paper is whether the comet interacts strongly enough with the magnetic field to be captured at a large radius or whether, in general, the comet will escape a magnetized neutron star. 6 refs., 4 figs.

  3. Interaction-induced local moments in parallel quantum dots within the functional renormalization group approach

    NASA Astrophysics Data System (ADS)

    Protsenko, V. S.; Katanin, A. A.

    2016-11-01

    We propose a version of the functional renormalization-group (fRG) approach, which is, due to including Litim-type cutoff and switching off (or reducing) the magnetic field during fRG flow, capable of describing a singular Fermi-liquid (SFL) phase, formed due to the presence of local moments in quantum dot structures. The proposed scheme allows one to describe the first-order quantum phase transition from the "singular" to the "regular" paramagnetic phase with applied gate voltage to parallel quantum dots, symmetrically coupled to leads, and shows sizable spin splitting of electronic states in the SFL phase in the limit of vanishing magnetic field H → 0; the calculated conductance shows good agreement with the results of the numerical renormalization group. Using the proposed fRG approach with the counterterm, we also show that for asymmetric coupling of the leads to the dots the SFL behavior similar to that for the symmetric case persists, but with occupation numbers, effective energy levels, and conductance changing continuously through the quantum phase transition into the SFL phase.

  4. Exploring sensitivity & throughput of a parallel flow SPRi biosensor for characterization of antibody-antigen interaction.

    PubMed

    Kamat, Vishal; Rafique, Ashique

    2017-02-20

    Rapid growth in the field of biotherapeutics has led to an increased demand for high-throughput, label-free biosensors exhibiting high sensitivity. To support the current needs, Sierra Sensors introduced a surface plasmon resonance imaging (SPRi) based biosensor, Molecular Affinity Screening System (MASS-1). We assessed the potential utility of MASS-1 to support Regeneron's therapeutic antibody discovery. A large panel of antibody-antigen interactions was characterized using MASS-1 and the kinetic data were compared with the Biacore 4000 biosensor. Less than 10% deviation in the binding rate constants measured across eight flow channels of MASS-1 was observed. The single injection cycle kinetic assay allowed rapid measurement of binding rate constants for antibody-antigen interactions. MASS-1 sensitivity was independent of protein immobilization level and kinetic analysis performed using ultra-low density mAb surfaces allowed characterization of picomolar affinity interactions without mass transport limitation. High-throughput characterization of a panel of 189 monoclonal antibodies to 13 different antigens with molecular weights ranging from 14kD to 105kD revealed that binding kinetic parameters measured on MASS-1 were comparable to those measured on Biacore 4000. Our data demonstrate that MASS-1 measures reliable binding kinetic parameters and has an appropriate combination of throughput and sensitivity to support discovery and development of therapeutic antibodies.

  5. Parallel Three-Dimensional Computation of Fluid Dynamics and Fluid-Structure Interactions of Ram-Air Parachutes

    NASA Technical Reports Server (NTRS)

    Tezduyar, Tayfun E.

    1998-01-01

    This is a final report as far as our work at University of Minnesota is concerned. The report describes our research progress and accomplishments in development of high performance computing methods and tools for 3D finite element computation of aerodynamic characteristics and fluid-structure interactions (FSI) arising in airdrop systems, namely ram-air parachutes and round parachutes. This class of simulations involves complex geometries, flexible structural components, deforming fluid domains, and unsteady flow patterns. The key components of our simulation toolkit are a stabilized finite element flow solver, a nonlinear structural dynamics solver, an automatic mesh moving scheme, and an interface between the fluid and structural solvers; all of these have been developed within a parallel message-passing paradigm.

  6. A highly accurate and efficient algorithm for electrostatic interactions of charged particles confined by parallel metallic plates

    NASA Astrophysics Data System (ADS)

    Rostami, Samare; Ghasemi, S. Alireza; Nedaaee Oskoee, Ehsan

    2016-09-01

    We present an accurate and efficient algorithm to calculate the electrostatic interaction of charged point particles with partially periodic boundary conditions that are confined along the non-periodic direction by two parallel metallic plates. The method preserves the original boundary conditions, leading to an exact solution of the problem. In addition, the scaling complexity is quasilinear, O(N ln N), where N is the number of particles in the simulation box. Based on the superposition principle in electrostatics, the problem is split into two electrostatic problems where each can be calculated by the appropriate Poisson solver. The method is applied to NaCl ultra-thin films where its dielectric response with respect to an external bias voltage is investigated. Furthermore, the total charge induced on the metallic boundaries can be calculated to an arbitrary precision.

  7. The grid-based fast multipole method--a massively parallel numerical scheme for calculating two-electron interaction energies.

    PubMed

    Toivanen, Elias A; Losilla, Sergio A; Sundholm, Dage

    2015-12-21

    Algorithms and working expressions for a grid-based fast multipole method (GB-FMM) have been developed and implemented. The computational domain is divided into cubic subdomains, organized in a hierarchical tree. The contribution to the electrostatic interaction energies from pairs of neighboring subdomains is computed using numerical integration, whereas the contributions from more distant subdomains are obtained using multipole expansions. The multipole moments of the subdomains are obtained by numerical integration. Linear scaling is achieved by translating and summing the multipoles according to the tree structure, such that each subdomain interacts with a number of subdomains that is almost independent of the size of the system. To compute electrostatic interaction energies of neighboring subdomains, we employ an algorithm which performs efficiently on general purpose graphics processing units (GPGPU). Calculations using one CPU for the FMM part and 20 GPGPUs consisting of tens of thousands of execution threads for the numerical integration algorithm show the scalability and parallel performance of the scheme. For calculations on systems consisting of Gaussian functions (α = 1) distributed as fullerenes from C20 to C720, the total computation time and relative accuracy (ppb) are independent of the system size.

  8. Computation of interactional aerodynamics for noise prediction of heavy lift rotorcraft

    NASA Astrophysics Data System (ADS)

    Hennes, Christopher C.

    Many computational tools are used when developing a modern helicopter. As the design space is narrowed, more accurate and time-intensive tools are brought to bear. These tools are used to determine the effect of a design decision on the performance, handling, stability and efficiency of the aircraft. One notable parameter left out of this process is acoustics. This is due in part to the difficulty in making useful acoustics calculations that reveal the differences between various design configurations. This thesis presents a new approach designed to bridge the gap in prediction capability between fast but low-fidelity Lagrangian particle methods, and slow but high-fidelity Eulerian computational fluid dynamics simulations. A multi-pronged approach is presented. First, a simple flow solver using well-understood and tested flow solution methodologies is developed specifically to handle bodies in arbitrary motion. To this basic flow solver two new technologies are added. The first is an Immersed Boundary technique designed to be tolerant of geometric degeneracies and low-resolution grids. This new technique allows easy inclusion of complex fuselage geometries at minimal computational cost, improving the ability of a solver to capture the complex interactional aerodynamic effects expected in modern rotorcraft design. The second new technique is an extension of a concept from flow visualization where the motion of tip vortices is tracked through the solution using massless particles convecting with the local flow. In this extension of that concept, the particles maintain knowledge of the expected and actual vortex strength. As a post-processing step, when the acoustic calculations are made, these particles are used to augment the loading noise calculation and reproduce the highly impulsive character of blade-vortex interaction noise. In combination, these new techniques yield a significant improvement to the state of the art in rotorcraft blade-vortex interaction noise

  9. Effects of sex steroids on bones and muscles: Similarities, parallels, and putative interactions in health and disease.

    PubMed

    Carson, James A; Manolagas, Stavros C

    2015-11-01

    Estrogens and androgens influence the growth and maintenance of bones and muscles and are responsible for their sexual dimorphism. A decline in their circulating levels leads to loss of mass and functional integrity in both tissues. In the article, we highlight the similarities of the molecular and cellular mechanisms of action of sex steroids in the two tissues; the commonality of a critical role of mechanical forces on tissue mass and function; emerging evidence for an interplay between mechanical forces and hormonal and growth factor signals in both bones and muscles; as well as the current state of evidence for or against a cross-talk between muscles and bone. In addition, we review evidence for the parallels in the development of osteoporosis and sarcopenia with advancing age and the potential common mechanisms responsible for the age-dependent involution of these two tissues. Lastly, we discuss the striking difference in the availability of several drug therapies for the prevention and treatment of osteoporosis, as compared to none for sarcopenia. This article is part of a Special Issue entitled "Muscle Bone Interactions".

  10. Mars-solar wind interaction: LatHyS, an improved parallel 3-D multispecies hybrid model

    NASA Astrophysics Data System (ADS)

    Modolo, Ronan; Hess, Sebastien; Mancini, Marco; Leblanc, Francois; Chaufray, Jean-Yves; Brain, David; Leclercq, Ludivine; Esteban-Hernández, Rosa; Chanteur, Gerard; Weill, Philippe; González-Galindo, Francisco; Forget, Francois; Yagi, Manabu; Mazelle, Christian

    2016-07-01

    In order to better represent Mars-solar wind interaction, we present an unprecedented model achieving spatial resolution down to 50 km, a so far unexplored resolution for global kinetic models of the Martian ionized environment. Such resolution approaches the ionospheric plasma scale height. In practice, the model is derived from a first version described in Modolo et al. (2005). An important effort of parallelization has been conducted and is presented here. A better description of the ionosphere was also implemented including ionospheric chemistry, electrical conductivities, and a drag force modeling the ion-neutral collisions in the ionosphere. This new version of the code, named LatHyS (Latmos Hybrid Simulation), is here used to characterize the impact of various spatial resolutions on simulation results. In addition, and following a global model challenge effort, we present the results of simulation runs for three cases, which allow addressing the effect of the suprathermal corona and of the solar EUV activity on the magnetospheric plasma boundaries and on the global escape. Simulation results showed that global patterns are relatively similar for the different spatial resolution runs, but the finest grid runs provide a better representation of the ionosphere and display more details of the planetary plasma dynamics. Simulation results suggest that a significant fraction of escaping O+ ions originates from below 1200 km altitude.

  11. Prediction of BVI noise patterns and correlation with wake interaction locations

    NASA Technical Reports Server (NTRS)

    Marcolini, Michael A.; Martin, Ruth M.; Lorber, Peter F.; Egolf, T. A.

    1992-01-01

    High resolution fluctuating airloads data were acquired during a test of a contemporary design United Technologies model rotor in the Duits-Nederlandse Windtunnel (DNW). The airloads are used as input to the noise prediction program WOPWOP, in order to predict the blade-vortex interaction (BVI) noise field on a large plane below the rotor. Trends of predicted advancing and retreating side BVI noise levels and directionality as functions of flight condition are presented. The measured airloads have been analyzed to determine the BVI locations on the blade surface, and are used to interpret the predicted BVI noise radiation patterns. Predicted BVI locations are obtained using the free wake model in CAMRAD/JA, the UTRC Generalized Forward Flight Distorted Wake Model, and the UTRC FREEWAKE analysis. These predicted BVI locations are compared with those obtained from the measured pressure data.

  12. Development of a Multi-Grids Approach into a Parallelized Hybrid Model to Describe Ganymede's Interaction with the Jovian Plasma

    NASA Astrophysics Data System (ADS)

    Leclercq, L.; Modolo, R.; Leblanc, F.; Hess, S. L.; Andre, N.

    2014-12-01

    Ganymede is the only satellite which has its own magnetosphere, which is embedded in the Jovian magnetosphere (Kivelson et al. 1996). This peculiar interaction has been investigated by means of a 3D parallel multi-species hybrid model based on a CAM-CL algorithm (Mathews et al. 1994). In this formalism, ions have a kinetic description whereas electrons are considered as an inertialess fluid which ensures the neutrality of the plasma and contributes to the total current and electronic pressure. Maxwell's equations are solved to compute the temporal evolution of the electromagnetic field. Hybrid simulations are performed on a uniform Cartesian grid with a spatial resolution of about 240 km. Our results are globally consistent with other models and Galileo measurements. Nevertheless, our description of the magnetopause and the ionosphere is not satisfactory because of the low spatial resolution. Indeed, we want to describe scale heights of 125 km in the ionosphere whereas the best spatial resolution that we are able to use is about 240 km. Therefore, in order to obtain more efficient and relevant results, it is necessary to refine the grid. To this end, we are introducing a multi-grid approach in order to refine the spatial resolution by a factor of 2 (~120 km) near Ganymede. The creation of a finer mesh in the simulation grid requires special computations at the interfaces between the two different grids, whether for the calculation of moments, such as charge density or current, or the computation of electromagnetic fields. Moreover, the parallelization of the code, based on domain decomposition methods, requires care with boundary conditions. In the hybrid model, macroparticles, which represent a kind of cloud of physical particles, have a volume equal to that of a grid cell. Then, the macroparticles entering the higher spatial resolution region are split into smaller macroparticles whose volume corresponds to the volume

  13. Genetic and morphological analyses reveal a critical interaction between the C-termini of two SNARE proteins and a parallel four helical arrangement for the exocytic SNARE complex.

    PubMed Central

    Katz, L; Hanson, P I; Heuser, J E; Brennwald, P

    1998-01-01

    In a screen for suppressors of a temperature-sensitive mutation in the yeast SNAP-25 homolog, Sec9, we have identified a gain-of-function mutation in the yeast synaptobrevin homolog, Snc2. The genetic properties of this suppression point to a specific interaction between the C-termini of Sec9 and Snc2 within the SNARE complex. Biochemical analysis of interactions between the wild-type and mutant proteins confirms this prediction, demonstrating specific effects of these mutations on interactions between the SNAREs. The location of the mutations suggests that the C-terminal H2 helical domain of Sec9 is likely to be aligned in parallel with Snc2 in the SNARE complex. To test this prediction, we examined the structure of the yeast exocytic SNARE complex by deep-etch electron microscopy. Like the neuronal SNARE complex, it is a rod approximately 14 nm long. Using epitope tags, antibodies and maltose-binding protein markers, we find that the helical domains of Sso, Snc and both halves of Sec9 are all aligned in parallel within the SNARE complex, suggesting that the yeast exocytic SNARE complex consists of a parallel four helix bundle. Finally, we find a similar arrangement for SNAP-25 in the neuronal SNARE complex. This provides strong evidence that the exocytic SNARE complex is a highly conserved structure composed of four parallel helical domains whose C-termini must converge in order to bring about membrane fusion. PMID:9799229

  14. Approximate expression for the potential energy of the double-layer interaction between two parallel ion-penetrable membranes at small separations in an electrolyte solution.

    PubMed

    Ohshima, Hiroyuki

    2010-10-01

    An approximate expression for the potential energy of the double-layer interaction between two parallel similar ion-penetrable membranes in a symmetrical electrolyte solution is derived via a linearization method, in which the nonlinear Poisson-Boltzmann equations in the regions inside and outside the membranes are linearized with respect to the deviation of the electric potential from the Donnan potential. This approximation works quite well for small membrane separations h for all values of the density of fixed charges in the membranes (or the Donnan potential) and gives a correct limiting form of the interaction energy (or the interaction force) as h → 0.

  15. Parallel Atomistic Simulations

    SciTech Connect

    Heffelfinger, Grant S.

    2000-01-18

    Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed: the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories: those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed, and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains, are discussed.
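
    Of the three molecular dynamics strategies named above, the replicated data decomposition is the simplest to sketch. The following serial emulation is an illustration only, not code from the review: the function names are hypothetical, the pair force is a toy 1/r^2 repulsion, and the loop over ranks stands in for processes that would each hold a full copy of the positions and then combine their results with an all-gather.

```python
def forces_on(owned, positions):
    """Force on each owned atom from every other atom (toy 1/r^2 repulsion)."""
    out = []
    for i in owned:
        xi, yi, zi = positions[i]
        fx = fy = fz = 0.0
        for j, (xj, yj, zj) in enumerate(positions):
            if j == i:
                continue  # no self-interaction
            dx, dy, dz = xi - xj, yi - yj, zi - zj
            inv_r3 = (dx * dx + dy * dy + dz * dz) ** -1.5
            fx += dx * inv_r3
            fy += dy * inv_r3
            fz += dz * inv_r3
        out.append((fx, fy, fz))
    return out


def replicated_data_forces(positions, n_ranks):
    """Each rank reads all positions but computes forces only for its own block of atoms."""
    n = len(positions)
    forces = []
    for rank in range(n_ranks):
        owned = range(rank * n // n_ranks, (rank + 1) * n // n_ranks)
        forces.extend(forces_on(owned, positions))  # stands in for an all-gather
    return forces
```

    The design trade-off is visible in the signature: every rank needs the whole `positions` list, so memory and communication grow with the total number of atoms, which is what spatial and force decompositions are meant to avoid.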

  16. A Study of Cell-to-Cell Interactions and Degradation in Parallel Strings: Implications for the Battery Management System

    NASA Astrophysics Data System (ADS)

    Pastor-Fernández, C.; Bruen, T.; Widanage, W. D.; Gama-Valdez, M. A.; Marco, J.

    2016-10-01

    Vehicle battery systems are usually designed with a high number of cells connected in parallel to meet the stringent requirements of power and energy. The self-balancing characteristic of parallel cells allows a battery management system (BMS) to approximate the cells as one equivalent cell with a single state of health (SoH) value, estimated either as capacity fade (SoHE) or resistance increase (SoHP). A single SoH value is, however, not applicable if the initial SoH of each cell is different, which can occur when cell properties change due to inconsistent manufacturing processes or inhomogeneous operating environments. As such, this work quantifies the convergence of SoHE and SoHP due to initial differences in cell SoH and examines the convergence factors. Four 3 Ah 18650 cells connected in parallel at 25 °C are aged by charging and discharging for 500 cycles. For an initial SoHE difference of 40% and SoHP difference of 45%, SoHE converges to 10% and SoHP to 30% by the end of the experiment. From this, a strong linear correlation between ΔSoHE and ΔSoHP is also observed. The results therefore imply that a BMS should consider a calibration strategy to accurately estimate the SoH of parallel cells until convergence is reached.
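
    Why a single equivalent cell can misrepresent a parallel string with unequal cells can be seen from a minimal circuit sketch. This is an assumption-laden illustration, not the model used in the study: each cell is reduced to an open-circuit voltage in series with an ohmic resistance, the function name is hypothetical, and the numbers in the usage note are invented.

```python
def parallel_cell_currents(ocv, resistance, total_current):
    """Split a discharge current among parallel cells sharing one terminal voltage.

    Each cell k is modelled as ocv[k] - current[k] * resistance[k] = terminal voltage,
    with the cell currents summing to total_current (discharge counted as positive).
    """
    conductance = [1.0 / r for r in resistance]
    terminal_voltage = (sum(e * g for e, g in zip(ocv, conductance)) - total_current) / sum(conductance)
    currents = [(e - terminal_voltage) * g for e, g in zip(ocv, conductance)]
    return terminal_voltage, currents
```

    For four cells at 3.7 V with resistances of 0.05, 0.05, 0.05 and 0.10 ohm and a 7 A load, this sketch gives a terminal voltage of 3.6 V and currents of 2, 2, 2 and 1 A: under these assumptions, the cell with the higher resistance carries half the current of its neighbours.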

  17. Large-scale parallel configuration interaction. I. Nonrelativistic and scalar-relativistic general active space implementation with application to (Rb-Ba)+.

    PubMed

    Knecht, Stefan; Jensen, Hans Jorgen Aa; Fleig, Timo

    2008-01-07

    We present a parallel implementation of a string-driven general active space configuration interaction program for nonrelativistic and scalar-relativistic electronic-structure calculations. The code has been modularly incorporated in the DIRAC quantum chemistry program package. The implementation is based on the message passing interface and a distributed data model in order to efficiently exploit key features of various modern computer architectures. We exemplify the nearly linear scalability of our parallel code in large-scale multireference configuration interaction (MRCI) calculations, and we discuss the parallel speedup with respect to machine-dependent aspects. The largest sample MRCI calculation includes 1.5 × 10^9 Slater determinants. Using the new code we determine for the first time the full short-range electronic potentials and spectroscopic constants for the ground state and for eight low-lying excited states of the weakly bound molecular system (Rb-Ba)+ with the spin-orbit-free Dirac formalism and using extensive uncontracted basis sets. The time required to compute to full convergence these electronic states for (Rb-Ba)+ in a single-point MRCI calculation correlating 18 electrons and using 16 cores was reduced from more than 10 days to less than 1 day.

  18. Large-scale parallel configuration interaction. I. Nonrelativistic and scalar-relativistic general active space implementation with application to (Rb-Ba)+

    NASA Astrophysics Data System (ADS)

    Knecht, Stefan; Jensen, Hans Jørgen Aa.; Fleig, Timo

    2008-01-01

    We present a parallel implementation of a string-driven general active space configuration interaction program for nonrelativistic and scalar-relativistic electronic-structure calculations. The code has been modularly incorporated in the DIRAC quantum chemistry program package. The implementation is based on the message passing interface and a distributed data model in order to efficiently exploit key features of various modern computer architectures. We exemplify the nearly linear scalability of our parallel code in large-scale multireference configuration interaction (MRCI) calculations, and we discuss the parallel speedup with respect to machine-dependent aspects. The largest sample MRCI calculation includes 1.5 × 10^9 Slater determinants. Using the new code we determine for the first time the full short-range electronic potentials and spectroscopic constants for the ground state and for eight low-lying excited states of the weakly bound molecular system (Rb-Ba)+ with the spin-orbit-free Dirac formalism and using extensive uncontracted basis sets. The time required to compute to full convergence these electronic states for (Rb-Ba)+ in a single-point MRCI calculation correlating 18 electrons and using 16 cores was reduced from more than 10 days to less than 1 day.

  19. Temporally Resolved Ion Fluorescence Measurements of the Interaction of a Field-Parallel Laser Produced Plasma and an Ambient Magnetized Plasma

    NASA Astrophysics Data System (ADS)

    Dorst, R. S.; Heuer, P. V.; Bondarenko, A. S.; Shaffer, D. B.; Contantin, G.; Vincena, S.; Tripathi, S.; Gekelman, W.; Weidl, M.; Winske, D.; Niemann, C.

    2016-10-01

    We present measurements of the collisionless coupling between an exploding laser-produced plasma (LPP) and a large, magnetized ambient plasma. The LPP was created by focusing the Raptor laser (400 J, 40 ns) on a planar plastic target embedded in the ambient Large Plasma Device (LAPD) plasma at the University of California, Los Angeles. The resulting ablated material moved parallel to the background magnetic field, interacting with the ambient plasma along the full 17 m length of the LAPD. A high temporal and spectral resolution monochromator measured fluorescence from debris and ambient ions to determine the debris velocity distribution by charge state and study the fast electron precursor to the LPP. Measurements are compared to hybrid simulations of quasi-parallel shocks.

  20. Parallel Newton-Krylov-Schwarz algorithms for the three-dimensional Poisson-Boltzmann equation in numerical simulation of colloidal particle interactions

    NASA Astrophysics Data System (ADS)

    Hwang, Feng-Nan; Cai, Shang-Rong; Shao, Yun-Long; Wu, Jong-Shinn

    2010-09-01

    We investigate fully parallel Newton-Krylov-Schwarz (NKS) algorithms for solving the large sparse nonlinear systems of equations arising from the finite element discretization of the three-dimensional Poisson-Boltzmann equation (PBE), which is often used to describe the colloidal phenomena of an electric double layer around charged objects in colloidal and interfacial science. The NKS algorithm employs an inexact Newton method with backtracking (INB) as the nonlinear solver in conjunction with a Krylov subspace method as the linear solver for the corresponding Jacobian system. An overlapping Schwarz method is used as a preconditioner to accelerate the convergence of the linear solver. Two test cases, including two isolated charged particles and two colloidal particles in a cylindrical pore, are used as benchmark problems to validate the correctness of our parallel NKS-based PBE solver. In addition, a truly three-dimensional case, which models the interaction between two charged spherical particles within a rough charged micro-capillary, is simulated to demonstrate the applicability of our PBE solver to a problem with complex geometry. Finally, based on the results obtained from a PC cluster of parallel machines, we show numerically that NKS is quite suitable for the numerical simulation of interaction between colloidal particles, since NKS is robust in the sense that INB is able to converge within a small number of iterations regardless of the geometry, the mesh size, and the number of processors. With the help of an additive preconditioned Krylov subspace method, NKS achieves parallel efficiency of 71% or better on up to a hundred processors for a 3D problem with 5 million unknowns.
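
    The interplay of an inexact Newton method, backtracking, and a Krylov linear solver can be sketched on a much smaller problem. The code below is a serial toy, not the authors' solver: it assumes a 1D model equation -u'' + sinh(u) = 0 with u(0) = left and u(1) = 0 discretized by finite differences, uses unpreconditioned conjugate gradients as the Krylov method (the Jacobian is symmetric positive definite here), and omits the Schwarz preconditioner entirely. All names and parameter values are hypothetical.

```python
import math


def residual(u, left, h):
    """F(u) for -u'' + sinh(u) = 0 with u(0) = left and u(1) = 0."""
    n = len(u)
    out = []
    for i in range(n):
        lo = u[i - 1] if i > 0 else left
        hi = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - lo - hi) / (h * h) + math.sinh(u[i]))
    return out


def jacobian_times(u, v, h):
    """Jacobian-vector product J(u) v, evaluated without forming J."""
    n = len(u)
    out = []
    for i in range(n):
        lo = v[i - 1] if i > 0 else 0.0
        hi = v[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * v[i] - lo - hi) / (h * h) + math.cosh(u[i]) * v[i])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(matvec, b, rel_tol, max_iter):
    """Krylov solve of A x = b, stopped early at a relative residual of rel_tol."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(b)
    rr = dot(r, r)
    target = rel_tol * rel_tol * rr
    for _ in range(max_iter):
        if rr <= target:
            break
        ap = matvec(p)
        step = rr / dot(p, ap)
        x = [xk + step * pk for xk, pk in zip(x, p)]
        r = [rk - step * apk for rk, apk in zip(r, ap)]
        rr_new = dot(r, r)
        p = [rk + (rr_new / rr) * pk for rk, pk in zip(r, p)]
        rr = rr_new
    return x


def newton_krylov(n=40, left=2.0, eta=1e-2, alpha=1e-4, rel_tol=1e-8):
    """Inexact Newton with backtracking; returns the solution and residual norms."""
    h = 1.0 / (n + 1)
    u = [0.0] * n
    f = residual(u, left, h)
    history = [math.sqrt(dot(f, f))]
    while history[-1] > rel_tol * history[0] and len(history) < 50:
        # Inexact step: solve J s = -F only to the forcing tolerance eta.
        s = conjugate_gradient(lambda v: jacobian_times(u, v, h),
                               [-fk for fk in f], eta, 10 * n)
        lam = 1.0
        while True:  # backtracking: halve the step until the residual norm drops enough
            trial = [uk + lam * sk for uk, sk in zip(u, s)]
            f_trial = residual(trial, left, h)
            norm_trial = math.sqrt(dot(f_trial, f_trial))
            if norm_trial <= (1.0 - alpha * lam) * history[-1] or lam < 1e-8:
                break
            lam *= 0.5
        u, f = trial, f_trial
        history.append(norm_trial)
    return u, history
```

    Calling `newton_krylov()` returns the discrete solution together with the residual norm after each Newton step. In a parallel NKS code the conjugate gradient call would be replaced by a preconditioned Krylov solver whose preconditioner is applied subdomain by subdomain.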

  1. Parallel rendering

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1995-01-01

    This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.

  2. Scalable parallel methods for monolithic coupling in fluid-structure interaction with application to blood flow modeling

    SciTech Connect

    Barker, Andrew T.; Cai, Xiaochuan

    2010-02-01

    We introduce and study numerically a scalable parallel finite element solver for the simulation of blood flow in compliant arteries. The incompressible Navier-Stokes equations are used to model the fluid and coupled to an incompressible linear elastic model for the blood vessel walls. Our method features an unstructured dynamic mesh capable of modeling complicated geometries, an arbitrary Lagrangian-Eulerian framework that allows for large displacements of the moving fluid domain, monolithic coupling between the fluid and structure equations, and fully implicit time discretization. Simulations based on blood vessel geometries derived from patient-specific clinical data are performed on large supercomputers using scalable Newton-Krylov algorithms preconditioned with an overlapping restricted additive Schwarz method that preconditions the entire fluid-structure system together. The algorithm is shown to be robust and scalable for a variety of physical parameters, scaling to hundreds of processors and millions of unknowns.

  3. Is it possible to study the kinetic parameters of interaction between PNA and parallel and antiparallel DNA by stopped-flow fluorescence?

    PubMed

    Barbero, N; Cauteruccio, S; Thakare, P; Licandro, E; Viscardi, G; Visentin, S

    2016-10-01

    Peptide nucleic acids (PNAs) are among the most interesting and versatile artificial structural mimics of nucleic acids and exhibit peculiar and important properties (i.e. high chemical stability, and a high resistance to cellular enzymes and nucleases). Despite their unnatural structure, they are able to recognize and bind DNA and RNA in a highly specific and selective manner. One of the most popular, easy and reliable methods to measure the stability of PNA-DNA hybrid systems is the melting temperature, but the thermodynamic data are obtained using a large quantity of material and fail to provide information on the kinetics of the interaction. In the present work, the PNA decamer 6, with the TCACTAGATG sequence of nucleobases, and the corresponding fluorescent PNA-FITU (fluorescein isothiourea) decamer 8 were synthesized with standard manual Boc-based chemistry. The interaction of the PNA-FITU with parallel and antiparallel DNA has been studied by stopped-flow fluorescence, which is proposed as an alternative technique to obtain the kinetic parameters of the binding. The great advantage of using the stopped-flow technique is the possibility of studying the kinetics of the PNA-DNA duplex formation in a physiological environment. In particular, the fluorescence stopped-flow technique has been exploited to compare the affinity of two PNA-DNA duplexes since it can discriminate between parallel and antiparallel DNA binding.

  4. Effects of sex steroids on bones and muscles: similarities, parallels, and putative interactions in health and disease

    PubMed Central

    Carson, James A.; Manolagas, Stavros C.

    2015-01-01

    Estrogens and androgens influence the growth and maintenance of bones and muscles and are responsible for their sexual dimorphism. A decline in their circulating levels leads to loss of mass and functional integrity in both tissues. In the article, we highlight the similarities of the molecular and cellular mechanisms of action of sex steroids in the two tissues; the commonality of a critical role of mechanical forces on tissue mass and function; emerging evidence for an interplay between mechanical forces and hormonal and growth factor signals in both bones and muscles; as well as the current state of evidence for or against a cross-talk between muscles and bone. In addition, we review evidence for the parallels in the development of osteoporosis and sarcopenia with advancing age and the potential common mechanisms responsible for the age-dependent involution of these two tissues. Lastly, we discuss the striking difference in the availability of several drug therapies for the prevention and treatment of osteoporosis, as compared to none for sarcopenia. PMID:26453497

  5. 3-D Hybrid Kinetic Modeling of the Interaction Between the Solar Wind and Lunar-like Exospheric Pickup Ions in Case of Oblique/ Quasi-Parallel/Parallel Upstream Magnetic Field

    NASA Technical Reports Server (NTRS)

    Lipatov, A. S.; Farrell, W. M.; Cooper, J. F.; Sittler, E. C., Jr.; Hartle, R. E.

    2015-01-01

    The interactions between the solar wind and Moon-sized objects are determined by a set of the solar wind parameters and plasma environment of the space objects. The orientation of upstream magnetic field is one of the key factors which determines the formation and structure of bow shock wave/Mach cone or Alfven wing near the obstacle. The study of effects of the direction of the upstream magnetic field on lunar-like plasma environment is the main subject of our investigation in this paper. Photoionization, electron-impact ionization and charge exchange are included in our hybrid model. The computational model includes the self-consistent dynamics of the light (H+, He+) and heavy (Na+) pickup ions. The lunar interior is considered as a weakly conducting body. Our previous 2013 lunar work, as reported in this journal, found formation of a triple structure of the Mach cone near the Moon in the case of perpendicular upstream magnetic field. Further advances in modeling now reveal the presence of strong wave activity in the upstream solar wind and plasma wake in the cases of quasiparallel and parallel upstream magnetic fields. However, little wave activity is found for the opposite case with a perpendicular upstream magnetic field. The modeling does not show a formation of the Mach cone in the case of θ_{B,U} ≈ 0°.

  6. An O(N) and parallel approach to integral problems by a kernel-independent fast multipole method: Application to polarization and magnetization of interacting particles

    NASA Astrophysics Data System (ADS)

    Jiang, Xikai; Li, Jiyuan; Zhao, Xujun; Qin, Jian; Karpeev, Dmitry; Hernandez-Ortiz, Juan; de Pablo, Juan J.; Heinonen, Olle

    2016-08-01

    Large classes of materials systems in physics and engineering are governed by magnetic and electrostatic interactions. Continuum or mesoscale descriptions of such systems can be cast in terms of integral equations, whose direct computational evaluation requires O(N^2) operations, where N is the number of unknowns. Such a scaling, which arises from the many-body nature of the relevant Green's function, has precluded widespread adoption of integral methods for solution of large-scale scientific and engineering problems. In this work, a parallel computational approach is presented that relies on using scalable open source libraries and utilizes a kernel-independent Fast Multipole Method (FMM) to evaluate the integrals in O(N) operations, with O(N) memory cost, thereby substantially improving the scalability and efficiency of computational integral methods. We demonstrate the accuracy, efficiency, and scalability of our approach in the context of two examples. In the first, we solve a boundary value problem for a ferroelectric/ferromagnetic volume in free space. In the second, we solve an electrostatic problem involving polarizable dielectric bodies in an unbounded dielectric medium. The results from these test cases show that our proposed parallel approach, which is built on a kernel-independent FMM, can enable highly efficient and accurate simulations and allow for considerable flexibility in a broad range of applications.
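
    To make the O(N^2) baseline mentioned in this abstract concrete, the following is a minimal Python sketch of direct pairwise summation for a 1/r kernel, which is the cost a fast multipole method is meant to avoid. The function name, the choice of kernel, and the pair counter are illustrative assumptions; this is not the authors' code and it does not implement the FMM itself.

```python
import math


def direct_potentials(points, charges):
    """Direct summation of phi_i = sum over j != i of q_j / |x_i - x_j|.

    Every ordered pair (i, j) with i != j is visited once, so the work
    grows as N * (N - 1), i.e. O(N^2) in the number of unknowns N.
    """
    n = len(points)
    phi = [0.0] * n
    pair_evaluations = 0  # hypothetical counter, only to expose the scaling
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # skip the singular self-interaction
            r = math.dist(points[i], points[j])
            phi[i] += charges[j] / r
            pair_evaluations += 1
    return phi, pair_evaluations
```

    Doubling the number of points roughly quadruples the pair count in this sketch, whereas an O(N) method would only double its work.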

  7. An O(N) and parallel approach to integral problems by a kernel-independent fast multipole method: Application to polarization and magnetization of interacting particles

    DOE PAGES

    Jiang, Xikai; Li, Jiyuan; Zhao, Xujun; ...

    2016-08-10

    Large classes of materials systems in physics and engineering are governed by magnetic and electrostatic interactions. Continuum or mesoscale descriptions of such systems can be cast in terms of integral equations, whose direct computational evaluation requires O(N^2) operations, where N is the number of unknowns. Such a scaling, which arises from the many-body nature of the relevant Green's function, has precluded widespread adoption of integral methods for solution of large-scale scientific and engineering problems. In this work, a parallel computational approach is presented that relies on using scalable open source libraries and utilizes a kernel-independent Fast Multipole Method (FMM) to evaluate the integrals in O(N) operations, with O(N) memory cost, thereby substantially improving the scalability and efficiency of computational integral methods. We demonstrate the accuracy, efficiency, and scalability of our approach in the context of two examples. In the first, we solve a boundary value problem for a ferroelectric/ferromagnetic volume in free space. In the second, we solve an electrostatic problem involving polarizable dielectric bodies in an unbounded dielectric medium. Lastly, the results from these test cases show that our proposed parallel approach, which is built on a kernel-independent FMM, can enable highly efficient and accurate simulations and allow for considerable flexibility in a broad range of applications.

  8. Investigation of the interaction between As and Sb species and dissolved organic matter in the Yangtze Estuary, China, using excitation-emission matrices with parallel factor analysis.

    PubMed

    Wang, Ying; Zhang, Di; Shen, Zhen-Yao; Feng, Cheng-Hong; Zhang, Xiao

    2015-02-01

    The interactions between trivalent or pentavalent As/Sb and dissolved organic matter (DOM) in four regions (the river channel, the adjacent coastal area, and the northern and southern nearshore areas) of the Yangtze Estuary, China, were studied using fluorescence quenching titration combined with excitation-emission matrix spectroscopy and parallel factor analysis (PARAFAC). The As/Sb-DOM complexation characteristics were investigated using FTIR and UV absorbance spectroscopy and zeta potential analysis. Four protein-like components and one humic-like component were identified in the DOM from the Yangtze Estuary, China, by PARAFAC analysis. The tryptophan-like substance represented by component 2 was the dominant component and played an important role in the complexation between DOM and As/Sb. The results of complexation modeling demonstrated that the binding capacity of trivalent As/Sb with DOM was higher than that of pentavalent As/Sb with DOM. The DOM from the north nearshore area with the most acidic functional groups and greatest aromaticity possessed the highest binding capacity for trivalent and pentavalent As/Sb. The increase in the UV absorbance and the charge neutralization further indicated the interaction between As/Sb and DOM. The higher binding capacity of Sb(III) with DOM was mainly due to the hydroxyl and carboxyl groups. Our study demonstrates that the use of the advanced EEM-PARAFAC method in fluorescence quenching studies is very useful for evaluating the properties of DOM-pollutant interactions.

  9. Massively parallel multiple interacting continua formulation for modeling flow in fractured porous media using the subsurface reactive flow and transport code PFLOTRAN

    NASA Astrophysics Data System (ADS)

    Kumar, J.; Mills, R. T.; Lichtner, P. C.; Hammond, G. E.

    2010-12-01

    Fracture-dominated flows occur in numerous subsurface geochemical processes and at many different scales in rock pore structures, micro-fractures, fracture networks and faults. Fractured porous media can be modeled as multiple interacting continua which are connected to each other through transfer terms that capture the flow of mass and energy in response to pressure, temperature and concentration gradients. However, the analysis of large-scale transient problems using the multiple interacting continuum approach presents an algorithmic and computational challenge for problems with very large numbers of degrees of freedom. A generalized dual porosity model based on the Dual Continuum Disconnected Matrix approach has been implemented within a massively parallel multiphysics-multicomponent-multiphase subsurface reactive flow and transport code PFLOTRAN. Developed as part of the Department of Energy's SciDAC-2 program, PFLOTRAN provides subsurface simulation capabilities that can scale from laptops to ultrascale supercomputers, and utilizes the PETSc framework to solve the large, sparse algebraic systems that arise in complex subsurface reactive flow and transport problems. It has been successfully applied to the solution of problems composed of more than two billion degrees of freedom, utilizing up to 131,072 processor cores on Jaguar, the Cray XT5 system at Oak Ridge National Laboratory that is the world’s fastest supercomputer. Building upon the capabilities and computational efficiency of PFLOTRAN, we will present an implementation of the multiple interacting continua formulation for fractured porous media along with an application case study.

  10. Parallel machines: Parallel machine languages

    SciTech Connect

    Iannucci, R.A.

    1990-01-01

    This book presents a framework for understanding the tradeoffs between the conventional view and the dataflow view with the objective of discovering the critical hardware structures which must be present in any scalable, general-purpose parallel computer to effectively tolerate latency and synchronization costs. The author presents an approach to scalable general-purpose parallel computation. Linguistic concerns, compiling issues, intermediate language issues, and hardware/technological constraints are presented as a combined approach to architectural development. This book presents the notion of a parallel machine language.

  11. Aeroacoustic theory for noncompact wing-gust interaction

    NASA Technical Reports Server (NTRS)

    Martinez, R.; Widnall, S. E.

    1981-01-01

    Three aeroacoustic models for noncompact wing-gust interaction were developed for subsonic flow. The first is that for a two-dimensional (infinite span) wing passing through an oblique gust. The unsteady pressure field was obtained by the Wiener-Hopf technique; the airfoil loading and the associated acoustic field were calculated, respectively, by bringing the field point down onto the airfoil surface, or by letting it go to infinity. The second model is a simple spanwise superposition of two-dimensional solutions to account for three-dimensional acoustic effects of wing rotation (for a helicopter blade, or some other rotating planform) and of finiteness of wing span. A three-dimensional theory for a single gust was applied to calculate the acoustic signature in closed form due to blade-vortex interaction in helicopters. The third model is that of a quarter-infinite plate with side edge through a gust at high subsonic speed. An approximate solution for the three-dimensional loading and the associated three-dimensional acoustic field in closed form was obtained. The results reflected the acoustic effect of satisfying the correct loading condition at the side edge.

  12. Activity and interactions of methane seep microorganisms assessed by parallel transcription and FISH-NanoSIMS analyses

    PubMed Central

    Dekas, Anne E; Connon, Stephanie A; Chadwick, Grayson L; Trembath-Reichert, Elizabeth; Orphan, Victoria J

    2016-01-01

    To characterize the activity and interactions of methanotrophic archaea (ANME) and Deltaproteobacteria at a methane-seeping mud volcano, we used two complementary measures of microbial activity: a community-level analysis of the transcription of four genes (16S rRNA, methyl coenzyme M reductase A (mcrA), adenosine-5′-phosphosulfate reductase α-subunit (aprA), dinitrogenase reductase (nifH)), and a single-cell-level analysis of anabolic activity using fluorescence in situ hybridization coupled to nanoscale secondary ion mass spectrometry (FISH-NanoSIMS). Transcript analysis revealed that members of the deltaproteobacterial groups Desulfosarcina/Desulfococcus (DSS) and Desulfobulbaceae (DSB) exhibit increased rRNA expression in incubations with methane, suggestive of ANME-coupled activity. Direct analysis of anabolic activity in DSS cells in consortia with ANME by FISH-NanoSIMS confirmed their dependence on methanotrophy, with no 15NH4+ assimilation detected without methane. In contrast, DSS and DSB cells found physically independent of ANME (i.e., single cells) were anabolically active in incubations both with and without methane. These single cells therefore comprise an active ‘free-living' population, and are not dependent on methane or ANME activity. We investigated the possibility of N2 fixation by seep Deltaproteobacteria and detected nifH transcripts closely related to those of cultured diazotrophic Deltaproteobacteria. However, nifH expression was methane-dependent. 15N2 incorporation was not observed in single DSS cells, but was detected in single DSB cells. Interestingly, 15N2 incorporation in single DSB cells was methane-dependent, raising the possibility that DSB cells acquired reduced 15N products from diazotrophic ANME while spatially coupled, and then subsequently dissociated. With this combined data set we address several outstanding questions in methane seep microbial ecosystems and highlight the benefit of measuring microbial activity in the

  13. Activity and interactions of methane seep microorganisms assessed by parallel transcription and FISH-NanoSIMS analyses.

    PubMed

    Dekas, Anne E; Connon, Stephanie A; Chadwick, Grayson L; Trembath-Reichert, Elizabeth; Orphan, Victoria J

    2016-03-01

    To characterize the activity and interactions of methanotrophic archaea (ANME) and Deltaproteobacteria at a methane-seeping mud volcano, we used two complementary measures of microbial activity: a community-level analysis of the transcription of four genes (16S rRNA, methyl coenzyme M reductase A (mcrA), adenosine-5'-phosphosulfate reductase α-subunit (aprA), dinitrogenase reductase (nifH)), and a single-cell-level analysis of anabolic activity using fluorescence in situ hybridization coupled to nanoscale secondary ion mass spectrometry (FISH-NanoSIMS). Transcript analysis revealed that members of the deltaproteobacterial groups Desulfosarcina/Desulfococcus (DSS) and Desulfobulbaceae (DSB) exhibit increased rRNA expression in incubations with methane, suggestive of ANME-coupled activity. Direct analysis of anabolic activity in DSS cells in consortia with ANME by FISH-NanoSIMS confirmed their dependence on methanotrophy, with no 15NH4+ assimilation detected without methane. In contrast, DSS and DSB cells found physically independent of ANME (i.e., single cells) were anabolically active in incubations both with and without methane. These single cells therefore comprise an active 'free-living' population, and are not dependent on methane or ANME activity. We investigated the possibility of N2 fixation by seep Deltaproteobacteria and detected nifH transcripts closely related to those of cultured diazotrophic Deltaproteobacteria. However, nifH expression was methane-dependent. 15N2 incorporation was not observed in single DSS cells, but was detected in single DSB cells. Interestingly, 15N2 incorporation in single DSB cells was methane-dependent, raising the possibility that DSB cells acquired reduced 15N products from diazotrophic ANME while spatially coupled, and then subsequently dissociated. With this combined data set we address several outstanding questions in methane seep microbial ecosystems and highlight the benefit of measuring microbial activity in

  14. Parallel pipelining

    SciTech Connect

    Joseph, D.D.; Bai, R.; Liao, T.Y.; Huang, A.; Hu, H.H.

    1995-09-01

    In this paper the authors introduce the idea of parallel pipelining for water-lubricated transportation of oil (or other viscous material). A parallel system can have major advantages over a single pipe with respect to the cost of maintenance and continuous operation of the system, to the pressure gradients required to restart a stopped system and to the reduction and even elimination of the fouling of pipe walls in continuous operation. The authors show that the action of capillarity in small pipes is more favorable for restart than in large pipes. In a parallel pipeline system, they estimate the number of small pipes needed to deliver the same oil flux as in one larger pipe as N = (R/r)^α, where r and R are the radii of the small and large pipes, respectively, and α = 4 or 19/7 when the lubricating water flow is laminar or turbulent, respectively.
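
    As a worked illustration of the pipe-count estimate N = (R/r)^α quoted in this abstract, the short Python sketch below evaluates it for both flow regimes. The function name and the example radii are hypothetical; only the exponents 4 (laminar) and 19/7 (turbulent) come from the abstract.

```python
LAMINAR_ALPHA = 4.0            # exponent quoted for laminar lubricating flow
TURBULENT_ALPHA = 19.0 / 7.0   # exponent quoted for turbulent lubricating flow


def small_pipes_needed(large_radius, small_radius, turbulent=False):
    """Estimate N = (R / r) ** alpha, the number of small pipes of radius r
    that deliver the same oil flux as one large pipe of radius R."""
    alpha = TURBULENT_ALPHA if turbulent else LAMINAR_ALPHA
    return (large_radius / small_radius) ** alpha


# Hypothetical example: the small pipes have half the radius of the large one.
laminar_count = small_pipes_needed(2.0, 1.0)                    # 2 ** 4
turbulent_count = small_pipes_needed(2.0, 1.0, turbulent=True)  # 2 ** (19/7)
```

    With these example radii the laminar estimate is 16 pipes and the turbulent estimate is about 6.6, so under this formula the turbulent regime needs far fewer small pipes for the same flux.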

  15. Parallel Dislocation Simulator

    SciTech Connect

    2006-10-30

    ParaDiS is software capable of simulating the motion, evolution, and interaction of dislocation networks in single crystals using massively parallel computer architectures. The software is capable of outputting the stress-strain response of a single crystal whose plastic deformation is controlled by the dislocation processes.

  16. Non-equilibrium reaction and relaxation dynamics in a strongly interacting explicit solvent: F + CD3CN treated with a parallel multi-state EVB model

    NASA Astrophysics Data System (ADS)

    Glowacki, David R.; Orr-Ewing, Andrew J.; Harvey, Jeremy N.

    2015-07-01

    We describe a parallelized linear-scaling computational framework developed to implement arbitrarily large multi-state empirical valence bond (MS-EVB) calculations within CHARMM and TINKER. Forces are obtained using the Hellmann-Feynman relationship, giving continuous gradients, and good energy conservation. Utilizing multi-dimensional Gaussian coupling elements fit to explicitly correlated coupled cluster theory, we built a 64-state MS-EVB model designed to study the F + CD3CN → DF + CD2CN reaction in CD3CN solvent (recently reported in Dunning et al. [Science 347(6221), 530 (2015)]). This approach allows us to build a reactive potential energy surface whose balanced accuracy and efficiency considerably surpass what we could achieve otherwise. We ran molecular dynamics simulations to examine a range of observables which follow in the wake of the reactive event: energy deposition in the nascent reaction products, vibrational relaxation rates of excited DF in CD3CN solvent, equilibrium power spectra of DF in CD3CN, and time dependent spectral shifts associated with relaxation of the nascent DF. Many of our results are in good agreement with time-resolved experimental observations, providing evidence for the accuracy of our MS-EVB framework in treating both the solute and solute/solvent interactions. The simulations provide additional insight into the dynamics at sub-picosecond time scales that are difficult to resolve experimentally. In particular, the simulations show that (immediately following deuterium abstraction) the nascent DF finds itself in a non-equilibrium regime in two different respects: (1) it is highly vibrationally excited, with ~23 kcal mol⁻¹ localized in the stretch and (2) its post-reaction solvation environment, in which it is not yet hydrogen-bonded to CD3CN solvent molecules, is intermediate between the non-interacting gas-phase limit and the solution-phase equilibrium limit. Vibrational relaxation of the nascent DF results in a spectral

  17. Non-equilibrium reaction and relaxation dynamics in a strongly interacting explicit solvent: F + CD3CN treated with a parallel multi-state EVB model

    SciTech Connect

    Glowacki, David R.; Orr-Ewing, Andrew J.; Harvey, Jeremy N.

    2015-07-28

    We describe a parallelized linear-scaling computational framework developed to implement arbitrarily large multi-state empirical valence bond (MS-EVB) calculations within CHARMM and TINKER. Forces are obtained using the Hellmann-Feynman relationship, giving continuous gradients, and good energy conservation. Utilizing multi-dimensional Gaussian coupling elements fit to explicitly correlated coupled cluster theory, we built a 64-state MS-EVB model designed to study the F + CD3CN → DF + CD2CN reaction in CD3CN solvent (recently reported in Dunning et al. [Science 347(6221), 530 (2015)]). This approach allows us to build a reactive potential energy surface whose balanced accuracy and efficiency considerably surpass what we could achieve otherwise. We ran molecular dynamics simulations to examine a range of observables which follow in the wake of the reactive event: energy deposition in the nascent reaction products, vibrational relaxation rates of excited DF in CD3CN solvent, equilibrium power spectra of DF in CD3CN, and time dependent spectral shifts associated with relaxation of the nascent DF. Many of our results are in good agreement with time-resolved experimental observations, providing evidence for the accuracy of our MS-EVB framework in treating both the solute and solute/solvent interactions. The simulations provide additional insight into the dynamics at sub-picosecond time scales that are difficult to resolve experimentally. In particular, the simulations show that (immediately following deuterium abstraction) the nascent DF finds itself in a non-equilibrium regime in two different respects: (1) it is highly vibrationally excited, with ~23 kcal mol⁻¹ localized in the stretch and (2) its post-reaction solvation environment, in which it is not yet hydrogen-bonded to CD3CN solvent molecules, is intermediate between the non-interacting gas-phase limit and the solution-phase equilibrium limit. Vibrational

  18. Insights into the interaction between carbamazepine and natural dissolved organic matter in the Yangtze Estuary using fluorescence excitation-emission matrix spectra coupled with parallel factor analysis.

    PubMed

    Wang, Ying; Zhang, Manman; Fu, Jun; Li, Tingting; Wang, Jinggang; Fu, Yingyu

    2016-10-01

    The interaction between carbamazepine (CBZ) and dissolved organic matter (DOM) from three zones (the nearshore, the river channel, and the coastal areas) in the Yangtze Estuary was investigated using fluorescence quenching titration combined with excitation-emission matrix spectra and parallel factor analysis (PARAFAC). The complexation between CBZ and DOM was demonstrated by the increase in hydrogen bonding and the disappearance of the C=O stretch obtained from the Fourier transform infrared spectroscopy analysis. The results indicated that two protein-like substances (components 2 and 3) and two humic-like substances (components 1 and 4) were identified in the DOM from the Yangtze Estuary. The fluorescence quenching curves of each component with the addition of CBZ and the Ryan and Weber model calculation results both demonstrated that the different components exhibited different complexation activities with CBZ. The protein-like components had a stronger affinity with CBZ than did the humic-like substances. On the other hand, the autochthonous tyrosine-like C2 played an important role in the complexation with DOM from the river channel and coastal areas, while C3, influenced by anthropogenic activities, showed an obvious effect in the nearshore area. DOM from the river channel had the highest binding capacity for CBZ, which may be attributed to the relatively high content of phenolic groups in the DOM.

  19. The Double Hierarchy Method. A parallel 3D contact method for the interaction of spherical particles with rigid FE boundaries using the DEM

    NASA Astrophysics Data System (ADS)

    Santasusana, Miquel; Irazábal, Joaquín; Oñate, Eugenio; Carbonell, Josep Maria

    2016-07-01

    In this work, we present a new methodology for the treatment of the contact interaction between rigid boundaries and spherical discrete elements (DE). Rigid body parts are present in most large-scale simulations. The surfaces of the rigid parts are commonly meshed with a finite element-like (FE) discretization. The contact detection and calculation between those DEs and the discretized boundaries are not straightforward and have been addressed by different approaches. The algorithm presented in this paper considers the contact of the DEs with the geometric primitives of a FE mesh, i.e. facet, edge or vertex. To do so, the original hierarchical method presented by Horner et al. (J Eng Mech 127(10):1027-1032, 2001) is extended with a new insight leading to a robust, fast and accurate 3D contact algorithm which is fully parallelizable. The implementation of the method has been developed in order to deal ideally with triangles and quadrilaterals. If the boundaries are discretized with another type of geometry, the method can be easily extended to higher order planar convex polyhedra. A detailed description of the procedure followed to treat a wide range of cases is presented. The developed algorithm is described and its validity is verified with several practical examples. The parallelization capabilities and the obtained performance are presented with the study of an industrial application example.
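
    The facet/edge/vertex classification mentioned in this abstract can be illustrated with a minimal Python sketch that finds the closest point on one triangle to a sphere centre and reports which primitive it lies on. This is a standard closest-point-on-triangle test, not the Double Hierarchy Method itself; all function names are hypothetical, and the handling of contacts shared between neighbouring mesh entities, which the paper addresses, is not attempted here.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _along(origin, direction, t):
    return tuple(o + t * d for o, d in zip(origin, direction))


def closest_point_on_triangle(p, a, b, c):
    """Return (closest point, primitive), primitive in {'vertex','edge','facet'}."""
    ab, ac, ap = _sub(b, a), _sub(c, a), _sub(p, a)
    d1, d2 = _dot(ab, ap), _dot(ac, ap)
    if d1 <= 0 and d2 <= 0:
        return a, "vertex"
    bp = _sub(p, b)
    d3, d4 = _dot(ab, bp), _dot(ac, bp)
    if d3 >= 0 and d4 <= d3:
        return b, "vertex"
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return _along(a, ab, d1 / (d1 - d3)), "edge"        # edge ab
    cp = _sub(p, c)
    d5, d6 = _dot(ab, cp), _dot(ac, cp)
    if d6 >= 0 and d5 <= d6:
        return c, "vertex"
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return _along(a, ac, d2 / (d2 - d6)), "edge"        # edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        t = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return _along(b, _sub(c, b), t), "edge"             # edge bc
    denom = va + vb + vc
    v, w = vb / denom, vc / denom
    return _along(_along(a, ab, v), ac, w), "facet"         # interior


def sphere_triangle_contact(center, radius, a, b, c):
    """Return (primitive, contact point, penetration depth) or None if no contact."""
    point, primitive = closest_point_on_triangle(center, a, b, c)
    gap = math.dist(center, point)
    if gap > radius:
        return None
    return primitive, point, radius - gap
```

    For a sphere hovering above the interior of a triangle the sketch reports a facet contact, while a sphere placed beside an edge or beyond a corner reports an edge or vertex contact instead.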

  20. Research investigation of helicopter main rotor/tail rotor interaction noise

    NASA Technical Reports Server (NTRS)

    Fitzgerald, J.; Kohlhepp, F.

    1988-01-01

    Acoustic measurements were obtained in the Langley 14 x 22 Foot Subsonic Wind Tunnel to study the aeroacoustic interaction of 1/5th scale main rotor, tail rotor, and fuselage models. An extensive aeroacoustic data base was acquired for main rotor, tail rotor, and fuselage aerodynamic interaction for moderate forward speed flight conditions. The details of the rotor models, experimental design and procedure, aerodynamic and acoustic data acquisition and reduction are presented. The model was initially operated in trim for selected fuselage angle of attack, main rotor tip-path-plane angle, and main rotor thrust combinations. The effects of repositioning the tail rotor in the main rotor wake and the corresponding tail rotor countertorque requirements were determined. Each rotor was subsequently tested in isolation at the thrust and angle of attack combinations for trim. The acoustic data indicated that the noise was primarily dominated by the main rotor, especially for moderate speed main rotor blade-vortex interaction conditions. The tail rotor noise increased when the main rotor was removed, indicating that tail rotor inflow was improved with the main rotor present.

  1. Start/Pat; A parallel-programming toolkit

    SciTech Connect

    Appelbe, B.; Smith, K.; McDowell, C.

    1989-07-01

    How can you make Fortran code parallel without isolating the programmer from learning to understand and exploit parallelism effectively? With an interactive toolkit that automates parallelization as it educates. This paper discusses the Start/Pat toolkit.

  2. Getting a feel for parameters: using interactive parallel plots as a tool for parameter identification in the new rainfall-runoff model WALRUS

    NASA Astrophysics Data System (ADS)

    Brauer, Claudia; Torfs, Paul; Teuling, Ryan; Uijlenhoet, Remko

    2015-04-01

    Recently, we developed the Wageningen Lowland Runoff Simulator (WALRUS) to fill the gap between complex, spatially distributed models often used in lowland catchments and simple, parametric models which have mostly been developed for mountainous catchments (Brauer et al., 2014a,b). This parametric rainfall-runoff model can be used all over the world in both freely draining lowland catchments and polders with controlled water levels. The open source model code is implemented in R and can be downloaded from www.github.com/ClaudiaBrauer/WALRUS. The structure and code of WALRUS are simple, which facilitates detailed investigation of the effect of parameters on all model variables. WALRUS contains only four parameters requiring calibration; they are intended to have a strong, qualitative relation with catchment characteristics. Parameter estimation remains a challenge, however. The model structure contains three main feedbacks: (1) between groundwater and surface water; (2) between saturated and unsaturated zone; (3) between catchment wetness and (quick/slow) flowroute division. These feedbacks represent essential rainfall-runoff processes in lowland catchments, but increase the risk of parameter dependence and equifinality. Therefore, model performance should not only be judged based on a comparison between modelled and observed discharges, but also based on the plausibility of the internal modelled variables. Here, we present a method to analyse the effect of parameter values on internal model states and fluxes in a qualitative and intuitive way using interactive parallel plotting. We applied WALRUS to ten Dutch catchments with different sizes, slopes and soil types and both freely draining and polder areas. The model was run with a large number of parameter sets, which were created using Latin Hypercube Sampling. The model output was characterised in terms of several signatures, both measures of goodness of fit and statistics of internal model variables (such as the
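
    For readers unfamiliar with Latin Hypercube Sampling, which this abstract names as the way parameter sets were generated, the following is a minimal Python sketch of the general technique. The function name, the parameter bounds, and the seed are hypothetical; WALRUS itself is implemented in R, and this sketch is not taken from its code.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples parameter sets so that, for every parameter, each of
    n_samples equal-width strata of its range is hit exactly once.

    bounds is a list of (low, high) pairs, one per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum [k/n, (k+1)/n) of the unit interval.
        unit = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so that strata are paired randomly across parameters.
        rng.shuffle(unit)
        columns.append([low + u * (high - low) for u in unit])
    # Transpose: one tuple of parameter values per sample.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical example: five sets of two parameters with made-up ranges.
parameter_sets = latin_hypercube(5, [(0.0, 1.0), (10.0, 20.0)], seed=0)
```

    Compared with plain random sampling, this stratification spreads the samples across the full range of every parameter, which is useful when each model run is costly.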

  3. A Genomic and Protein-Protein Interaction Analyses of Nonsyndromic Hearing Impairment in Cameroon Using Targeted Genomic Enrichment and Massively Parallel Sequencing.

    PubMed

    Lebeko, Kamogelo; Manyisa, Noluthando; Chimusa, Emile R; Mulder, Nicola; Dandara, Collet; Wonkam, Ambroise

    2017-02-01

    Hearing impairment (HI) is one of the leading causes of disability in the world, impacting the social, economic, and psychological well-being of the affected individual. This is particularly true in sub-Saharan Africa, which carries one of the highest burdens of this condition. Despite this, there are limited data on the most prevalent genes or mutations that cause HI among sub-Saharan Africans. Next-generation technologies, such as targeted genomic enrichment and massively parallel sequencing, offer new promise in this context. This study reports, for the first time to the best of our knowledge, on the prevalence of novel mutations identified through a platform of 116 HI genes (OtoSCOPE®), among 82 African probands with HI. Only variants OTOF NM_194248.2:c.766-2A>G and MYO7A NM_000260.3:c.1996C>T, p.Arg666Stop were found in 3 (3.7%) and 5 (6.1%) patients, respectively. In addition and uniquely, the analysis of protein-protein interactions (PPI), through interrogation of gene subnetworks, using a custom script and two databases (Enrichr and PANTHER), and an algorithm in the igraph package of R, identified the enrichment of sensory perception and mechanical stimulus biological processes, and the most significant molecular functions of these variants pertained to binding or structural activity. Furthermore, 10 genes (MYO7A, MYO6, KCTD3, NUMA1, MYH9, KCNQ1, UBC, DIAPH1, PSMC2, and RDX) were identified as significant hubs within the subnetworks. Results reveal that the novel variants identified among familial cases of HI in Cameroon are not common, and PPI analysis has highlighted the role of 10 genes, potentially important in understanding HI genomics among Africans.

  4. Development of a prototype PET scanner with depth-of-interaction measurement using solid-state photomultiplier arrays and parallel readout electronics.

    PubMed

    Shao, Yiping; Sun, Xishan; Lan, Kejian A; Bircher, Chad; Lou, Kai; Deng, Zhi

    2014-03-07

    In this study, we developed a prototype animal PET by applying several novel technologies to use solid-state photomultiplier (SSPM) arrays to measure the depth of interaction (DOI) and improve imaging performance. Each PET detector has an 8 × 8 array of about 1.9 × 1.9 × 30.0 mm³ lutetium-yttrium-oxyorthosilicate scintillators, with each end optically connected to an SSPM array (16 channels in a 4 × 4 matrix) through a light guide to enable continuous DOI measurement. Each SSPM has an active area of about 3 × 3 mm², and its output is read by a custom-developed application-specific integrated circuit to directly convert analogue signals to digital timing pulses that encode the interaction information. These pulses are transferred to and are decoded by a field-programmable gate array-based time-to-digital convertor for coincident event selection and data acquisition. The independent readout of each SSPM and the parallel signal process can significantly improve the signal-to-noise ratio and enable the use of flexible algorithms for different data processes. The prototype PET consists of two rotating detector panels on a portable gantry with four detectors in each panel to provide 16 mm axial and variable transaxial field-of-view (FOV) sizes. List-mode ordered subset expectation maximization image reconstruction was implemented. The measured mean energy, coincidence timing and DOI resolution for a crystal were about 17.6%, 2.8 ns and 5.6 mm, respectively. The measured transaxial resolutions at the center of the FOV were 2.0 mm and 2.3 mm for images reconstructed with and without DOI, respectively. In addition, the resolutions across the FOV with DOI were substantially better than those without DOI. The quality of PET images of both a hot-rod phantom and mouse acquired with DOI was much higher than that of images obtained without DOI. This study demonstrates that SSPM arrays and advanced readout/processing electronics can be used to develop a practical DOI

  5. Development of a prototype PET scanner with depth-of-interaction measurement using solid-state photomultiplier arrays and parallel readout electronics

    PubMed Central

    Shao, Yiping; Sun, Xishan; Lan, Kejian A.; Bircher, Chad; Lou, Kai; Deng, Zhi

    2014-01-01

    In this study, we developed a prototype animal PET by applying several novel technologies to use the solid-state photomultiplier (SSPM) arrays for measuring the depth-of-interaction (DOI) and improving imaging performance. Each PET detector has an 8×8 array of about 1.9×1.9×30.0 mm^3 lutetium-yttrium-oxyorthosilicate (LYSO) scintillators, with each end optically connected to an SSPM array (16 channels in a 4×4 matrix) through a light guide to enable continuous DOI measurement. Each SSPM has an active area of about 3×3 mm^2, and its output is read by a custom-developed application-specific-integrated-circuit (ASIC) to directly convert analog signals to digital timing pulses that encode the interaction information. These pulses are transferred to and decoded by a field-programmable-gate-array (FPGA) based time-to-digital convertor for coincident event selection and data acquisition. The independent readout of each SSPM and the parallel signal process can significantly improve the signal-to-noise ratio and enable the use of flexible algorithms for different data processes. The prototype PET consists of two rotating detector panels on a portable gantry with four detectors in each panel to provide 16 mm axial and variable transaxial field-of-view (FOV) sizes. List-mode ordered-subset-expectation-maximization image reconstruction was implemented. The measured mean energy, coincidence timing, and DOI resolution for a crystal were about 17.6%, 2.8 ns, and 5.6 mm, respectively. The measured transaxial resolutions at the center of the FOV were 2.0 mm and 2.3 mm for images reconstructed with and without DOI, respectively. In addition, the resolutions across the FOV with DOI were substantially better than those without DOI. The quality of PET images of both a hot-rod phantom and mouse acquired with DOI was much higher than that of images obtained without DOI. This study demonstrates that SSPM arrays and advanced readout/processing electronics can be used to develop a practical

  6. A Fast Parallel Simulation Code for Interaction between Proto-Planetary Disk and Embedded Proto-Planets: Implementation for 3D Code

    SciTech Connect

    Li, Shengtai; Li, Hui

    2012-06-14

    We develop a 3D simulation code for interaction between the proto-planetary disk and embedded proto-planets. The protoplanetary disk is treated as a three-dimensional (3D), self-gravitating gas whose motion is described by the locally isothermal Navier-Stokes equations in a spherical coordinate system centered on the star. The differential equations for the disk are similar to those given in Kley et al. (2009) with a different gravitational potential that is defined in Nelson et al. (2000). The equations are solved by a directional split Godunov method for the inviscid Euler equations plus an operator-split method for the viscous source terms. We use a sub-cycling technique for the azimuthal sweep to alleviate the time step restriction. We also extend the FARGO scheme of Masset (2000), as modified in Li et al. (2001), to our 3D code to accelerate the transport in the azimuthal direction. Furthermore, we have implemented a reduced 2D (r, θ) and a fully 3D self-gravity solver on our uniform disk grid, which extends our 2D method (Li, Buoni, & Li 2008) to 3D. This solver uses a mode cut-off strategy and combines FFT in the azimuthal direction and direct summation in the radial and meridional directions. An initial axis-symmetric equilibrium disk is generated via iteration between the disk density profile and the 2D disk self-gravity. We do not need any softening in the disk self-gravity calculation as we have used a shifted grid method (Li et al. 2008) to calculate the potential. The motion of the planet is limited to the mid-plane and the equations are the same as given in D'Angelo et al. (2005), which we adapted to polar coordinates with a fourth-order Runge-Kutta solver. The disk gravitational force on the planet is assumed to evolve linearly with time between two hydrodynamics time steps. The planetary potential acting on the disk is calculated accurately with a small softening given by a cubic-spline form (Kley et al. 2009). Since the torque is extremely sensitive to

  7. Detached-eddy simulation of flow non-linearity of fluid-structural interactions using high order schemes and parallel computation

    NASA Astrophysics Data System (ADS)

    Wang, Baoyuan

    The objective of this research is to develop an efficient and accurate methodology to resolve flow non-linearity of fluid-structural interaction. To achieve this purpose, a numerical strategy to apply the detached-eddy simulation (DES) with a fully coupled fluid-structural interaction model is established for the first time. The following novel numerical algorithms are also created: a general sub-domain boundary mapping procedure for parallel computation to reduce wall clock simulation time, an efficient and low diffusion E-CUSP (LDE) scheme used as a Riemann solver to resolve discontinuities with minimal numerical dissipation, and an implicit high order accuracy weighted essentially non-oscillatory (WENO) scheme to capture shock waves. The detached-eddy simulation is based on the model proposed by Spalart in 1997. Near solid walls within wall boundary layers, the Reynolds averaged Navier-Stokes (RANS) equations are solved. Outside of the wall boundary layers, the 3D filtered compressible Navier-Stokes equations are solved based on large eddy simulation (LES). The Spalart-Allmaras one equation turbulence model is solved to provide the Reynolds stresses in the RANS region and the subgrid scale stresses in the LES region. An improved 5th order finite differencing weighted essentially non-oscillatory (WENO) scheme with an optimized epsilon value is employed for the inviscid fluxes. The new LDE scheme used with the WENO scheme is able to capture crisp shock profiles and exact contact surfaces. A set of fully conservative 4th order finite central differencing schemes is used for the viscous terms. The 3D Navier-Stokes equations are discretized based on a conservative finite differencing scheme. The unfactored line Gauss-Seidel relaxation iteration is employed for time marching. A general sub-domain boundary mapping procedure is developed for arbitrary topology multi-block structured grids with grid points matched on sub-domain boundaries. Extensive numerical experiments

  8. Prediction of rotating-blade vortex noise from noise of nonrotating blades

    NASA Technical Reports Server (NTRS)

    Fink, M. R.; Schlinker, R. H.; Amiet, R. K.

    1976-01-01

    Measurements were conducted in an acoustic wind tunnel to determine vortex noise of nonrotating circular cylinders and NACA 0012 airfoils. Both constant-width and spanwise tapered models were tested at a low turbulence level. The constant-diameter cylinder and constant-chord airfoil also were tested in the turbulent wake generated by an upstream cylinder or airfoil. Vortex noise radiation from nonrotating circular cylinders at Reynolds numbers matching those of the rotating-blade tests was found to be strongly dependent on surface conditions and Reynolds number. Vortex noise of rotating circular cylinder blades, operating with and without the shed wake blown downstream, could be predicted using data for nonrotating circular cylinders as functions of Reynolds number. Vortex noise of nonrotating airfoils was found to be trailing-edge noise at a frequency equal to that predicted for maximum-amplitude Tollmien-Schlichting instability waves at the trailing edge.

  9. Serial Order: A Parallel Distributed Processing Approach.

    ERIC Educational Resources Information Center

    Jordan, Michael I.

    Human behavior shows a variety of serially ordered action sequences. This paper presents a theory of serial order which describes how sequences of actions might be learned and performed. In this theory, parallel interactions across time (coarticulation) and parallel interactions across space (dual-task interference) are viewed as two aspects of a…

  10. Parallel pivoting combined with parallel reduction

    NASA Technical Reports Server (NTRS)

    Alaghband, Gita

    1987-01-01

    Parallel algorithms for triangularization of large, sparse, and unsymmetric matrices are presented. The method combines the parallel reduction with a new parallel pivoting technique, control over generations of fill-ins and a check for numerical stability, all done in parallel with the work being distributed over the active processes. The parallel technique uses the compatibility relation between pivots to identify parallel pivot candidates and uses the Markowitz number of pivots to minimize fill-in. This technique is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds.
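
    The selection step described above can be illustrated with a minimal sketch. It is not the report's implementation: it assumes the usual Markowitz count (r_i - 1)(c_j - 1) for a nonzero at (i, j), and it assumes that two pivots (i, j) and (k, l) are compatible when the entries at (i, l) and (k, j) are both zero. The function names and the greedy selection order are hypothetical.

```python
def markowitz_counts(pattern):
    """Map each nonzero position (i, j) to (r_i - 1) * (c_j - 1)."""
    rows, cols = {}, {}
    for i, j in pattern:
        rows[i] = rows.get(i, 0) + 1
        cols[j] = cols.get(j, 0) + 1
    return {(i, j): (rows[i] - 1) * (cols[j] - 1) for i, j in pattern}


def compatible(p, q, pattern):
    """Assumed rule: pivots in distinct rows and columns whose
    cross positions (i, l) and (k, j) are structurally zero."""
    (i, j), (k, l) = p, q
    return i != k and j != l and (i, l) not in pattern and (k, j) not in pattern


def greedy_parallel_pivots(pattern):
    """Pick a mutually compatible pivot set, lowest Markowitz count first."""
    counts = markowitz_counts(pattern)
    chosen = []
    for cand in sorted(pattern, key=lambda p: (counts[p], p)):
        if all(compatible(cand, c, pattern) for c in chosen):
            chosen.append(cand)
    return chosen


# Sparsity pattern of a small 4 x 4 matrix, given as (row, column) pairs.
pattern = {(0, 0), (0, 1), (1, 1), (2, 0), (2, 2), (3, 3)}
pivots = greedy_parallel_pivots(pattern)
```

    Because the chosen pivots do not touch each other's rows or columns, their elimination steps could in principle be carried out at the same time, and the set is recomputed after each round as the pattern changes.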

  11. Special parallel processing workshop

    SciTech Connect

    1994-12-01

    This report contains viewgraphs from the Special Parallel Processing Workshop. These viewgraphs deal with topics such as parallel processing performance, message passing, queue structure, and other basic concepts dealing with parallel processing.

  12. Parallel hierarchical method in networks

    NASA Astrophysics Data System (ADS)

    Malinochka, Olha; Tymchenko, Leonid

    2007-09-01

    This method of parallel-hierarchical Q-transformation offers a new approach to creating a computing medium: parallel-hierarchical (PH) networks, investigated here as a model of a neuron-like data processing scheme [1-5]. The approach has a number of advantages over other methods of forming neuron-like media (for example, the known methods of forming artificial neural networks). Its main advantage is the use of the multilevel parallel interaction dynamics of information signals at different hierarchy levels of computer networks, which makes it possible to exploit known natural features of the organization of computation: the topographic nature of mapping, simultaneity (parallelism) of signal operation, the inlaid structure and rough hierarchy of the cortex, and a spatially correlated-in-time mechanism of perception and training [5].

  13. Runtime volume visualization for parallel CFD

    NASA Technical Reports Server (NTRS)

    Ma, Kwan-Liu

    1995-01-01

    This paper discusses some aspects of design of a data distributed, massively parallel volume rendering library for runtime visualization of parallel computational fluid dynamics simulations in a message-passing environment. Unlike the traditional scheme in which visualization is a postprocessing step, the rendering is done in place on each node processor. Computational scientists who run large-scale simulations on a massively parallel computer can thus perform interactive monitoring of their simulations. The current library provides an interface to handle volume data on rectilinear grids. The same design principles can be generalized to handle other types of grids. For demonstration, we run a parallel Navier-Stokes solver making use of this rendering library on the Intel Paragon XP/S. The interactive visual response achieved is found to be very useful. Performance studies show that the parallel rendering process is scalable with the size of the simulation as well as with the parallel computer.

  14. Parallel rendering techniques for massively parallel visualization

    SciTech Connect

    Hansen, C.; Krogh, M.; Painter, J.

    1995-07-01

    As the resolution of simulation models increases, scientific visualization algorithms which take advantage of the large memory and parallelism of Massively Parallel Processors (MPPs) are becoming increasingly important. For large applications, rendering on the MPP tends to be preferable to rendering on a graphics workstation due to the MPP's abundant resources: memory, disk, and numerous processors. The challenge becomes developing algorithms that can exploit these resources while minimizing overhead, typically communication costs. This paper will describe recent efforts in parallel rendering for polygonal primitives as well as parallel volumetric techniques. This paper presents rendering algorithms, developed for massively parallel processors (MPPs), for polygon, sphere, and volumetric data. The polygon algorithm uses a data parallel approach whereas the sphere and volume renderers use a MIMD approach. Implementations for these algorithms are presented for the Thinking Machines Corporation CM-5 MPP.

  15. Computer-Aided Parallelizer and Optimizer

    NASA Technical Reports Server (NTRS)

    Jin, Haoqiang

    2011-01-01

    The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.

  16. Applied Parallel Metadata Indexing

    SciTech Connect

    Jacobi, Michael R

    2012-08-01

    The GPFS Archive is a parallel archive used by hundreds of users in the Turquoise collaboration network. It houses 4+ petabytes of data in more than 170 million files. Currently, users must navigate the file system to retrieve their data, requiring them to remember file paths and names. A better solution might allow users to tag data with meaningful labels and search the archive using standard and user-defined metadata, while maintaining security. Last summer, I developed the backend to a tool that adheres to these design goals. The backend works by importing GPFS metadata into a MongoDB cluster, which is then indexed on each attribute. This summer, the author implemented security and developed the user interface for the search tool. To meet security requirements, each database table is associated with a single user, which only stores records that the user may read, and requires a set of credentials to access. The interface to the search tool is implemented using FUSE (Filesystem in USErspace). FUSE is an intermediate layer that intercepts file system calls and allows the developer to redefine how those calls behave. In the case of this tool, FUSE interfaces with MongoDB to issue queries and populate output. A FUSE implementation is desirable because it allows users to interact with the search tool using commands they are already familiar with. These security and interface additions are essential for a usable product.

  17. Parallel computations and control of adaptive structures

    NASA Technical Reports Server (NTRS)

    Park, K. C.; Alvin, Kenneth F.; Belvin, W. Keith; Chong, K. P. (Editor); Liu, S. C. (Editor); Li, J. C. (Editor)

    1991-01-01

    The equations of motion for structures with adaptive elements for vibration control are presented for parallel computations to be used as a software package for real-time control of flexible space structures. A brief introduction to the state-of-the-art parallel computational capability is also presented. Time marching strategies are developed for an effective use of massive parallel mapping, partitioning, and the necessary arithmetic operations. An example is offered for the simulation of control-structure interaction on a parallel computer, and the impact of the approach presented for applications in disciplines other than the aerospace industry is assessed.

  18. Parallel flow diffusion battery

    DOEpatents

    Yeh, Hsu-Chi; Cheng, Yung-Sung

    1984-08-07

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  19. Parallel flow diffusion battery

    DOEpatents

    Yeh, H.C.; Cheng, Y.S.

    1984-01-01

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  20. Wave-particle interactions with parallel whistler waves: Nonlinear and time-dependent effects revealed by particle-in-cell simulations

    SciTech Connect

    Camporeale, Enrico; Zimbardo, Gaetano

    2015-09-15

    We present a self-consistent Particle-in-Cell simulation of the resonant interactions between anisotropic energetic electrons and a population of whistler waves, with parameters relevant to the Earth's radiation belt. By tracking PIC particles and comparing with test-particle simulations, we emphasize the importance of including nonlinear effects and time evolution in the modeling of wave-particle interactions, which are excluded in the resonant limit of quasi-linear theory routinely used in radiation belt studies. In particular, we show that pitch angle diffusion is enhanced during the linear growth phase, and it rapidly saturates well before a single bounce period. This calls into question the widely used bounce average performed in most radiation belt diffusion calculations. Furthermore, we discuss how the saturation is related to the fact that the domain in which the particles' pitch angle diffuses is bounded, and to the well-known problem of the 90° diffusion barrier.

  1. Parallel processing ITS

    SciTech Connect

    Fan, W.C.; Halbleib, J.A. Sr.

    1996-09-01

    This report provides a users' guide for parallel processing ITS on a UNIX workstation network, a shared-memory multiprocessor or a massively-parallel processor. The parallelized version of ITS is based on a master/slave model with message passing. Parallel issues such as random number generation, load balancing, and communication software are briefly discussed. Timing results for example problems are presented for demonstration purposes.

  2. Research in parallel computing

    NASA Technical Reports Server (NTRS)

    Ortega, James M.; Henderson, Charles

    1994-01-01

    This report summarizes work on parallel computations for NASA Grant NAG-1-1529 for the period 1 Jan. - 30 June 1994. Short summaries on highly parallel preconditioners, target-specific parallel reductions, and simulation of delta-cache protocols are provided.

  3. Parallel simulation today

    NASA Technical Reports Server (NTRS)

    Nicol, David; Fujimoto, Richard

    1992-01-01

    This paper surveys topics that presently define the state of the art in parallel simulation. Included in the tutorial are discussions on new protocols, mathematical performance analysis, time parallelism, hardware support for parallel simulation, load balancing algorithms, and dynamic memory management for optimistic synchronization.

  4. An O(N) and parallel approach to integral problems by a kernel-independent fast multipole method: Application to polarization and magnetization of interacting particles

    SciTech Connect

    Jiang, Xikai; Li, Jiyuan; Zhao, Xujun; Qin, Jian; Karpeev, Dmitry; Hernandez-Ortiz, Juan; de Pablo, Juan J.; Heinonen, Olle

    2016-08-10

    Large classes of materials systems in physics and engineering are governed by magnetic and electrostatic interactions. Continuum or mesoscale descriptions of such systems can be cast in terms of integral equations, whose direct computational evaluation requires O(N^2) operations, where N is the number of unknowns. Such a scaling, which arises from the many-body nature of the relevant Green's function, has precluded wide-spread adoption of integral methods for solution of large-scale scientific and engineering problems. In this work, a parallel computational approach is presented that relies on using scalable open source libraries and utilizes a kernel-independent Fast Multipole Method (FMM) to evaluate the integrals in O(N) operations, with O(N) memory cost, thereby substantially improving the scalability and efficiency of computational integral methods. We demonstrate the accuracy, efficiency, and scalability of our approach in the context of two examples. In the first, we solve a boundary value problem for a ferroelectric/ferromagnetic volume in free space. In the second, we solve an electrostatic problem involving polarizable dielectric bodies in an unbounded dielectric medium. Lastly, the results from these test cases show that our proposed parallel approach, which is built on a kernel-independent FMM, can enable highly efficient and accurate simulations and allow for considerable flexibility in a broad range of applications.
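
    To make the quadratic cost of direct evaluation concrete, the sketch below sums a pairwise kernel over all source-target pairs and counts the kernel evaluations. It shows only the direct baseline that a fast multipole method replaces, not the FMM itself. The 1/r kernel (the free-space Laplace Green's function up to a constant factor) and the function name are assumptions made for illustration.

```python
import math


def direct_potentials(points, charges):
    """Potential at each point from all other points, kernel 1/r.

    Returns the potentials and the number of kernel evaluations,
    which is N * (N - 1) for N points.
    """
    n = len(points)
    phi = [0.0] * n
    evaluations = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # skip the singular self-interaction
            r = math.dist(points[i], points[j])
            phi[i] += charges[j] / r
            evaluations += 1
    return phi, evaluations


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
charges = [1.0, 1.0, 1.0]
phi, evaluations = direct_potentials(points, charges)
```

    Doubling the number of points roughly quadruples the evaluation count, which is the scaling the abstract identifies as the obstacle to large problems.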

  5. Parallel algorithm development

    SciTech Connect

    Adams, T.F.

    1996-06-01

    Rapid changes in parallel computing technology are causing significant changes in the strategies being used for parallel algorithm development. One approach is simply to write computer code in a standard language like FORTRAN 77 or with the expectation that the compiler will produce executable code that will run in parallel. The alternatives are: (1) to build explicit message passing directly into the source code; or (2) to write source code without explicit reference to message passing or parallelism, but use a general communications library to provide efficient parallel execution. Application of these strategies is illustrated with examples of codes currently under development.

  6. Visualization and Tracking of Parallel CFD Simulations

    NASA Technical Reports Server (NTRS)

    Vaziri, Arsi; Kremenetsky, Mark

    1995-01-01

    We describe a system for interactive visualization and tracking of a 3-D unsteady computational fluid dynamics (CFD) simulation on a parallel computer. CM/AVS, a distributed, parallel implementation of a visualization environment (AVS), runs on the CM-5 parallel supercomputer. A CFD solver is run as a CM/AVS module on the CM-5. Data communication between the solver, other parallel visualization modules, and a graphics workstation, which is running AVS, is handled by CM/AVS. Partitioning of the visualization task, between CM-5 and the workstation, can be done interactively in the visual programming environment provided by AVS. Flow solver parameters can also be altered by programmable interactive widgets. This system partially removes the requirement of storing large solution files at frequent time steps, a characteristic of the traditional 'simulate → store → visualize' post-processing approach.

  7. Genome-Wide Fitness and Genetic Interactions Determined by Tn-seq, a High-Throughput Massively Parallel Sequencing Method for Microorganisms.

    PubMed

    van Opijnen, Tim; Lazinski, David W; Camilli, Andrew

    2014-04-14

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobial agents and vaccines. This unit presents Tn-seq, a method that has made it possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies on the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated transposon insertion library. After library selection, changes in the frequency of each insertion mutant are determined by sequencing flanking regions en masse. These changes are used to calculate each mutant's fitness. The method was originally developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis, but has now been applied to several different microbial species.

  8. Genome-Wide Fitness and Genetic Interactions Determined by Tn-seq, a High-Throughput Massively Parallel Sequencing Method for Microorganisms.

    PubMed

    van Opijnen, Tim; Lazinski, David W; Camilli, Andrew

    2015-02-02

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobial agents and vaccines. This unit presents Tn-seq, a method that has made it possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies on the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated transposon insertion library. After library selection, changes in the frequency of each insertion mutant are determined by sequencing flanking regions en masse. These changes are used to calculate each mutant's fitness. The method was originally developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis, but has now been applied to several different microbial species.

  9. Stacked and H-Bonded Cytosine Dimers. Analysis of the Intermolecular Interaction Energies by Parallel Quantum Chemistry and Polarizable Molecular Mechanics.

    PubMed

    Gresh, Nohad; Sponer, Judit E; Devereux, Mike; Gkionis, Konstantinos; de Courcy, Benoit; Piquemal, Jean-Philip; Sponer, Jiri

    2015-07-30

    Until now, atomistic simulations of DNA and RNA and their complexes have been executed using well calibrated but conceptually simple pair-additive empirical potentials (force fields). Although such simulations provided many valuable results, it is well established that simple force fields also introduce errors into the description, underlining the need for development of alternative anisotropic, polarizable molecular mechanics (APMM) potentials. One of the most abundant forces in all kinds of nucleic acids topologies is base stacking. Intra- and interstrand stacking is assumed to be the most essential factor affecting local conformational variations of B-DNA. However, stacking also contributes to formation of all kinds of noncanonical nucleic acids structures, such as quadruplexes or folded RNAs. The present study focuses on 14 stacked cytosine (Cyt) dimers and the doubly H-bonded dimer. We evaluate the extent to which an APMM procedure, SIBFA, could account quantitatively for the results of high-level quantum chemistry (QC) on the total interaction energies, and the individual energy contributions and their nonisotropic behaviors. Good agreements are found at both uncorrelated HF and correlated DFT and CCSD(T) levels. Resorting in SIBFA to distributed QC multipoles and to an explicit representation of the lone pairs is essential to respectively account for the anisotropies of the Coulomb and of the exchange-repulsion QC contributions.

  10. Genome-wide fitness and genetic interactions determined by Tn-seq, a high-throughput massively parallel sequencing method for microorganisms.

    PubMed

    van Opijnen, Tim; Camilli, Andrew

    2010-11-01

    The lagging annotation of bacterial genomes and the inherent genetic complexity of many phenotypes is hindering the discovery of new drug targets and the development of new antimicrobials and vaccines. Here we present the method Tn-seq, with which it has become possible to quantitatively determine fitness for most genes in a microorganism and to screen for quantitative genetic interactions on a genome-wide scale and in a high-throughput fashion. Tn-seq can thus direct studies in the annotation of genes and untangle complex phenotypes. The method is based on the construction of a saturated Mariner transposon insertion library. After library selection, changes in frequency of each insertion mutant are determined by sequencing of the flanking regions en masse. These changes are used to calculate each mutant's fitness. The method has been developed for the Gram-positive bacterium Streptococcus pneumoniae, a causative agent of pneumonia and meningitis; however, due to the wide activity of the Mariner transposon, Tn-seq can be applied to many different microbial species.

  11. Parallel distributed computing using Python

    NASA Astrophysics Data System (ADS)

    Dalcin, Lisandro D.; Paz, Rodrigo R.; Kler, Pablo A.; Cosimo, Alejandro

    2011-09-01

    This work presents two software components aimed to relieve the costs of accessing high-performance parallel computing resources within a Python programming environment: MPI for Python and PETSc for Python. MPI for Python is a general-purpose Python package that provides bindings for the Message Passing Interface (MPI) standard using any back-end MPI implementation. Its facilities allow parallel Python programs to easily exploit multiple processors using the message passing paradigm. PETSc for Python provides access to the Portable, Extensible Toolkit for Scientific Computation (PETSc) libraries. Its facilities allow sequential and parallel Python applications to exploit state of the art algorithms and data structures readily available in PETSc for the solution of large-scale problems in science and engineering. MPI for Python and PETSc for Python are fully integrated to PETSc-FEM, an MPI and PETSc based parallel, multiphysics, finite elements code developed at CIMEC laboratory. This software infrastructure supports research activities related to simulation of fluid flows with applications ranging from the design of microfluidic devices for biochemical analysis to modeling of large-scale stream/aquifer interactions.

  12. Parallel digital forensics infrastructure.

    SciTech Connect

    Liebrock, Lorie M.; Duggan, David Patrick

    2009-10-01

    This report documents the architecture and implementation of a Parallel Digital Forensics infrastructure. This infrastructure is necessary for supporting the design, implementation, and testing of new classes of parallel digital forensics tools. Digital Forensics has become extremely difficult with data sets of one terabyte and larger. The only way to overcome the processing time of these large sets is to identify and develop new parallel algorithms for performing the analysis. To support algorithm research, a flexible base infrastructure is required. A candidate architecture for this base infrastructure was designed, instantiated, and tested by this project, in collaboration with New Mexico Tech. Previous infrastructures were not designed and built specifically for the development and testing of parallel algorithms. With the size of forensics data sets only expected to increase significantly, this type of infrastructure support is necessary for continued research in parallel digital forensics. This report documents the implementation of the parallel digital forensics (PDF) infrastructure architecture and implementation.

  13. Introduction to Parallel Computing

    DTIC Science & Technology

    1992-05-01

    Topology C, Ada, C++, Data-parallel FORTRAN, 2D mesh of node boards, each node FORTRAN-90 (late 1992) board has 1 application processor Development Tools ...parallel machines become the wave of the present, tools are increasingly needed to assist programmers in creating parallel tasks and coordinating...their activities. Linda was designed to be such a tool. Linda was designed with three important goals in mind: to be portable, efficient, and easy to use

  14. Parallel Wolff Cluster Algorithms

    NASA Astrophysics Data System (ADS)

    Bae, S.; Ko, S. H.; Coddington, P. D.

    The Wolff single-cluster algorithm is the most efficient method known for Monte Carlo simulation of many spin models. Due to the irregular size, shape and position of the Wolff clusters, this method does not easily lend itself to efficient parallel implementation, so that simulations using this method have thus far been confined to workstations and vector machines. Here we present two parallel implementations of this algorithm, and show that one gives fairly good performance on a MIMD parallel computer.

  15. PCLIPS: Parallel CLIPS

    NASA Technical Reports Server (NTRS)

    Hall, Lawrence O.; Bennett, Bonnie H.; Tello, Ivan

    1994-01-01

    A parallel version of CLIPS 5.1 has been developed to run on Intel Hypercubes. The user interface is the same as that for CLIPS with some added commands to allow for parallel calls. A complete version of CLIPS runs on each node of the hypercube. The system has been instrumented to display the time spent in the match, recognize, and act cycles on each node. Only rule-level parallelism is supported. Parallel commands enable the assertion and retraction of facts to/from remote nodes' working memory. Parallel CLIPS was used to implement a knowledge-based command, control, communications, and intelligence (C³I) system to demonstrate the fusion of high-level, disparate sources. We discuss the nature of the information fusion problem, our approach, and implementation. Parallel CLIPS has also been used to run several benchmark parallel knowledge bases such as one to set up a cafeteria. Results from running Parallel CLIPS with parallel knowledge base partitions indicate that significant speed increases, including superlinear in some cases, are possible.

  16. Application Portable Parallel Library

    NASA Technical Reports Server (NTRS)

    Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott

    1995-01-01

    Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also includes heterogeneous collection of networked computers.) Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.

  17. Parallel Algorithms and Patterns

    SciTech Connect

    Robey, Robert W.

    2016-06-16

    This is a PowerPoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
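
    Two of the patterns named in this record, reductions and prefix scans, can be illustrated with a minimal Python sketch. It is not taken from the presentation: the function names `tree_reduce` and `inclusive_scan` are hypothetical, and the passes run sequentially here, standing in for steps that a parallel machine could execute concurrently.

```python
from operator import add


def tree_reduce(values, op=add):
    """Pairwise (tree) reduction; the pairs combined in one pass are independent."""
    items = list(values)
    if not items:
        raise ValueError("cannot reduce an empty sequence")
    while len(items) > 1:
        # Each pair in this pass could be combined by a different processor.
        paired = [op(items[i], items[i + 1]) for i in range(0, len(items) - 1, 2)]
        if len(items) % 2:
            paired.append(items[-1])  # odd element carries over to the next pass
        items = paired
    return items[0]


def inclusive_scan(values, op=add):
    """Hillis-Steele style inclusive prefix scan, written as doubling passes."""
    out = list(values)
    step = 1
    while step < len(out):
        # Every element reads only the previous pass, so updates are independent.
        out = [out[i] if i < step else op(out[i - step], out[i])
               for i in range(len(out))]
        step *= 2
    return out


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5]
    print(tree_reduce(data))      # sum of all elements
    print(inclusive_scan(data))   # running totals
```

    Both functions assume an associative operator; the operand order is preserved, so non-commutative operators such as string concatenation also work.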

  18. Linked-View Parallel Coordinate Plot Renderer

    SciTech Connect

    2011-06-28

    This software allows multiple linked views for interactive querying via map-based data selection, bar chart analytic overlays, and high dynamic range (HDR) line renderings. The major component of the visualization package is a parallel coordinate renderer with binning, curved layouts, shader-based rendering, and other techniques to allow interactive visualization of multidimensional data.

  19. Parallel Lisp simulator

    SciTech Connect

    Weening, J.S.

    1988-05-01

    CSIM is a simulator for parallel Lisp, based on a continuation passing interpreter. It models a shared-memory multiprocessor executing programs written in Common Lisp, extended with several primitives for creating and controlling processes. This paper describes the structure of the simulator, measures its performance, and gives an example of its use with a parallel Lisp program.

  20. Parallel and Distributed Computing.

    DTIC Science & Technology

    1986-12-12

    program was devoted to parallel and distributed computing . Support for this part of the program was obtained from the present Army contract and a...Umesh Vazirani. A workshop on parallel and distributed computing was held from May 19 to May 23, 1986 and drew 141 participants. Keywords: Mathematical programming; Protocols; Randomized algorithms. (Author)

  1. Synchronization Of Parallel Discrete Event Simulations

    NASA Technical Reports Server (NTRS)

    Steinman, Jeffrey S.

    1992-01-01

    Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.

  2. Massively parallel mathematical sieves

    SciTech Connect

    Montry, G.R.

    1989-01-01

    The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
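
    As a rough illustration of the idea in this record, the sketch below sieves with a scattered decomposition, taken here to mean that integer n belongs to worker n mod P. That reading is an assumption, as are the names `base_primes`, `worker_sieve`, and `scattered_sieve`; the workers are simulated one after another rather than run on a hypercube.

```python
from math import gcd, isqrt


def base_primes(limit):
    """Serial Sieve of Eratosthenes returning all primes <= limit."""
    if limit < 2:
        return []
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for p in range(2, isqrt(limit) + 1):
        if flags[p]:
            for multiple in range(p * p, limit + 1, p):
                flags[multiple] = False
    return [n for n, is_prime in enumerate(flags) if is_prime]


def worker_sieve(rank, workers, n_max, primes):
    """Strike composites among the integers owned by one worker (n % workers == rank)."""
    composite = set()
    for p in primes:
        period = p * workers // gcd(p, workers)  # lcm(p, workers)
        # Within one period, find the multiples of p that this worker owns,
        # then strike every later multiple with the same residue.
        for first in range(p * p, p * p + period, p):
            if first % workers == rank:
                composite.update(range(first, n_max + 1, period))
    return [n for n in range(rank, n_max + 1, workers)
            if n >= 2 and n not in composite]


def scattered_sieve(n_max, workers):
    """Combine the per-worker results into one sorted list of primes <= n_max."""
    primes = base_primes(isqrt(n_max))  # small primes are shared by every worker
    found = []
    for rank in range(workers):  # each iteration is independent of the others
        found.extend(worker_sieve(rank, workers, n_max, primes))
    return sorted(found)


if __name__ == "__main__":
    print(scattered_sieve(50, 4))
```

    Cyclic ownership spreads the struck multiples of each prime roughly evenly across workers, which is one plausible reason to prefer it over contiguous blocks.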

  3. Totally parallel multilevel algorithms

    NASA Technical Reports Server (NTRS)

    Frederickson, Paul O.

    1988-01-01

    Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which is referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.

  4. Parallel computing works

    SciTech Connect

    Not Available

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C³P), a five-year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  5. Simulation Exploration through Immersive Parallel Planes: Preprint

    SciTech Connect

    Brunhart-Lupo, Nicholas; Bush, Brian W.; Gruchalla, Kenny; Smith, Steve

    2016-03-01

    We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used both to explore existing data and to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
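
    The classical two-dimensional parallel-coordinates mapping that this record generalizes can be sketched in a few lines of Python. This is only an illustration of the polyline mapping and of brushing on a single axis, not the authors' parallel-planes system; the names `to_polylines` and `brush` and the per-axis min-max scaling are assumptions.

```python
def to_polylines(rows):
    """Map each observation to a polyline with one (axis_index, scaled_value) vertex per dimension."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    polylines = []
    for row in rows:
        vertices = []
        for axis, value in enumerate(row):
            span = highs[axis] - lows[axis]
            # Scale each axis to [0, 1]; a constant axis is drawn at mid-height.
            scaled = (value - lows[axis]) / span if span else 0.5
            vertices.append((axis, scaled))
        polylines.append(vertices)
    return polylines


def brush(rows, axis, low, high):
    """Return indices of observations whose value on one axis lies in [low, high]."""
    return [i for i, row in enumerate(rows) if low <= row[axis] <= high]


if __name__ == "__main__":
    observations = [(0.0, 10.0), (5.0, 20.0), (10.0, 10.0)]
    for line in to_polylines(observations):
        print(line)
    print(brush(observations, axis=1, low=5, high=15))
```

    In the parallel-planes variant described above, each vertex would carry a pair of scaled values (a point on a rectangle) instead of a single value on a line.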

  6. Parallel Adaptive Mesh Refinement

    SciTech Connect

    Diachin, L; Hornung, R; Plassmann, P; WIssink, A

    2005-03-04

    As large-scale, parallel computers have become more widely available and numerical models and algorithms have advanced, the range of physical phenomena that can be simulated has expanded dramatically. Many important science and engineering problems exhibit solutions with localized behavior where highly-detailed salient features or large gradients appear in certain regions which are separated by much larger regions where the solution is smooth. Examples include chemically-reacting flows with radiative heat transfer, high Reynolds number flows interacting with solid objects, and combustion problems where the flame front is essentially a two-dimensional sheet occupying a small part of a three-dimensional domain. Modeling such problems numerically requires approximating the governing partial differential equations on a discrete domain, or grid. Grid spacing is an important factor in determining the accuracy and cost of a computation. A fine grid may be needed to resolve key local features while a much coarser grid may suffice elsewhere. Employing a fine grid everywhere may be inefficient at best and, at worst, may make an adequately resolved simulation impractical. Moreover, the location and resolution of fine grid required for an accurate solution is a dynamic property of a problem's transient features and may not be known a priori. Adaptive mesh refinement (AMR) is a technique that can be used with both structured and unstructured meshes to adjust local grid spacing dynamically to capture solution features with an appropriate degree of resolution. Thus, computational resources can be focused where and when they are needed most to efficiently achieve an accurate solution without incurring the cost of a globally-fine grid. Figure 1.1 shows two example computations using AMR; on the left is a structured mesh calculation of an impulsively-sheared contact surface and on the right is the fuselage and volume discretization of an RAH-66 Comanche helicopter [35]. Note the

  7. A dual-site simultaneous binding mode in the interaction between parallel-stranded G-quadruplex [d(TGGGGT)]4 and cyanine dye 2,2′-diethyl-9-methyl-selenacarbocyanine bromide

    PubMed Central

    Gai, Wei; Yang, Qianfan; Xiang, Junfeng; Jiang, Wei; Li, Qian; Sun, Hongxia; Guan, Aijiao; Shang, Qian; Zhang, Hong; Tang, Yalin

    2013-01-01

    G-quadruplexes have attracted growing attention as a potential cancer-associated target for both treatment and detection in recent years. For detection purposes, high specificity is one of the most important factors to be considered in G-quadruplex probe design. It is well known that end stacking and groove binding are the two dominant quadruplex-ligand binding modes, and currently most reported G-quadruplex probes are designed based on the former, which has been proven to show good selectivity between quadruplexes and non-quadruplexes. Because the groove of a G-quadruplex also has some unique chemical properties, it could be inferred that probes that can interact with both the groove and G-tetrad site of certain G-quadruplexes simultaneously might possess higher specificity in discriminating different quadruplexes. In this article, we report a cyanine dye as a potential novel probe scaffold that could occupy both the 5′-end external G-tetrad and the corresponding groove of the G-quadruplex simultaneously. By using various spectroscopic and nuclear magnetic resonance techniques, we give a detailed binding characterization for this dual-site simultaneous binding mode. A preliminary result suggests that this mode might provide highly specific recognition to a parallel-stranded G-quadruplex. These findings and the structural elucidation might give some clues for developing highly specific G-quadruplex probes. PMID:23275573

  8. The NAS parallel benchmarks

    NASA Technical Reports Server (NTRS)

    Bailey, David (Editor); Barton, John (Editor); Lasinski, Thomas (Editor); Simon, Horst (Editor)

    1993-01-01

    A new set of benchmarks was developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of a set of kernels, the 'Parallel Kernels,' and a simulated application benchmark. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification - all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.

  9. Parallel programming with PCN

    SciTech Connect

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A).

  10. Parallel programming with PCN

    SciTech Connect

    Foster, I.; Tuecke, S.

    1991-09-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, a set of tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory at info.mcs.anl.gov.

  11. The Parallel Axiom

    ERIC Educational Resources Information Center

    Rogers, Pat

    1972-01-01

    Criteria for a reasonable axiomatic system are discussed. A discussion of the historical attempts to prove the independence of Euclid's parallel postulate introduces non-Euclidean geometries. Poincare's model for a non-Euclidean geometry is defined and analyzed. (LS)

  12. Unsteady flow simulation on a parallel computer

    NASA Astrophysics Data System (ADS)

    Faden, M.; Pokorny, S.; Engel, K.

    For the simulation of the flow through compressor stages, an interactive flow simulation system is set up on an MIMD-type parallel computer. An explicit scheme is used in order to resolve the time-dependent interaction between the blades. The 2D Navier-Stokes equations are transformed into their general moving coordinates. The parallelization of the solver is based on the idea of domain decomposition. Results are presented for a problem of fixed size (4096 grid nodes for the Hakkinen case).

  13. Scalable parallel communications

    NASA Technical Reports Server (NTRS)

    Maly, K.; Khanna, S.; Overstreet, C. M.; Mukkamala, R.; Zubair, M.; Sekhar, Y. S.; Foudriat, E. C.

    1992-01-01

    Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., effect of timeouts), we have performed detailed simulation studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near linear in n, the number of protocol processors, and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low cost solution to providing multiple 100 Mbps on current machines. In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services and the use of space division multiplexing for the physical channels can provide better reliability than monolithic approaches (it also provides graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users, other TCP's running in parallel provide high bandwidth

  14. Parallel image compression

    NASA Technical Reports Server (NTRS)

    Reif, John H.

    1987-01-01

    A parallel compression algorithm for the 16,384 processor MPP machine was developed. The serial version of the algorithm can be viewed as a combination of on-line dynamic lossless text compression techniques (which employ simple learning strategies) and vector quantization. These concepts are described. How these concepts are combined to form a new strategy for performing dynamic on-line lossy compression is discussed. Finally, the implementation of this algorithm in a massively parallel fashion on the MPP is discussed.

  15. Revisiting and parallelizing SHAKE

    NASA Astrophysics Data System (ADS)

    Weinbach, Yael; Elber, Ron

    2005-10-01

    An algorithm is presented for running SHAKE in parallel. SHAKE is a widely used approach to compute molecular dynamics trajectories with constraints. An essential step in SHAKE is the solution of a sparse linear problem of the type Ax = b, where x is a vector of unknowns. Conjugate gradient minimization (that can be done in parallel) replaces the widely used iteration process that is inherently serial. Numerical examples present good load balancing and are limited only by communication time.
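
    The linear solve mentioned in this record can be illustrated with a textbook conjugate gradient iteration for Ax = b. The sketch below is not the authors' SHAKE code: it assumes A is symmetric positive definite and stored densely, and the names `dot`, `matvec`, and `conjugate_gradient` are hypothetical. The dot products and the matrix-vector product are the steps that lend themselves to parallel execution.

```python
def dot(u, v):
    """Inner product; in a parallel code this is a reduction across processors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product; each row can be computed independently."""
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A, starting from x = 0."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break          # residual norm is small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))  # close to [1/11, 7/11]
```

    For a sparse constraint matrix one would replace `matvec` with a sparse product, which changes the cost per iteration but not the structure of the loop.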

  16. Code Parallelization with CAPO: A User Manual

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry; Biegel, Bryan (Technical Monitor)

    2001-01-01

    A software tool has been developed to assist the parallelization of scientific codes. This tool, CAPO, extends an existing parallelization toolkit, CAPTools developed at the University of Greenwich, to generate OpenMP parallel codes for shared memory architectures. This is an interactive toolkit to transform a serial Fortran application code to an equivalent parallel version of the software - in a small fraction of the time normally required for a manual parallelization. We first discuss the way in which loop types are categorized and how efficient OpenMP directives can be defined and inserted into the existing code using the in-depth interprocedural analysis. The use of the toolkit on a number of application codes ranging from benchmark to real-world application codes is presented. This will demonstrate the great potential of using the toolkit to quickly parallelize serial programs as well as the good performance achievable on a large number of processors. The second part of the document gives references to the parameters and the graphic user interface implemented in the toolkit. Finally a set of tutorials is included for hands-on experiences with this toolkit.

  17. Parallel architectures for vision

    SciTech Connect

    Maresca, M.; Lavin, M.A.; Li, H.

    1988-08-01

    Vision computing involves the execution of a large number of operations on large sets of structured data. Sequential computers cannot achieve the speed required by most of the current applications and therefore parallel architectural solutions have to be explored. In this paper the authors examine the options that drive the design of a vision oriented computer, starting with the analysis of the basic vision computation and communication requirements. They briefly review the classical taxonomy for parallel computers, based on the multiplicity of the instruction and data stream, and apply a recently proposed criterion, the degree of autonomy of each processor, to further classify fine-grain SIMD massively parallel computers. They identify three types of processor autonomy, namely operation autonomy, addressing autonomy, and connection autonomy. For each type they give the basic definitions and show some examples. They focus on the concept of connection autonomy, which they believe is a key point in the development of massively parallel architectures for vision. They show two examples of parallel computers featuring different types of connection autonomy - the Connection Machine and the Polymorphic-Torus - and compare their cost and benefit.

  18. Sublattice parallel replica dynamics

    NASA Astrophysics Data System (ADS)

    Martínez, Enrique; Uberuaga, Blas P.; Voter, Arthur F.

    2014-06-01

    Exascale computing presents a challenge for the scientific community as new algorithms must be developed to take full advantage of the new computing paradigm. Atomistic simulation methods that offer full fidelity to the underlying potential, i.e., molecular dynamics (MD) and parallel replica dynamics, fail to use the whole machine speedup, leaving a region in time and sample size space that is unattainable with current algorithms. In this paper, we present an extension of the parallel replica dynamics algorithm [A. F. Voter, Phys. Rev. B 57, R13985 (1998), 10.1103/PhysRevB.57.R13985] by combining it with the synchronous sublattice approach of Shim and Amar [Y. Shim and J. G. Amar, Phys. Rev. B 71, 125432 (2005), 10.1103/PhysRevB.71.125432], thereby exploiting event locality to improve the algorithm scalability. This algorithm is based on a domain decomposition in which events happen independently in different regions in the sample. We develop an analytical expression for the speedup given by this sublattice parallel replica dynamics algorithm and compare it with parallel MD and traditional parallel replica dynamics. We demonstrate how this algorithm, which introduces a slight additional approximation of event locality, enables the study of physical systems unreachable with traditional methodologies and promises to better utilize the resources of current high performance and future exascale computers.

  19. Parallel optical sampler

    DOEpatents

    Tauke-Pedretti, Anna; Skogen, Erik J; Vawter, Gregory A

    2014-05-20

    An optical sampler includes first and second 1×n optical beam splitters splitting an input optical sampling signal and an optical analog input signal into n parallel channels, respectively, a plurality of optical delay elements providing n parallel delayed input optical sampling signals, n photodiodes converting the n parallel optical analog input signals into n respective electrical output signals, and n optical modulators modulating the input optical sampling signal or the optical analog input signal by the respective electrical output signals, and providing n successive optical samples of the optical analog input signal. A plurality of output photodiodes and eADCs convert the n successive optical samples to n successive digital samples. The optical modulator may be a photodiode interconnected Mach-Zehnder Modulator. A method of sampling the optical analog input signal is disclosed.

  20. Deoxyribo Nanonucleic Acid: Antiparallel, Parallel and Unparalleled

    SciTech Connect

    Egli, M.

    2010-03-05

    The crystal structure of a single-stranded DNA oligonucleotide has revealed formation of a unique three-dimensional array by continuous antiparallel and parallel pairing between monomers. The array is based on tertiary interactions and represents a second-generation nanotechnological system.

  1. CRUNCH_PARALLEL

    SciTech Connect

    Shumaker, Dana E.; Steefel, Carl I.

    2016-06-21

    The code CRUNCH_PARALLEL is a parallel version of the CRUNCH code. CRUNCH code version 2.0 was previously released by LLNL (UCRL-CODE-200063). CRUNCH is a general purpose reactive transport code developed by Carl Steefel and Yabusaki (Steefel and Yabusaki 1996). The code handles non-isothermal transport and reaction in one, two, and three dimensions. The reaction algorithm is generic in form, handling an arbitrary number of aqueous and surface complexation reactions as well as mineral dissolution/precipitation. A standardized database is used containing thermodynamic and kinetic data. The code includes advective, dispersive, and diffusive transport.

  2. The NAS Parallel Benchmarks

    SciTech Connect

    Bailey, David H.

    2009-11-15

    The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, although the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, Leo Dagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was in computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems. As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage

  3. Highly parallel computation

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.; Tichy, Walter F.

    1990-01-01

    Among the highly parallel computing architectures required for advanced scientific computation, those designated 'MIMD' and 'SIMD' have yielded the best results to date. The present evaluation of the development status of such architectures shows neither to have attained a decisive advantage in the treatment of most near-homogeneous problems; for problems involving numerous dissimilar parts, however, such currently speculative architectures as 'neural networks' or 'data flow' machines may be required. Data flow computers are the most practical form of MIMD fine-grained parallel computers yet conceived; they automatically solve the problem of assigning virtual processors to the real processors in the machine.

  4. Adaptive parallel logic networks

    NASA Technical Reports Server (NTRS)

    Martinez, Tony R.; Vidal, Jacques J.

    1988-01-01

    Adaptive, self-organizing concurrent systems (ASOCS) that combine self-organization with massive parallelism for such applications as adaptive logic devices, robotics, process control, and system malfunction management, are presently discussed. In ASOCS, an adaptive network composed of many simple computing elements operating in combinational and asynchronous fashion is used and problems are specified by presenting if-then rules to the system in the form of Boolean conjunctions. During data processing, which is a different operational phase from adaptation, the network acts as a parallel hardware circuit.

  5. Parallel molecular dynamics: Communication requirements for massively parallel machines

    NASA Astrophysics Data System (ADS)

    Taylor, Valerie E.; Stevens, Rick L.; Arnold, Kathryn E.

    1995-05-01

    Molecular mechanics and dynamics are becoming widely used to perform simulations of molecular systems from large-scale computations of materials to the design and modeling of drug compounds. In this paper we address two major issues: a good decomposition method that can take advantage of future massively parallel processing systems for modest-sized problems in the range of 50,000 atoms and the communication requirements needed to achieve 30 to 40% efficiency on MPPs. We analyzed a scalable benchmark molecular dynamics program executing on the Intel Touchstone Delta parallelized with an interaction decomposition method. Using a validated analytical performance model of the code, we determined that for an MPP with a four-dimensional mesh topology and 400 MHz processors the communication startup time must be at most 30 clock cycles and the network bandwidth must be at least 2.3 GB/s. This configuration results in 30 to 40% efficiency of the MPP for a problem with 50,000 atoms executing on 50,000 processors.

  6. Parallel simulated annealing algorithms for cell placement on hypercube multiprocessors

    NASA Technical Reports Server (NTRS)

    Banerjee, Prithviraj; Jones, Mark Howard; Sargent, Jeff S.

    1990-01-01

    Two parallel algorithms for standard cell placement using simulated annealing are developed to run on distributed-memory message-passing hypercube multiprocessors. The cells can be mapped in a two-dimensional area of a chip onto processors in an n-dimensional hypercube in two ways, such that both small and large cell exchange and displacement moves can be applied. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support the parallel cost evaluation. A novel tree broadcasting strategy is used extensively for updating cell locations in the parallel environment. A dynamic parallel annealing schedule estimates the errors due to interacting parallel moves and adapts the rate of synchronization automatically. Two novel approaches in controlling error in parallel algorithms are described: heuristic cell coloring and adaptive sequence control.

  7. Parallel Coordinate Axes.

    ERIC Educational Resources Information Center

    Friedlander, Alex; And Others

    1982-01-01

    Several methods of numerical mappings other than the usual cartesian coordinate system are considered. Some examples using parallel axes representation, which are seen to lead to aesthetically pleasing or interesting configurations, are presented. Exercises with alternative representations can stimulate pupil imagination and exploration in…

  8. Parallel programming with PCN

    SciTech Connect

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  9. Massively parallel processor computer

    NASA Technical Reports Server (NTRS)

    Fung, L. W. (Inventor)

    1983-01-01

    An apparatus for processing multidimensional data with strong spatial characteristics, such as raw image data, characterized by a large number of parallel data streams in an ordered array is described. It comprises a large number (e.g., 16,384 in a 128 x 128 array) of parallel processing elements operating simultaneously and independently on single bit slices of a corresponding array of incoming data streams under control of a single set of instructions. Each of the processing elements comprises a bidirectional data bus in communication with a register for storing single bit slices together with a random access memory unit and associated circuitry, including a binary counter/shift register device, for performing logical and arithmetical computations on the bit slices, and an I/O unit for interfacing the bidirectional data bus with the data stream source. The massively parallel processor architecture enables very high speed processing of large amounts of ordered parallel data, including spatial translation by shifting or sliding of bits vertically or horizontally to neighboring processing elements.

  10. High performance parallel architectures

    SciTech Connect

    Anderson, R.E.

    1989-09-01

    In this paper the author describes current high performance parallel computer architectures. A taxonomy is presented to show computer architecture from the user programmer's point-of-view. The effects of the taxonomy upon the programming model are described. Some current architectures are described with respect to the taxonomy. Finally, some predictions about future systems are presented. 5 refs., 1 fig.

  11. Parallel fast gauss transform

    SciTech Connect

    Sampath, Rahul S; Sundar, Hari; Veerapaneni, Shravan

    2010-01-01

    We present fast adaptive parallel algorithms to compute the sum of N Gaussians at N points. Direct sequential computation of this sum would take O(N^2) time. The parallel time complexity estimates for our algorithms are O(N/n_p) for uniform point distributions and O((N/n_p) log(N/n_p) + n_p log n_p) for non-uniform distributions using n_p CPUs. We incorporate a plane-wave representation of the Gaussian kernel which permits 'diagonal translation'. We use parallel octrees and a new scheme for translating the plane-waves to efficiently handle non-uniform distributions. Computing the transform to six-digit accuracy at 120 billion points took approximately 140 seconds using 4096 cores on the Jaguar supercomputer. Our implementation is 'kernel-independent' and can handle other 'Gaussian-type' kernels even when explicit analytic expression for the kernel is not known. These algorithms form a new class of core computational machinery for solving parabolic PDEs on massively parallel architectures.
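
    For orientation, the direct O(N^2) baseline that the abstract compares against can be sketched as follows. This is not the authors' fast algorithm. The kernel form exp(-|t - s|^2 / delta) and the names `direct_gauss_sum`, `sources`, `targets`, `weights`, and `delta` are assumptions chosen for illustration.

```python
import math


def direct_gauss_sum(sources, targets, weights, delta):
    """Directly evaluate G(t) = sum_i w_i * exp(-|t - s_i|^2 / delta).

    Assumed kernel form; the cost is O(len(sources) * len(targets)),
    which is the quadratic baseline a fast transform tries to avoid.
    """
    result = []
    for t in targets:
        total = 0.0
        for s, w in zip(sources, weights):
            # Squared Euclidean distance between target and source point.
            dist2 = sum((a - b) ** 2 for a, b in zip(t, s))
            total += w * math.exp(-dist2 / delta)
        result.append(total)
    return result
```

    A single source of weight 2 at the origin gives 2 at the origin and 2/e at unit distance when delta is 1.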

  12. Parallel hierarchical radiosity rendering

    SciTech Connect

    Carter, M.

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  13. Parallel hierarchical global illumination

    SciTech Connect

    Snell, Quinn O.

    1997-10-08

    Solving the global illumination problem is equivalent to determining the intensity of every wavelength of light in all directions at every point in a given scene. The complexity of the problem has led researchers to use approximation methods for solving the problem on serial computers. Rather than using an approximation method, such as backward ray tracing or radiosity, the authors have chosen to solve the Rendering Equation by direct simulation of light transport from the light sources. This paper presents an algorithm that solves the Rendering Equation to any desired accuracy, and can be run in parallel on distributed memory or shared memory computer systems with excellent scaling properties. It appears superior in both speed and physical correctness to recent published methods involving bidirectional ray tracing or hybrid treatments of diffuse and specular surfaces. Like progressive radiosity methods, it dynamically refines the geometry decomposition where required, but does so without the excessive storage requirements for ray histories. The algorithm, called Photon, produces a scene which converges to the global illumination solution. This amounts to a huge task for a 1997-vintage serial computer, but using the power of a parallel supercomputer significantly reduces the time required to generate a solution. Currently, Photon can be run on most parallel environments from a shared memory multiprocessor to a parallel supercomputer, as well as on clusters of heterogeneous workstations.

  14. Parallel Multigrid Equation Solver

    SciTech Connect

    Adams, Mark

    2001-09-07

    Prometheus is a fully parallel multigrid equation solver for matrices that arise in unstructured grid finite element applications. It includes a geometric and an algebraic multigrid method and has solved problems of up to 76 million degrees of freedom, problems in linear elasticity on the ASCI Blue Pacific and ASCI Red machines.

  15. Method of moment solutions to scattering problems in a parallel processing environment

    NASA Technical Reports Server (NTRS)

    Cwik, Tom; Partee, Jonathan; Patterson, Jean

    1991-01-01

    This paper describes the implementation of a parallelized method of moments (MOM) code into an interactive workstation environment. The workstation allows interactive solid body modeling and mesh generation, MOM analysis, and the graphical display of results. After describing the parallel computing environment, the implementation and results of parallelizing a general MOM code are presented in detail.

  16. Programming parallel architectures - The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1989-01-01

    This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.

  17. Strength of Multiple Parallel Biological Bonds

    SciTech Connect

    Sulchek, T A; Friddle, R W; Noy, A

    2005-12-07

    Multivalent interactions play a critical role in a variety of biological processes on both molecular and cellular levels. We have used molecular force spectroscopy to investigate the strength of multiple parallel peptide-antibody bonds using a system that allowed us to determine the rupture forces and the number of ruptured bonds independently. In our experiments the interacting molecules were attached to the surfaces of the probe and sample of the atomic force microscope with flexible polymer tethers, and unique mechanical signature of the tethers determined the number of ruptured bonds. We show that the rupture forces increase with the number of interacting molecules and that the measured forces obey the predictions of a Markovian model for the strength of multiple parallel bonds. We also discuss the implications of our results to the interpretation of force spectroscopy measurements in multiple bond systems.

  18. Parallel grid population

    DOEpatents

    Wald, Ingo; Ize, Santiago

    2015-07-28

    Parallel population of a grid with a plurality of objects using a plurality of processors. One example embodiment is a method for parallel population of a grid with a plurality of objects using a plurality of processors. The method includes a first act of dividing a grid into n distinct grid portions, where n is the number of processors available for populating the grid. The method also includes acts of dividing a plurality of objects into n distinct sets of objects, assigning a distinct set of objects to each processor such that each processor determines by which distinct grid portion(s) each object in its distinct set of objects is at least partially bounded, and assigning a distinct grid portion to each processor such that each processor populates its distinct grid portion with any objects that were previously determined to be at least partially bounded by its distinct grid portion.
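
    The two phases described above can be sketched roughly as follows. This is an illustrative reading of the abstract, not the patented implementation. It assumes a one-dimensional grid of equal-width cells, objects given as (lo, hi) intervals, and Python threads standing in for processors; `populate_grid` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def populate_grid(objects, num_cells, cell_size, n):
    """Populate a 1D grid with interval objects using n workers (sketch)."""
    # Split the grid into n contiguous portions of cell indices [start, stop).
    bounds = [(p * num_cells // n, (p + 1) * num_cells // n) for p in range(n)]
    # Split the objects into n distinct sets, one per worker.
    object_sets = [objects[p::n] for p in range(n)]

    def cell_span(obj):
        # First and last cell index touched by the object, clamped to the grid.
        lo, hi = obj
        return max(0, int(lo // cell_size)), min(num_cells - 1, int(hi // cell_size))

    def classify(obj_set):
        # Phase 1: each worker bins its own objects by overlapping grid portion.
        bins = [[] for _ in range(n)]
        for obj in obj_set:
            first, last = cell_span(obj)
            for p, (start, stop) in enumerate(bounds):
                if first < stop and last >= start:
                    bins[p].append(obj)
        return bins

    def populate(p):
        # Phase 2: each worker fills only the cells of its own grid portion.
        start, stop = bounds[p]
        cells = {c: [] for c in range(start, stop)}
        for bins in classified:
            for obj in bins[p]:
                first, last = cell_span(obj)
                for c in range(max(first, start), min(last, stop - 1) + 1):
                    cells[c].append(obj)
        return cells

    with ThreadPoolExecutor(max_workers=n) as pool:
        classified = list(pool.map(classify, object_sets))
        portions = list(pool.map(populate, range(n)))

    grid = {}
    for part in portions:
        grid.update(part)
    return grid
```

    Because each worker writes only to cells in its own portion during the second phase, no locking is needed on the grid in this sketch.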

  19. Parallel Anisotropic Tetrahedral Adaptation

    NASA Technical Reports Server (NTRS)

    Park, Michael A.; Darmofal, David L.

    2008-01-01

    An adaptive method that robustly produces high aspect ratio tetrahedra to a general 3D metric specification without introducing hybrid semi-structured regions is presented. The elemental operators and higher-level logic are described with their respective domain-decomposed parallelizations. An anisotropic tetrahedral grid adaptation scheme is demonstrated for 1000-1 stretching for a simple cube geometry. This form of adaptation is applicable to more complex domain boundaries via a cut-cell approach as demonstrated by a parallel 3D supersonic simulation of a complex fighter aircraft. To avoid the assumptions and approximations required to form a metric to specify adaptation, an approach is introduced that directly evaluates interpolation error. The grid is adapted to reduce and equidistribute this interpolation error calculation without the use of an intervening anisotropic metric. Direct interpolation error adaptation is illustrated for 1D and 3D domains.

  20. Parallel Subconvolution Filtering Architectures

    NASA Technical Reports Server (NTRS)

    Gray, Andrew A.

    2003-01-01

    These architectures are based on methods of vector processing and the discrete-Fourier-transform/inverse-discrete-Fourier-transform (DFT-IDFT) overlap-and-save method, combined with time-block separation of digital filters into frequency-domain subfilters implemented by use of sub-convolutions. The parallel-processing method implemented in these architectures enables the use of relatively small DFT-IDFT pairs, while filter tap lengths are theoretically unlimited. The size of a DFT-IDFT pair is determined by the desired reduction in processing rate, rather than by the order of the filter that one seeks to implement. The emphasis in this report is on those aspects of the underlying theory and design rules that promote computational efficiency, parallel processing at reduced data rates, and simplification of the designs of very-large-scale integrated (VLSI) circuits needed to implement high-order filters and correlators.

  1. Parallel multilevel preconditioners

    SciTech Connect

    Bramble, J.H.; Pasciak, J.E.; Xu, Jinchao.

    1989-01-01

    In this paper, we shall report on some techniques for the development of preconditioners for the discrete systems which arise in the approximation of solutions to elliptic boundary value problems. Here we shall only state the resulting theorems. It has been demonstrated that preconditioned iteration techniques often lead to the most computationally effective algorithms for the solution of the large algebraic systems corresponding to boundary value problems in two and three dimensional Euclidean space. The use of preconditioned iteration will become even more important on computers with parallel architecture. This paper discusses an approach for developing completely parallel multilevel preconditioners. In order to illustrate the resulting algorithms, we shall describe the simplest application of the technique to a model elliptic problem.

  2. Ultrascalable petaflop parallel supercomputer

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Chiu, George; Cipolla, Thomas M.; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Hall, Shawn; Haring, Rudolf A.; Heidelberger, Philip; Kopcsay, Gerard V.; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan; Takken, Todd

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  3. Homology, convergence and parallelism

    PubMed Central

    Ghiselin, Michael T.

    2016-01-01

    Homology is a relation of correspondence between parts of parts of larger wholes. It is used when tracking objects of interest through space and time and in the context of explanatory historical narratives. Homologues can be traced through a genealogical nexus back to a common ancestral precursor. Homology being a transitive relation, homologues remain homologous however much they may come to differ. Analogy is a relationship of correspondence between parts of members of classes having no relationship of common ancestry. Although homology is often treated as an alternative to convergence, the latter is not a kind of correspondence: rather, it is one of a class of processes that also includes divergence and parallelism. These often give rise to misleading appearances (homoplasies). Parallelism can be particularly hard to detect, especially when not accompanied by divergences in some parts of the body. PMID:26598721

  4. Parallel unstructured grid generation

    NASA Technical Reports Server (NTRS)

    Loehner, Rainald; Camberos, Jose; Merriam, Marshal

    1991-01-01

    A parallel unstructured grid generation algorithm is presented and implemented on the Hypercube. Different processor hierarchies are discussed, and the appropriate hierarchies for mesh generation and mesh smoothing are selected. A domain-splitting algorithm for unstructured grids which tries to minimize the surface-to-volume ratio of each subdomain is described. This splitting algorithm is employed both for grid generation and grid smoothing. Results obtained on the Hypercube demonstrate the effectiveness of the algorithms developed.

  5. Development of Parallel GSSHA

    DTIC Science & Technology

    2013-09-01

    Paul R. Eller, Jing-Ru C. Cheng, Aaron R. Byrd, Charles W. Downer, and Nawa Pradhan September 2013 Approved for public release...Program ERDC TR-13-8 September 2013 Development of Parallel GSSHA Paul R. Eller and Jing-Ru C. Cheng Information Technology Laboratory US Army Engineer...5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Paul Eller, Ruth Cheng, Aaron Byrd, Chuck Downer, and Nawa Pradhan 5d. PROJECT NUMBER

  6. Xyce parallel electronic simulator.

    SciTech Connect

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd S; Pawlowski, Roger P; Santarelli, Keith R.

    2010-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.

  7. Massively Parallel Genetics.

    PubMed

    Shendure, Jay; Fields, Stanley

    2016-06-01

    Human genetics has historically depended on the identification of individuals whose natural genetic variation underlies an observable trait or disease risk. Here we argue that new technologies now augment this historical approach by allowing the use of massively parallel assays in model systems to measure the functional effects of genetic variation in many human genes. These studies will help establish the disease risk of both observed and potential genetic variants and to overcome the problem of "variants of uncertain significance."

  8. Parallel sphere rendering

    SciTech Connect

    Krogh, M.; Painter, J.; Hansen, C.

    1996-10-01

    Sphere rendering is an important method for visualizing molecular dynamics data. This paper presents a parallel algorithm that is almost 90 times faster than current graphics workstations. To render extremely large data sets and large images, the algorithm uses the MIMD features of the supercomputers to divide up the data, render independent partial images, and then finally composite the multiple partial images using an optimal method. The algorithm and performance results are presented for the CM-5 and the M.

  9. Implementation of Parallel Algorithms

    DTIC Science & Technology

    1993-06-30

    their social relations or to achieve some goals. For example, we define a pair-wise force law of repulsion and attraction for a group of identical...quantization based compression schemes. Photo-refractive crystals, which provide high density recording in real time, are used as our holographic media. The...of Parallel Algorithms (J. Reif, ed.). Kluwer Academic Publishers, 1993. (4) "A Dynamic Separator Algorithm", D. Armon and J. Reif. To appear in

  10. Globality and speed of optical parallel processors.

    PubMed

    Lohmann, A W; Marathay, A S

    1989-09-15

    The chances of optical computing are probably best if a large number of processing elements act in parallel. The efficiency of parallel processors depends, among other things, on the time it takes to communicate signals from one processor to any other processor. In an optical parallel processor one hopes to be able to transmit a signal from one processor to any other processor within only one cycle period, no matter how far apart the processors are. Such a global communications network is desirable especially for algorithms with global interactions. The fast Fourier algorithm is an example. We define a degree of globality and we show how speed and globality are related. Our result applies to a specific architecture based on spatial filtering.

  11. Trajectory optimization using parallel shooting method on parallel computer

    SciTech Connect

    Wirthman, D.J.; Park, S.Y.; Vadali, S.R.

    1995-03-01

    The efficiency of a parallel shooting method on a parallel computer for solving a variety of optimal control guidance problems is studied. Several examples are considered to demonstrate that a speedup of nearly 7 to 1 is achieved with the use of 16 processors. It is suggested that further improvements in performance can be achieved by parallelizing in the state domain. 10 refs.

  12. Equalizer: a scalable parallel rendering framework.

    PubMed

    Eilemann, Stefan; Makhinya, Maxim; Pajarola, Renato

    2009-01-01

    Continuing improvements in CPU and GPU performance, as well as increasing multi-core processor and cluster-based parallelism, demand flexible and scalable parallel rendering solutions that can exploit multipipe hardware accelerated graphics. In fact, to achieve interactive visualization, scalable rendering systems are essential to cope with the rapid growth of data sets. However, parallel rendering systems are non-trivial to develop and often only application specific implementations have been proposed. The task of developing a scalable parallel rendering framework is even more difficult if it should be generic to support various types of data and visualization applications, and at the same time work efficiently on a cluster with distributed graphics cards. In this paper we introduce a novel system called Equalizer, a toolkit for scalable parallel rendering based on OpenGL which provides an application programming interface (API) to develop scalable graphics applications for a wide range of systems ranging from large distributed visualization clusters and multi-processor multipipe graphics systems to single-processor single-pipe desktop machines. We describe the system architecture, the basic API, discuss its advantages over previous approaches, present example configurations and usage scenarios as well as scalability results.

  13. The Galley Parallel File System

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. The interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. We discuss Galley's file structure and application interface, as well as an application that has been implemented using that interface.

  14. Asynchronous interpretation of parallel microprograms

    SciTech Connect

    Bandman, O.L.

    1984-03-01

    In this article, the authors demonstrate how to pass from a given synchronous interpretation of a parallel microprogram to an equivalent asynchronous interpretation, and investigate the cost associated with the rejection of external synchronization in parallel microprogram structures.

  15. Status of TRANSP Parallel Services

    NASA Astrophysics Data System (ADS)

    Indireshkumar, K.; Andre, Robert; McCune, Douglas; Randerson, Lewis

    2006-10-01

    The PPPL TRANSP code suite has been used successfully over many years to carry out time dependent simulations of tokamak plasmas. However, accurately modeling certain phenomena such as RF heating and fast ion behavior using TRANSP requires extensive computational power and will benefit from parallelization. Parallelizing all of TRANSP is not required and parts will run sequentially while other parts run parallelized. To efficiently use a site's parallel services, the parallelized TRANSP modules are deployed to a shared "parallel service" on a separate cluster. The PPPL Monte Carlo fast ion module NUBEAM and the MIT RF module TORIC are the first TRANSP modules to be so deployed. This poster will show the performance scaling of these modules within the parallel server. Communications between the serial client and the parallel server will be described in detail, and measurements of startup and communications overhead will be shown. Physics modeling benefits for TRANSP users will be assessed.

  16. Resistor Combinations for Parallel Circuits.

    ERIC Educational Resources Information Center

    McTernan, James P.

    1978-01-01

    To help simplify both teaching and learning of parallel circuits, a high school electricity/electronics teacher presents and illustrates the use of tables of values for parallel resistive circuits in which total resistances are whole numbers. (MF)
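
    A table of the kind described can be generated with a short script. The sketch below is not taken from the article. It assumes two resistors in parallel, uses the standard relation R = R1*R2 / (R1 + R2), and keeps only pairs whose total is a whole number; `whole_number_parallel_pairs` and `max_ohms` are hypothetical names.

```python
def whole_number_parallel_pairs(max_ohms):
    """List (R1, R2, R_total) with R1 <= R2 <= max_ohms and integer R_total."""
    rows = []
    for r1 in range(1, max_ohms + 1):
        for r2 in range(r1, max_ohms + 1):
            product, total = r1 * r2, r1 + r2
            # The parallel resistance R1*R2/(R1+R2) is whole only if
            # the sum divides the product exactly.
            if product % total == 0:
                rows.append((r1, r2, product // total))
    return rows


for r1, r2, r_total in whole_number_parallel_pairs(6):
    print(f"{r1} ohm || {r2} ohm = {r_total} ohm")
```

    With values up to 6 ohms this lists four combinations: 2 and 2 give 1, 3 and 6 give 2, 4 and 4 give 2, and 6 and 6 give 3.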

  17. Parallel Debugging Using Graphical Views

    DTIC Science & Technology

    1988-03-01

    Voyeur, a prototype system for creating graphical views of parallel programs, provides a cost-effective way to construct such views for any parallel...programming system. We illustrate Voyeur by discussing four views created for debugging Poker programs. One is a general trace facility for any Poker...Graphical views are essential for debugging parallel programs because of the large quantity of state information contained in parallel programs. Voyeur

  18. Parallel Pascal - An extended Pascal for parallel computers

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.

    1984-01-01

    Parallel Pascal is an extended version of the conventional serial Pascal programming language which includes a convenient syntax for specifying array operations. It is upward compatible with standard Pascal and involves only a small number of carefully chosen new features. Parallel Pascal was developed to reduce the semantic gap between standard Pascal and a large range of highly parallel computers. Two important design goals of Parallel Pascal were efficiency and portability. Portability is particularly difficult to achieve since different parallel computers frequently have very different capabilities.

  19. CSM parallel structural methods research

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.

    1989-01-01

    Parallel structural methods, research team activities, advanced architecture computers for parallel computational structural mechanics (CSM) research, the FLEX/32 multicomputer, a parallel structural analyses testbed, blade-stiffened aluminum panel with a circular cutout and the dynamic characteristics of a 60 meter, 54-bay, 3-longeron deployable truss beam are among the topics discussed.

  20. Roo: A parallel theorem prover

    SciTech Connect

    Lusk, E.L.; McCune, W.W.; Slaney, J.K.

    1991-11-01

    We describe a parallel theorem prover based on the Argonne theorem-proving system OTTER. The parallel system, called Roo, runs on shared-memory multiprocessors such as the Sequent Symmetry. We explain the parallel algorithm used and give performance results that demonstrate near-linear speedups on large problems.

  1. Parallel Eclipse Project Checkout

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas M.; Joswig, Joseph C.; Shams, Khawaja S.; Powell, Mark W.; Bachmann, Andrew G.

    2011-01-01

    Parallel Eclipse Project Checkout (PEPC) is a program written to leverage parallelism and to automate the checkout process of plug-ins created in Eclipse RCP (Rich Client Platform). Eclipse plug-ins can be aggregated in a feature project. This innovation digests a feature description (xml file) and automatically checks out all of the plug-ins listed in the feature. This resolves the issue of manually checking out each plug-in required to work on the project. To minimize the amount of time necessary to check out the plug-ins, this program makes the plug-in checkouts parallel. After parsing the feature, a checkout request is inserted for each plug-in in the feature. These requests are handled by a thread pool with a configurable number of threads. By checking out the plug-ins in parallel, the checkout process is streamlined before getting started on the project. For instance, projects that took 30 minutes to check out now take less than 5 minutes. The effect is especially clear on a Mac, which has a network monitor displaying the bandwidth use. When running the client from a developer's home, the checkout process now saturates the bandwidth in order to get all the plug-ins checked out as fast as possible. For comparison, a checkout process that ranged from 8-200 Kbps from a developer's home is now able to saturate a pipe of 1.3 Mbps, resulting in significantly faster checkouts. Eclipse IDE (integrated development environment) tries to build a project as soon as it is downloaded. As part of another optimization, this innovation programmatically tells Eclipse to stop building while checkouts are happening, which dramatically reduces lock contention and enables plug-ins to continue downloading until all of them finish. Furthermore, the software re-enables automatic building, and forces Eclipse to do a clean build once it finishes checking out all of the plug-ins. This software is fully generic and does not contain any NASA-specific code.
It can be applied to any
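
    The parse-then-dispatch pattern described in this record can be sketched as follows. This is a minimal illustration, not PEPC's actual code: the function names `plugin_ids` and `checkout_all`, the `checkout` callable, and the default of 8 threads are assumptions made for the example. It assumes the feature description lists each plug-in as a `<plugin id="...">` element, as Eclipse feature.xml files do.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def plugin_ids(feature_xml):
    """Return the id of every <plugin> element in a feature description."""
    root = ET.fromstring(feature_xml)
    return [p.get("id") for p in root.iter("plugin")]


def checkout_all(feature_xml, checkout, max_threads=8):
    """Run checkout(plugin_id) for every plug-in on a bounded thread pool.

    `checkout` is a hypothetical callable supplied by the caller (for
    example, a wrapper around a version-control client); it is not part
    of any real PEPC interface.
    """
    ids = plugin_ids(feature_xml)
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() returns results in input order; the calls overlap in time.
        results = list(pool.map(checkout, ids))
    return dict(zip(ids, results))
```

    Because a checkout is dominated by network waits, a thread pool is a reasonable fit here, and `max_threads` plays the role of the configurable thread count mentioned in the abstract.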

  2. Parallel sphere rendering

    SciTech Connect

    Krogh, M.; Hansen, C.; Painter, J.; de Verdiere, G.C.

    1995-05-01

    Sphere rendering is an important method for visualizing molecular dynamics data. This paper presents a parallel divide-and-conquer algorithm that is almost 90 times faster than current graphics workstations. To render extremely large data sets and large images, the algorithm uses the MIMD features of the supercomputers to divide up the data, render independent partial images, and then finally composite the multiple partial images using an optimal method. The algorithm and performance results are presented for the CM-5 and the T3D.
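
    The final compositing step can be illustrated with plain depth (z-buffer) compositing. The abstract does not say which compositing method the paper uses, so this is only a sketch of one standard approach. The representation of a partial image as a list of `(depth, color)` pairs and the names `composite_pair` and `composite` are assumptions made for the example.

```python
from functools import reduce

INF = float("inf")  # depth used for pixels a renderer did not touch


def composite_pair(a, b):
    """Merge two partial images pixel by pixel, keeping the nearer fragment."""
    return [pa if pa[0] <= pb[0] else pb for pa, pb in zip(a, b)]


def composite(partials):
    """Fold any number of equally sized partial images into one image."""
    return reduce(composite_pair, partials)
```

    Keeping the nearer fragment is associative, so on a parallel machine the same merge could be arranged as a tree rather than the sequential fold used here.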

  3. Parallelized direct execution simulation of message-passing parallel programs

    NASA Technical Reports Server (NTRS)

    Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

    1994-01-01

    As massively parallel computers proliferate, there is growing interest in finding ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing computers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, Large Application Parallel Simulation Environment (LAPSE), we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.

  4. A parallel world in the dark

    SciTech Connect

    Higaki, Tetsutaro; Jeong, Kwang Sik; Takahashi, Fuminobu

    2013-08-01

    The baryon-dark matter coincidence is a long-standing issue. Interestingly, the recent observations suggest the presence of dark radiation, which, if confirmed, would pose another coincidence problem of why the density of dark radiation is comparable to that of photons. These striking coincidences may be traced back to the dark sector with particle contents and interactions that are quite similar, if not identical, to the standard model: a dark parallel world. It naturally solves the coincidence problems of dark matter and dark radiation, and predicts a sterile neutrino(s) with mass of O(0.1−1) eV, as well as self-interacting dark matter made of the counterpart of ordinary baryons. We find a robust prediction for the relation between the abundance of dark radiation and the sterile neutrino, which can serve as the smoking-gun evidence of the dark parallel world.

  5. Parallelizing quantum circuit synthesis

    NASA Astrophysics Data System (ADS)

    Di Matteo, Olivia; Mosca, Michele

    2016-03-01

    Quantum circuit synthesis is the process in which an arbitrary unitary operation is decomposed into a sequence of gates from a universal set, typically one which a quantum computer can implement both efficiently and fault-tolerantly. As physical implementations of quantum computers improve, the need is growing for tools that can effectively synthesize components of the circuits and algorithms they will run. Existing algorithms for exact, multi-qubit circuit synthesis scale exponentially in the number of qubits and circuit depth, leaving synthesis intractable for circuits on more than a handful of qubits. Even modest improvements in circuit synthesis procedures may lead to significant advances, pushing forward the boundaries of not only the size of solvable circuit synthesis problems, but also in what can be realized physically as a result of having more efficient circuits. We present a method for quantum circuit synthesis using deterministic walks. Also termed pseudorandom walks, these are walks in which once a starting point is chosen, its path is completely determined. We apply our method to construct a parallel framework for circuit synthesis, and implement one such version performing optimal T-count synthesis over the Clifford+T gate set. We use our software to present examples where parallelization offers a significant speedup on the runtime, as well as directly confirm that the 4-qubit 1-bit full adder has optimal T-count 7 and T-depth 3.

  6. Parallel ptychographic reconstruction

    PubMed Central

    Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris

    2014-01-01

    Ptychography is an imaging method whereby a coherent beam is scanned across an object, and an image is obtained by iterative phasing of the set of diffraction patterns. It is able to be used to image extended objects at a resolution limited by scattering strength of the object and detector geometry, rather than at an optics-imposed limit. As technical advances allow larger fields to be imaged, computational challenges arise for reconstructing the correspondingly larger data volumes, yet at the same time there is also a need to deliver reconstructed images immediately so that one can evaluate the next steps to take in an experiment. Here we present a parallel method for real-time ptychographic phase retrieval. It uses a hybrid parallel strategy to divide the computation between multiple graphics processing units (GPUs) and then employs novel techniques to merge sub-datasets into a single complex phase and amplitude image. Results are shown on a simulated specimen and a real dataset from an X-ray experiment conducted at a synchrotron light source. PMID:25607174

  7. Tolerant (parallel) Programming

    NASA Technical Reports Server (NTRS)

    DiNucci, David C.; Bailey, David H. (Technical Monitor)

    1997-01-01

    In order to be truly portable, a program must be tolerant of a wide range of development and execution environments, and a parallel program is just one which must be tolerant of a very wide range. This paper first defines the term "tolerant programming", then describes many layers of tools to accomplish it. The primary focus is on F-Nets, a formal model for expressing computation as a folded partial-ordering of operations, thereby providing an architecture-independent expression of tolerant parallel algorithms. For implementing F-Nets, Cooperative Data Sharing (CDS) is a subroutine package for implementing communication efficiently in a large number of environments (e.g. shared memory and message passing). Software Cabling (SC), a very-high-level graphical programming language for building large F-Nets, possesses many of the features normally expected from today's computer languages (e.g. data abstraction, array operations). Finally, L2(sup 3) is a CASE tool which facilitates the construction, compilation, execution, and debugging of SC programs.

  8. A parallel, portable and versatile treecode

    SciTech Connect

    Warren, M.S.; Salmon, J.K.

    1994-10-01

    Portability and versatility are important characteristics of a computer program which is meant to be generally useful. We describe how we have developed a parallel N-body treecode to meet these goals. A variety of applications to which the code can be applied are mentioned. Performance of the program is also measured on several machines. A 512 processor Intel Paragon can solve for the forces on 10 million gravitationally interacting particles to 0.5% rms accuracy in 28.6 seconds.

  9. A systolic array parallelizing compiler

    SciTech Connect

    Tseng, P.S.

    1990-01-01

    This book presents a completely new approach to the problem of systolic array parallelizing compiler. It describes the AL parallelizing compiler for the Warp systolic array, the first working systolic array parallelizing compiler which can generate efficient parallel code for complete LINPACK routines. This book begins by analyzing the architectural strength of the Warp systolic array. It proposes a model for mapping programs onto the machine and introduces the notion of data relations for optimizing the program mapping. Also presented are successful applications of the AL compiler in matrix computation and image processing. A complete listing of the source program and compiler-generated parallel code are given to clarify the overall picture of the compiler. The book concludes that systolic array parallelizing compiler can produce efficient parallel code, almost identical to what the user would have written by hand.

  10. Parallel Computing in SCALE

    SciTech Connect

    DeHart, Mark D; Williams, Mark L; Bowman, Stephen M

    2010-01-01

    The SCALE computational architecture has remained basically the same since its inception 30 years ago, although constituent modules and capabilities have changed significantly. This SCALE concept was intended to provide a framework whereby independent codes can be linked to provide a more comprehensive capability than possible with the individual programs - allowing flexibility to address a wide variety of applications. However, the current system was designed originally for mainframe computers with a single CPU and with significantly less memory than today's personal computers. It has been recognized that the present SCALE computation system could be restructured to take advantage of modern hardware and software capabilities, while retaining many of the modular features of the present system. Preliminary work is being done to define specifications and capabilities for a more advanced computational architecture. This paper describes the state of current SCALE development activities and plans for future development. With the release of SCALE 6.1 in 2010, a new phase of evolutionary development will be available to SCALE users within the TRITON and NEWT modules. The SCALE (Standardized Computer Analyses for Licensing Evaluation) code system developed by Oak Ridge National Laboratory (ORNL) provides a comprehensive and integrated package of codes and nuclear data for a wide range of applications in criticality safety, reactor physics, shielding, isotopic depletion and decay, and sensitivity/uncertainty (S/U) analysis. Over the last three years, since the release of version 5.1 in 2006, several important new codes have been introduced within SCALE, and significant advances applied to existing codes. Many of these new features became available with the release of SCALE 6.0 in early 2009. However, beginning with SCALE 6.1, a first generation of parallel computing is being introduced. In addition to near-term improvements, a plan for longer term SCALE enhancement

  11. Toward Parallel Document Clustering

    SciTech Connect

    Mogill, Jace A.; Haglin, David J.

    2011-09-01

    A key challenge to automated clustering of documents in large text corpora is the high cost of comparing documents in a multimillion dimensional document space. The Anchors Hierarchy is a fast data structure and algorithm for localizing data based on a triangle inequality obeying distance metric. The algorithm strives to minimize the number of distance calculations needed to cluster the documents into “anchors” around reference documents called “pivots”. We extend the original algorithm to increase the amount of available parallelism and consider two implementations: a complex data structure which affords efficient searching, and a simple data structure which requires repeated sorting. The sorting implementation is integrated with a text corpora “Bag of Words” program and initial performance results of an end-to-end document processing workflow are reported.
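
    The way a triangle inequality saves distance calculations can be shown with a small sketch. This is not the Anchors Hierarchy itself, only the pruning rule such methods rely on: if the distance between the current best pivot p and another pivot q is at least 2 d(x, p), then d(x, q) >= d(p, q) - d(x, p) >= d(x, p), so q need not be evaluated. The function names, the Euclidean metric, and the evaluation counter are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_to_pivots(points, pivots, dist=euclidean):
    """Assign each point to its nearest pivot, skipping provably farther pivots.

    Returns (assignments, evaluated), where `evaluated` counts the
    point-to-pivot distance calculations actually performed.
    """
    k = len(pivots)
    # Pivot-to-pivot distances are computed once and reused for every point.
    pp = [[dist(pivots[i], pivots[j]) for j in range(k)] for i in range(k)]
    assignments, evaluated = [], 0
    for x in points:
        best, best_d = 0, dist(x, pivots[0])
        evaluated += 1
        for j in range(1, k):
            # Triangle inequality: pivot j cannot be closer than the best one.
            if pp[best][j] >= 2 * best_d:
                continue
            d = dist(x, pivots[j])
            evaluated += 1
            if d < best_d:
                best, best_d = j, d
        assignments.append(best)
    return assignments, evaluated
```

    The savings grow when points sit close to their pivots relative to the spacing between pivots, which is the situation a good choice of pivots aims for.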

  12. Parallel Polarization State Generation

    NASA Astrophysics Data System (ADS)

    She, Alan; Capasso, Federico

    2016-05-01

    The control of polarization, an essential property of light, is of wide scientific and technological interest. The general problem of generating arbitrary time-varying states of polarization (SOP) has always been mathematically formulated by a series of linear transformations, i.e. a product of matrices, imposing a serial architecture. Here we show a parallel architecture described by a sum of matrices. The theory is experimentally demonstrated by modulating spatially-separated polarization components of a laser using a digital micromirror device that are subsequently beam combined. This method greatly expands the parameter space for engineering devices that control polarization. Consequently, performance characteristics, such as speed, stability, and spectral range, are entirely dictated by the technologies of optical intensity modulation, including absorption, reflection, emission, and scattering. This opens up important prospects for polarization state generation (PSG) with unique performance characteristics with applications in spectroscopic ellipsometry, spectropolarimetry, communications, imaging, and security.

  13. Parallel Polarization State Generation

    PubMed Central

    She, Alan; Capasso, Federico

    2016-01-01

    The control of polarization, an essential property of light, is of wide scientific and technological interest. The general problem of generating arbitrary time-varying states of polarization (SOP) has always been mathematically formulated by a series of linear transformations, i.e. a product of matrices, imposing a serial architecture. Here we show a parallel architecture described by a sum of matrices. The theory is experimentally demonstrated by modulating spatially-separated polarization components of a laser using a digital micromirror device that are subsequently beam combined. This method greatly expands the parameter space for engineering devices that control polarization. Consequently, performance characteristics, such as speed, stability, and spectral range, are entirely dictated by the technologies of optical intensity modulation, including absorption, reflection, emission, and scattering. This opens up important prospects for polarization state generation (PSG) with unique performance characteristics with applications in spectroscopic ellipsometry, spectropolarimetry, communications, imaging, and security. PMID:27184813

  14. A parallel programming environment supporting multiple data-parallel modules

    SciTech Connect

    Seevers, B.K.; Quinn, M.J.; Hatcher, P.J.

    1992-10-01

    We describe a system that allows programmers to take advantage of both control and data parallelism through multiple intercommunicating data-parallel modules. This programming environment extends C-type stream I/O to include intermodule communication channels. The programmer writes each module as a separate data-parallel program, then develops a channel linker specification describing how to connect the modules together. A channel linker we have developed loads the separate modules on the parallel machine and binds the communication channels together as specified. We present performance data demonstrating that a mixed control- and data-parallel solution can yield better performance than a strictly data-parallel solution. The system described currently runs on the Intel iWarp multicomputer.

  15. Parallel imaging microfluidic cytometer.

    PubMed

    Ehrlich, Daniel J; McKenna, Brian K; Evans, James G; Belkina, Anna C; Denis, Gerald V; Sherr, David H; Cheung, Man Ching

    2011-01-01

    By adding an additional degree of freedom from multichannel flow, the parallel microfluidic cytometer (PMC) combines some of the best features of fluorescence-activated flow cytometry (FCM) and microscope-based high-content screening (HCS). The PMC (i) lends itself to fast processing of large numbers of samples, (ii) adds a 1D imaging capability for intracellular localization assays (HCS), (iii) has a high rare-cell sensitivity, and (iv) has an unusual capability for time-synchronized sampling. An inability to practically handle large sample numbers has restricted applications of conventional flow cytometers and microscopes in combinatorial cell assays, network biology, and drug discovery. The PMC promises to relieve a bottleneck in these previously constrained applications. The PMC may also be a powerful tool for finding rare primary cells in the clinic. The multichannel architecture of current PMC prototypes allows 384 unique samples for a cell-based screen to be read out in ∼6-10 min, about 30 times the speed of most current FCM systems. In 1D intracellular imaging, the PMC can obtain protein localization using HCS marker strategies at many times the sample throughput of charge-coupled device (CCD)-based microscopes or CCD-based single-channel flow cytometers. The PMC also permits the signal integration time to be varied over a larger range than is practical in conventional flow cytometers. The signal-to-noise advantages are useful, for example, in counting rare positive cells in the most difficult early stages of genome-wide screening. We review the status of parallel microfluidic cytometry and discuss some of the directions the new technology may take.

  16. Initial results of a model rotor higher harmonic control (HHC) wind tunnel experiment on BVI impulsive noise reduction

    NASA Astrophysics Data System (ADS)

    Splettstoesser, W. R.; Lehmann, G.; van der Wall, B.

    1989-09-01

    Initial acoustic results are presented from a higher harmonic control (HHC) wind tunnel pilot experiment on helicopter rotor blade-vortex interaction (BVI) impulsive noise reduction, making use of the DFVLR 40-percent-scaled BO-105 research rotor in the DNW 6m by 8m closed test section. Considerable noise reduction (of several decibels) has been measured for particular HHC control settings, but at the cost of increased vibration levels, and vice versa. The apparently adverse results for noise and vibration reduction by HHC are explained. At optimum pitch control settings for BVI noise reduction, rotor simulation results demonstrate that blade loading at the outer tip region is decreased, while vortex strength and blade vortex miss-distance are increased, resulting altogether in reduced BVI noise generation. At optimum pitch control settings for vibration reduction, adverse effects on blade loading, vortex strength and blade vortex miss-distance are found.

  17. "Serial" effects in parallel models of reading.

    PubMed

    Chang, Ya-Ning; Furber, Steve; Welbourne, Stephen

    2012-06-01

    There is now considerable evidence showing that the time to read a word out loud is influenced by an interaction between orthographic length and lexicality. Given that length effects are interpreted by advocates of dual-route models as evidence of serial processing, this would seem to pose a serious challenge to models of single word reading which postulate a common parallel processing mechanism for reading both words and nonwords (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Rastle, Havelka, Wydell, Coltheart, & Besner, 2009). However, an alternative explanation of these data is that visual processes outside the scope of existing parallel models are responsible for generating the word-length related phenomena (Seidenberg & Plaut, 1998). Here we demonstrate that a parallel model of single word reading can account for the differential word-length effects found in the naming latencies of words and nonwords, provided that it includes a mapping from visual to orthographic representations, and that the nature of those orthographic representations is not preconstrained. The model can also simulate other supposedly "serial" effects. The overall findings were consistent with the view that visual processing contributes substantially to the word-length effects in normal reading and provided evidence to support the single-route theory which assumes words and nonwords are processed in parallel by a common mechanism.

  18. Parallel processor engine model program

    NASA Technical Reports Server (NTRS)

    Mclaughlin, P.

    1984-01-01

    The Parallel Processor Engine Model Program is a generalized engineering tool intended to aid in the design of parallel processing real-time simulations of turbofan engines. It is written in the FORTRAN programming language and executes as a subset of the SOAPP simulation system. Input/output and execution control are provided by SOAPP; however, the analysis, emulation and simulation functions are completely self-contained. A framework in which a wide variety of parallel processing architectures could be evaluated and tools with which the parallel implementation of a real-time simulation technique could be assessed are provided.

  19. Parallel processing and expert systems

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Lau, Sonie

    1991-01-01

    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 90's cannot enjoy an increased level of autonomy without the efficient use of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real time demands are met for large expert systems. Speed-up via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial labs in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems was surveyed. The survey is divided into three major sections: (1) multiprocessors for parallel expert systems; (2) parallel languages for symbolic computations; and (3) measurements of parallelism of expert system. Results to date indicate that the parallelism achieved for these systems is small. In order to obtain greater speed-ups, data parallelism and application parallelism must be exploited.

  20. Experiment versus theory

    NASA Technical Reports Server (NTRS)

    Schmitz, F. H.; Yu, Y. H.; Boxwell, D. A.

    1982-01-01

    High speed compressibility noise and vortex interaction noise, which are aerodynamically generated noise sources, were investigated. Noise generating mechanisms were identified. Linear and nonlinear theory were compared and are in agreement with data on amplitude and wave forms. The interaction area between the acoustic planform and blade/vortex interaction lines are examined.

  1. Using Motivational Interviewing Techniques to Address Parallel Process in Supervision

    ERIC Educational Resources Information Center

    Giordano, Amanda; Clarke, Philip; Borders, L. DiAnne

    2013-01-01

    Supervision offers a distinct opportunity to experience the interconnection of counselor-client and counselor-supervisor interactions. One product of this network of interactions is parallel process, a phenomenon by which counselors unconsciously identify with their clients and subsequently present to their supervisors in a similar fashion…

  2. Parallel Programming in the Age of Ubiquitous Parallelism

    NASA Astrophysics Data System (ADS)

    Pingali, Keshav

    2014-04-01

    Multicore and manycore processors are now ubiquitous, but parallel programming remains as difficult as it was 30-40 years ago. During this time, our community has explored many promising approaches including functional and dataflow languages, logic programming, and automatic parallelization using program analysis and restructuring, but none of these approaches has succeeded except in a few niche application areas. In this talk, I will argue that these problems arise largely from the computation-centric foundations and abstractions that we currently use to think about parallelism. In their place, I will propose a novel data-centric foundation for parallel programming called the operator formulation in which algorithms are described in terms of actions on data. The operator formulation shows that a generalized form of data-parallelism called amorphous data-parallelism is ubiquitous even in complex, irregular graph applications such as mesh generation/refinement/partitioning and SAT solvers. Regular algorithms emerge as a special case of irregular ones, and many application-specific optimization techniques can be generalized to a broader context. The operator formulation also leads to a structural analysis of algorithms called TAO-analysis that provides implementation guidelines for exploiting parallelism efficiently. Finally, I will describe a system called Galois based on these ideas for exploiting amorphous data-parallelism on multicores and GPUs.

  3. Trajectories in parallel optics.

    PubMed

    Klapp, Iftach; Sochen, Nir; Mendlovic, David

    2011-10-01

    In our previous work we showed the ability to improve the optical system's matrix condition by optical design, thereby improving its robustness to noise. It was shown that by using singular value decomposition, a target point-spread function (PSF) matrix can be defined for an auxiliary optical system, which works parallel to the original system to achieve such an improvement. In this paper, after briefly introducing the all optics implementation of the auxiliary system, we show a method to decompose the target PSF matrix. This is done through a series of shifted responses of auxiliary optics (named trajectories), where a complicated hardware filter is replaced by postprocessing. This process manipulates the pixel confined PSF response of simple auxiliary optics, which in turn creates an auxiliary system with the required PSF matrix. This method is simulated on two space variant systems and reduces their system condition number from 18,598 to 197 and from 87,640 to 5.75, respectively. We perform a study of the latter result and show significant improvement in image restoration performance, in comparison to a system without auxiliary optics and to other previously suggested hybrid solutions. Image restoration results show that in a range of low signal-to-noise ratio values, the trajectories method gives a significant advantage over alternative approaches. A third space invariant study case is explored only briefly, and we present a significant improvement in the matrix condition number from 1.9160e+013 to 34,526.

  4. High Performance Parallel Architectures

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek; Kaewpijit, Sinthop

    1998-01-01

    Traditional remote sensing instruments are multispectral, where observations are collected at a few different spectral bands. Recently, many hyperspectral instruments that can collect observations at hundreds of bands have become operational. Furthermore, there have been ongoing research efforts on ultraspectral instruments that can produce observations at thousands of spectral bands. While these remote sensing technology developments hold great promise for new findings in the area of Earth and space science, they present many challenges. These include the need for faster processing of such increased data volumes, and methods for data reduction. Dimension Reduction is a spectral transformation, aimed at concentrating the vital information and discarding redundant data. One such transformation, which is widely used in remote sensing, is the Principal Components Analysis (PCA). This report summarizes our progress on the development of a parallel PCA and its implementation on two Beowulf cluster configurations: one with a Fast Ethernet switch and the other with a Myrinet interconnection. Details of the implementation and performance results, for typical sets of multispectral and hyperspectral NASA remote sensing data, are presented and analyzed based on the algorithm requirements and the underlying machine configuration. It will be shown that the PCA application is quite challenging and hard to scale on Ethernet-based clusters. However, the measurements also show that a high-performance interconnection network, such as Myrinet, better matches the high communication demand of PCA and can lead to a more efficient PCA execution.
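
    One common way to parallelize the first stage of PCA is to let each worker accumulate partial sums over its share of the pixels and then reduce those sums into a mean vector and covariance matrix. The abstract does not describe the report's actual decomposition, so the sketch below is only an assumed illustration; `partial_sums`, `parallel_covariance`, and the `workers` parameter are hypothetical names. The reduction step is also where a cluster version would have to communicate, which is consistent with the interconnect sensitivity the abstract reports.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sums(chunk):
    """Return (count, per-band sums, sums of band products) for one chunk."""
    bands = len(chunk[0])
    s = [0.0] * bands
    ss = [[0.0] * bands for _ in range(bands)]
    for pixel in chunk:
        for i in range(bands):
            s[i] += pixel[i]
            for j in range(bands):
                ss[i][j] += pixel[i] * pixel[j]
    return len(chunk), s, ss


def parallel_covariance(pixels, workers=4):
    """Population mean and covariance of pixel spectra via chunked partial sums."""
    chunks = [pixels[i::workers] for i in range(workers)]
    chunks = [c for c in chunks if c]  # drop empty chunks for small inputs
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial_sums, chunks))
    # Reduction: combine the per-chunk sums into global statistics.
    n = sum(p[0] for p in parts)
    bands = len(pixels[0])
    mean = [sum(p[1][i] for p in parts) / n for i in range(bands)]
    cov = [[sum(p[2][i][j] for p in parts) / n - mean[i] * mean[j]
            for j in range(bands)] for i in range(bands)]
    return mean, cov
```

    The eigen-decomposition of `cov`, which yields the principal components, is omitted. Threads are used only to show the structure: in CPython this arithmetic does not speed up under threads, and a cluster implementation would use separate processes with message passing.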

  5. Mapping between parallel processor structures and programs

    NASA Technical Reports Server (NTRS)

    Ngai, Tin-Fook; Yan, Jerry C.; Mak, Victor W. K.; Flynn, Michael J.; Lundstrom, Stephen F.

    1987-01-01

    This paper reports some ongoing research efforts at Stanford in allocation of parallel processing resources. Both processor structures and program structures have their own characteristics. Resource allocation binds the two structures during program execution. The mapping problem determines what processor structure and program structure may be combined to obtain maximum speedup. Three approaches to this mapping problem are considered. Two important factors, granularity and interaction delay, are also considered. A new hierarchical approach to structure definition is outlined. Effective and efficient tools are necessary for the study of the mapping problem. A fast turn-around simulation environment developed for investigating partition strategies for distributed computations and a computationally efficient method to predict performance of parallel processor structures are described.

  6. Parallel Computational Protein Design

    PubMed Central

    Zhou, Yichao; Donald, Bruce R.; Zeng, Jianyang

    2016-01-01

    Computational structure-based protein design (CSPD) is an important problem in computational biology, which aims to design or improve a prescribed protein function based on a protein structure template. It provides a practical tool for real-world protein engineering applications. A popular CSPD method that is guaranteed to find the global minimum energy solution (GMEC) is to combine both dead-end elimination (DEE) and A* tree search algorithms. However, in this framework, the A* search algorithm can run in exponential time in the worst case, which may become the computation bottleneck of the large-scale computational protein design process. To address this issue, we extend and add a new module to the OSPREY program that was previously developed in the Donald lab [1] to implement a GPU-based massively parallel A* algorithm for improving the protein design pipeline. By exploiting the modern GPU computational framework and optimizing the computation of the heuristic function for A* search, our new program, called gOSPREY, can provide up to four orders of magnitude speedups in large protein design cases with a small memory overhead compared to the traditional A* search algorithm implementation, while still guaranteeing the optimality. In addition, gOSPREY can be configured to run in a bounded-memory mode to tackle the problems in which the conformation space is too large and the global optimal solution could not previously be computed. Furthermore, the GPU-based A* algorithm implemented in the gOSPREY program can be combined with the state-of-the-art rotamer pruning algorithms such as iMinDEE [2] and DEEPer [3] to also consider continuous backbone and side-chain flexibility. PMID:27914056

  7. A Parallel Particle Swarm Optimizer

    DTIC Science & Technology

    2003-01-01

    by a computationally demanding biomechanical system identification problem, we introduce a parallel implementation of a stochastic population based...concurrent computation. The parallelization of the Particle Swarm Optimization (PSO) algorithm is detailed and its performance and characteristics demonstrated for the biomechanical system identification problem as example.
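
    For orientation, a minimal serial sketch of a standard global-best particle swarm optimizer follows. It is not the report's parallel implementation: the inertia weight `w`, the coefficients `c1` and `c2`, the search bounds, and the `sphere` test function are all assumed values chosen for illustration. In a parallel version, the per-particle fitness evaluations inside the loop are the natural unit of work to distribute.

```python
import random

def pso(fitness, dim, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0, seed=0):
    """Minimize `fitness` with a basic global-best PSO (assumed parameters)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [fitness(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])             # the expensive, parallelizable step
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val

def sphere(x):
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_f = pso(sphere, dim=2)
```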

  8. Programming parallel architectures: The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1988-01-01

    Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.

  9. Tile-based Level of Detail for the Parallel Age

    SciTech Connect

    Niski, K; Cohen, J D

    2007-08-15

    Today's PCs incorporate multiple CPUs and GPUs and are easily arranged in clusters for high-performance, interactive graphics. We present an approach based on hierarchical, screen-space tiles to parallelizing rendering with level of detail. Adapt tiles, render tiles, and machine tiles are associated with CPUs, GPUs, and PCs, respectively, to efficiently parallelize the workload with good resource utilization. Adaptive tile sizes provide load balancing while our level of detail system allows total and independent management of the load on CPUs and GPUs. We demonstrate our approach on parallel configurations consisting of both single PCs and a cluster of PCs.

  10. Parallel Grid Manipulations in Earth Science Calculations

    NASA Technical Reports Server (NTRS)

    Sawyer, W.; Lucchesi, R.; daSilva, A.; Takacs, L. L.

    1999-01-01

    The National Aeronautics and Space Administration (NASA) Data Assimilation Office (DAO) at the Goddard Space Flight Center is moving its data assimilation system to massively parallel computing platforms. This parallel implementation of GEOS DAS will be used in the DAO's normal activities, which include reanalysis of data, and operational support for flight missions. Key components of GEOS DAS, including the gridpoint-based general circulation model and a data analysis system, are currently being parallelized. The parallelization of GEOS DAS is also one of the HPCC Grand Challenge Projects. The GEOS-DAS software employs several distinct grids. Some examples are: an observation grid- an unstructured grid of points at which observed or measured physical quantities from instruments or satellites are associated- a highly-structured latitude-longitude grid of points spanning the earth at given latitude-longitude coordinates at which prognostic quantities are determined, and a computational lat-lon grid in which the pole has been moved to a different location to avoid computational instabilities. Each of these grids has a different structure and number of constituent points. In spite of that, there are numerous interactions between the grids, e.g., values on one grid must be interpolated to another, or, in other cases, grids need to be redistributed on the underlying parallel platform. The DAO has designed a parallel integrated library for grid manipulations (PILGRIM) to support the needed grid interactions with maximum efficiency. It offers a flexible interface to generate new grids, define transformations between grids and apply them. Basic communication is currently MPI, however the interfaces defined here could conceivably be implemented with other message-passing libraries, e.g., Cray SHMEM, or with shared-memory constructs. The library is written in Fortran 90. 
    First performance results indicate that even difficult problems, such as above-mentioned pole rotation- a

  11. The Galley Parallel File System

    NASA Technical Reports Server (NTRS)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley's file structure and application interface, as well as the performance advantages offered by that interface.

  12. Parallel contingency statistics with Titan.

    SciTech Connect

    Thompson, David C.; Pebay, Philippe Pierre

    2009-09-01

    This report summarizes existing statistical engines in VTK/Titan and presents the recently parallelized contingency statistics engine. It is a sequel to [PT08] and [BPRT09], which studied the parallel descriptive, correlative, multi-correlative, and principal component analysis engines. The ease of use of this new parallel engine is illustrated by means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; however, the very nature of contingency tables prevents this new engine from exhibiting optimal parallel speed-up as the aforementioned engines do. This report therefore discusses the design trade-offs we made and studies performance with up to 200 processors.

  13. Parallel NPARC: Implementation and Performance

    NASA Technical Reports Server (NTRS)

    Townsend, S. E.

    1996-01-01

    Version 3 of the NPARC Navier-Stokes code includes support for large-grain (block level) parallelism using explicit message passing between a heterogeneous collection of computers. This capability has the potential for significant performance gains, depending upon the block data distribution. The parallel implementation uses a master/worker arrangement of processes. The master process assigns blocks to workers, controls worker actions, and provides remote file access for the workers. The processes communicate via explicit message passing using an interface library which provides portability to a number of message passing libraries, such as PVM (Parallel Virtual Machine). A Bourne shell script is used to simplify the task of selecting hosts, starting processes, retrieving remote files, and terminating a computation. This script also provides a simple form of fault tolerance. An analysis of the computational performance of NPARC is presented, using data sets from an F/A-18 inlet study and a Rocket Based Combined Cycle Engine analysis. Parallel speedup and overall computational efficiency were obtained for various NPARC run parameters on a cluster of IBM RS6000 workstations. The data show that although NPARC performance compares favorably with the estimated potential parallelism, typical data sets used with previous versions of NPARC will often need to be reblocked for optimum parallel performance. In one of the cases studied, reblocking increased peak parallel speedup from 3.2 to 11.8.

  14. Parallel processing and expert systems

    NASA Technical Reports Server (NTRS)

    Lau, Sonie; Yan, Jerry C.

    1991-01-01

    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited.

  15. Parallel integer sorting with medium and fine-scale parallelism

    NASA Technical Reports Server (NTRS)

    Dagum, Leonardo

    1993-01-01

    Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort is designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
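
    The abstract does not describe queue-sort or barrel-sort in enough detail to reproduce them, so the sketch below shows only the baseline it names: a serial bucket (counting) sort for non-negative integer keys with a known maximum. The function name and the `max_key` parameter are assumptions for illustration.

```python
def bucket_sort(keys, max_key):
    """Sort non-negative integers in [0, max_key] using one bucket per key value."""
    counts = [0] * (max_key + 1)
    for k in keys:
        counts[k] += 1              # tally how many times each key occurs
    out = []
    for value, n in enumerate(counts):
        out.extend([value] * n)     # emit each key value in ascending order
    return out

sorted_keys = bucket_sort([3, 1, 2, 3, 0], max_key=3)
```

    The tallying loop is the part that parallel integer sorts typically distribute, with each processor counting or receiving a subrange of the keys.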

  16. EFFICIENT SCHEDULING OF PARALLEL JOBS ON MASSIVELY PARALLEL SYSTEMS

    SciTech Connect

    F. PETRINI; W. FENG

    1999-09-01

    We present buffered coscheduling, a new methodology to multitask parallel jobs in a message-passing environment and to develop parallel programs that can pave the way to the efficient implementation of a distributed operating system. Buffered coscheduling is based on three innovative techniques: communication buffering, strobing, and non-blocking communication. By leveraging these techniques, we can perform effective optimizations based on the global status of the parallel machine rather than on the limited knowledge available locally to each processor. The advantages of buffered coscheduling include higher resource utilization, reduced communication overhead, efficient implementation of flow-control strategies and fault-tolerant protocols, accurate performance modeling, and a simplified yet still expressive parallel programming model. Preliminary experimental results show that buffered coscheduling is very effective in increasing the overall performance in the presence of load imbalance and communication-intensive workloads.

  17. Template based parallel checkpointing in a massively parallel computer system

    DOEpatents

    Archer, Charles Jens; Inglett, Todd Alan

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
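
    The idea of comparing each node's checkpoint against a stored template can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the 4096-byte block size, the use of SHA-256 as the checksum, zlib as the lossless compressor, and every function name are hypothetical, and the rsync-style protocol and network broadcast are not modeled.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one

def block_checksums(data, block_size=BLOCK_SIZE):
    """Checksum of every fixed-size block of the template checkpoint."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]

def delta_against_template(node_data, template_sums, block_size=BLOCK_SIZE):
    """Keep only blocks whose checksum differs from the template's block."""
    delta = {}
    for index, start in enumerate(range(0, len(node_data), block_size)):
        block = node_data[start:start + block_size]
        digest = hashlib.sha256(block).digest()
        if index >= len(template_sums) or digest != template_sums[index]:
            delta[index] = (digest, zlib.compress(block))
    return delta

def restore(template, delta, length, block_size=BLOCK_SIZE):
    """Rebuild a node's checkpoint from the template plus its stored delta."""
    blocks = []
    for index, start in enumerate(range(0, length, block_size)):
        if index in delta:
            blocks.append(zlib.decompress(delta[index][1]))
        else:
            blocks.append(template[start:start + block_size])
    return b"".join(blocks)

template = bytes(4 * BLOCK_SIZE)           # hypothetical template checkpoint
node = bytearray(template)
node[5000] = 1                             # one byte differs, in block 1
node = bytes(node)
delta = delta_against_template(node, block_checksums(template))
```

    Only the single changed block ends up in `delta`, which is the source of the reduction in transmitted and stored data that the abstract describes.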

  18. Adaptive, multiresolution visualization of large data sets using parallel octrees.

    SciTech Connect

    Freitag, L. A.; Loy, R. M.

    1999-06-10

    The interactive visualization and exploration of large scientific data sets is a challenging and difficult task; their size often far exceeds the performance and memory capacity of even the most powerful graphics work-stations. To address this problem, we have created a technique that combines hierarchical data reduction methods with parallel computing to allow interactive exploration of large data sets while retaining full-resolution capability. The hierarchical representation is built in parallel by strategically inserting field data into an octree data structure. We provide functionality that allows the user to interactively adapt the resolution of the reduced data sets so that resolution is increased in regions of interest without sacrificing local graphics performance. We describe the creation of the reduced data sets using a parallel octree, the software architecture of the system, and the performance of this system on the data from a Rayleigh-Taylor instability simulation.

  19. Parallel Architecture For Robotics Computation

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Bejczy, Antal K.

    1990-01-01

    Universal Real-Time Robotic Controller and Simulator (URRCS) is highly parallel computing architecture for control and simulation of robot motion. Result of extensive algorithmic study of different kinematic and dynamic computational problems arising in control and simulation of robot motion. Study led to development of class of efficient parallel algorithms for these problems. Represents algorithmically specialized architecture, in sense capable of exploiting common properties of this class of parallel algorithms. System with both MIMD and SIMD capabilities. Regarded as processor attached to bus of external host processor, as part of bus memory.

  20. Multigrid on massively parallel architectures

    SciTech Connect

    Falgout, R D; Jones, J E

    1999-09-17

    The scalable implementation of multigrid methods for machines with several thousands of processors is investigated. Parallel performance models are presented for three different structured-grid multigrid algorithms, and a description is given of how these models can be used to guide implementation. Potential pitfalls are illustrated when moving from moderate-sized parallelism to large-scale parallelism, and results are given from existing multigrid codes to support the discussion. Finally, the use of mixed programming models is investigated for multigrid codes on clusters of SMPs.

  1. IOPA: I/O-aware parallelism adaption for parallel programs

    PubMed Central

    Liu, Tao; Liu, Yi; Qian, Chen; Qian, Depei

    2017-01-01

    With the development of multi-/many-core processors, applications need to be written as parallel programs to improve execution efficiency. For data-intensive applications that use multiple threads to read/write files simultaneously, an I/O sub-system can easily become a bottleneck when too many of these types of threads exist; on the contrary, too few threads will cause insufficient resource utilization and hurt performance. Therefore, programmers must pay much attention to parallelism control to find the appropriate number of I/O threads for an application. This paper proposes a parallelism control mechanism named IOPA that can adjust the parallelism of applications to adapt to the I/O capability of a system and balance computing resources and I/O bandwidth. The programming interface of IOPA is also provided to programmers to simplify parallel programming. IOPA is evaluated using multiple applications with both solid state and hard disk drives. The results show that the parallel applications using IOPA can achieve higher efficiency than those with a fixed number of threads. PMID:28278236

  2. Appendix E: Parallel Pascal development system

    NASA Technical Reports Server (NTRS)

    1985-01-01

    The Parallel Pascal Development System enables Parallel Pascal programs to be developed and tested on a conventional computer. It consists of several system programs, including a Parallel Pascal to standard Pascal translator, and a library of Parallel Pascal subprograms. The library includes subprograms for using Parallel Pascal on a parallel system with a fixed degree of parallelism, such as the Massively Parallel Processor, to conveniently manipulate arrays which have dimensions larger than the hardware. Programs can be conveniently tested with small arrays on the conventional computer before attempting to run on a parallel system.

  3. Evaluation of the Interactions between Water Extractable Soil Organic Matter and Metal Cations (Cu(II), Eu(III)) Using Excitation-Emission Matrix Combined with Parallel Factor Analysis

    PubMed Central

    Wei, Jing; Han, Lu; Song, Jing; Chen, Mengfang

    2015-01-01

    The objectives of this study were to evaluate the binding behavior of Cu(II) and Eu(III) with water extractable organic matter (WEOM) in soil, and assess the competitive effect of the cations. Excitation-emission matrix (EEM) fluorescence spectrometry was used in combination with parallel factor analysis (PARAFAC) to obtain four WEOM components: fulvic-like, humic-like, microbial degraded humic-like, and protein-like substances. Fluorescence titration experiments were performed to obtain the binding parameters of PARAFAC-derived components with Cu(II) and Eu(III). The conditional complexation stability constants (logKM) of Cu(II) with the four components ranged from 5.49 to 5.94, and the Eu(III) logKM values were between 5.26 and 5.81. The component-specific binding parameters obtained from competitive binding experiments revealed that Cu(II) and Eu(III) competed for the same binding sites on the WEOM components. These results would help understand the molecular binding mechanisms of Cu(II) and Eu(III) with WEOM in the soil environment. PMID:26121300

  4. Parallel Strategies for Crash and Impact Simulations

    SciTech Connect

    Attaway, S.; Brown, K.; Hendrickson, B.; Plimpton, S.

    1998-12-07

    We describe a general strategy we have found effective for parallelizing solid mechanics simulations. Such simulations often have several computationally intensive parts, including finite element integration, detection of material contacts, and particle interaction if smoothed particle hydrodynamics is used to model highly deforming materials. The need to balance all of these computations simultaneously is a difficult challenge that has kept many commercial and government codes from being used effectively on parallel supercomputers with hundreds or thousands of processors. Our strategy is to load-balance each of the significant computations independently with whatever balancing technique is most appropriate. The chief benefit is that each computation can be scalably parallelized. The drawback is the data exchange between processors and extra coding that must be written to maintain multiple decompositions in a single code. We discuss these trade-offs and give performance results showing this strategy has led to a parallel implementation of a widely-used solid mechanics code that can now be run efficiently on thousands of processors of the Pentium-based Sandia/Intel TFLOPS machine. We illustrate with several examples the kinds of high-resolution, million-element models that can now be simulated routinely. We also look to the future and discuss what possibilities this new capability promises, as well as the new set of challenges it poses in material models, computational techniques, and computing infrastructure.

  5. New NAS Parallel Benchmarks Results

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; Saphir, William; VanderWijngaart, Rob; Woo, Alex; Kutler, Paul (Technical Monitor)

    1997-01-01

    NPB2 (NAS (NASA Advanced Supercomputing) Parallel Benchmarks 2) is an implementation, based on Fortran and the MPI (message passing interface) message passing standard, of the original NAS Parallel Benchmark specifications. NPB2 programs are run with little or no tuning, in contrast to NPB vendor implementations, which are highly optimized for specific architectures. NPB2 results complement, rather than replace, NPB results. Because they have not been optimized by vendors, NPB2 implementations approximate the performance a typical user can expect for a portable parallel program on distributed memory parallel computers. Together these results provide an insightful comparison of the real-world performance of high-performance computers. New NPB2 features: New implementation (CG), new workstation class problem sizes, new serial sample versions, more performance statistics.

  6. "Feeling" Series and Parallel Resistances.

    ERIC Educational Resources Information Center

    Morse, Robert A.

    1993-01-01

    Equipped with drinking straws and stirring straws, a teacher can help students understand how resistances in electric circuits combine in series and in parallel. Follow-up suggestions are provided. (ZWH)
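
    The combination rules the activity is meant to convey are the standard ones: series resistances add, and parallel resistances combine through the sum of their reciprocals. A short sketch with assumed example values (the function names are illustrative, not from the article):

```python
def series(*resistances):
    """Equivalent resistance of resistors in series: R = R1 + R2 + ..."""
    return sum(resistances)

def parallel(*resistances):
    """Equivalent resistance in parallel: 1/R = 1/R1 + 1/R2 + ..."""
    return 1.0 / sum(1.0 / r for r in resistances)

r_series = series(100, 200)        # two resistors end to end
r_parallel = parallel(100, 100)    # two equal resistors side by side
```

    Two equal resistors in parallel give half the resistance of one, much as two identical straws side by side are easier to blow through than one.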

  7. Demonstrating Forces between Parallel Wires.

    ERIC Educational Resources Information Center

    Baker, Blane

    2000-01-01

    Describes a physics demonstration that dramatically illustrates the mutual repulsion (attraction) between parallel conductors using insulated copper wire, wooden dowels, a high direct current power supply, electrical tape, and an overhead projector. (WRM)

  8. Parallel programming of industrial applications

    SciTech Connect

    Heroux, M; Koniges, A; Simon, H

    1998-07-21

    In the introductory material, we overview the typical MPP environment for real application computing and the special tools available such as parallel debuggers and performance analyzers. Next, we draw from a series of real applications codes and discuss the specific challenges and problems that are encountered in parallelizing these individual applications. The application areas drawn from include biomedical sciences, materials processing and design, plasma and fluid dynamics, and others. We show how it was possible to get a particular application to run efficiently and what steps were necessary. Finally we end with a summary of the lessons learned from these applications and predictions for the future of industrial parallel computing. This tutorial is based on material from a forthcoming book entitled: "Industrial Strength Parallel Computing" to be published by Morgan Kaufmann Publishers (ISBN l-55860-54).

  9. Distinguishing serial and parallel parsing.

    PubMed

    Gibson, E; Pearlmutter, N J

    2000-03-01

    This paper discusses ways of determining whether the human parser is serial, maintaining at most one structural interpretation at each parse state, or whether it is parallel, maintaining more than one structural interpretation in at least some circumstances. We make four points. The first two counter claims made by Lewis (2000): (1) that the availability of alternative structures should not vary as a function of the disambiguating material in some ranked parallel models; and (2) that parallel models predict a slowdown during the ambiguous region for more syntactically ambiguous structures. Our other points concern potential methods for seeking experimental evidence relevant to the serial/parallel question. We discuss effects of the plausibility of a secondary structure in the ambiguous region (Pearlmutter & Mendelsohn, 1999) and suggest examining the distribution of reaction times in the disambiguating region.

  10. Address tracing for parallel machines

    NASA Technical Reports Server (NTRS)

    Stunkel, Craig B.; Janssens, Bob; Fuchs, W. Kent

    1991-01-01

    Recently implemented parallel system address-tracing methods based on several metrics are surveyed. The issues specific to collection of traces for both shared and distributed memory parallel computers are highlighted. Five general categories of address-trace collection methods are examined: hardware-captured, interrupt-based, simulation-based, altered microcode-based, and instrumented program-based traces. The problems unique to shared memory and distributed memory multiprocessors are examined separately.

  11. Parallel Algorithms for Image Analysis.

    DTIC Science & Technology

    1982-06-01

    Report documentation page (abstract not recoverable). Title: Parallel Algorithms for Image Analysis. Type of report: Technical. Performing organization report number: TR-1180. Author: Azriel Rosenfeld. Contract or grant number: AFOSR-77-3271. Keywords: image processing; image analysis; parallel processing; cellular computers.

  12. Debugging in a parallel environment

    SciTech Connect

    Wasserman, H.J.; Griffin, J.H.

    1985-01-01

    This paper describes the preliminary results of a project investigating approaches to dynamic debugging in parallel processing systems. Debugging programs in a multiprocessing environment is particularly difficult because of potential errors in synchronization of tasks, data dependencies, sharing of data among tasks, and irreproducibility of specific machine instruction sequences from one job to the next. The basic methodology involved in predicate-based debuggers is given as well as other desirable features of dynamic parallel debugging. 13 refs.

  13. Efficiency of parallel direct optimization

    NASA Technical Reports Server (NTRS)

    Janies, D. A.; Wheeler, W. C.

    2001-01-01

    Tremendous progress has been made at the level of sequential computation in phylogenetics. However, little attention has been paid to parallel computation. Parallel computing is particularly suited to phylogenetics because of the many ways large computational problems can be broken into parts that can be analyzed concurrently. In this paper, we investigate the scaling factors and efficiency of random addition and tree refinement strategies using the direct optimization software, POY, on a small (10 slave processors) and a large (256 slave processors) cluster of networked PCs running LINUX. These algorithms were tested on several data sets composed of DNA and morphology ranging from 40 to 500 taxa. Various algorithms in POY show fundamentally different properties within and between clusters. All algorithms are efficient on the small cluster for the 40-taxon data set. On the large cluster, multibuilding exhibits excellent parallel efficiency, whereas parallel building is inefficient. These results are independent of data set size. Branch swapping in parallel shows excellent speed-up for 16 slave processors on the large cluster. However, there is no appreciable speed-up for branch swapping with the further addition of slave processors (>16). This result is independent of data set size. Ratcheting in parallel is efficient with the addition of up to 32 processors in the large cluster. This result is independent of data set size. © 2001 The Willi Hennig Society.
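
    The terms speed-up and parallel efficiency used in this abstract follow the usual definitions: speed-up is serial time divided by parallel time, and efficiency is speed-up divided by the number of processors. The sketch below uses made-up timings, not data from the paper, to show how efficiency falls when adding processors no longer shortens the run.

```python
def speedup(t_serial, t_parallel):
    """Speed-up S = T_serial / T_parallel."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_processors):
    """Parallel efficiency E = S / p, where 1.0 means ideal scaling."""
    return speedup(t_serial, t_parallel) / n_processors

# Hypothetical timings in seconds (assumed for illustration only):
# the run takes 10 s on 16 processors and is no faster on 32.
e16 = efficiency(160.0, 10.0, 16)
e32 = efficiency(160.0, 10.0, 32)
```

    With these assumed numbers, efficiency is ideal at 16 processors and halves at 32, the pattern the abstract reports for branch swapping beyond 16 slave processors.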

  14. Architectures for reasoning in parallel

    NASA Technical Reports Server (NTRS)

    Hall, Lawrence O.

    1989-01-01

    The research conducted has dealt with rule-based expert systems. The algorithms that may lead to effective parallelization of them were investigated. Both the forward and backward chained control paradigms were investigated in the course of this work. The best computer architecture for the developed and investigated algorithms has been researched. Two experimental vehicles were developed to facilitate this research. They are Backpac, a parallel backward chained rule-based reasoning system, and Datapac, a parallel forward chained rule-based reasoning system. Both systems have been written in Multilisp, a version of Lisp which contains the parallel construct, future. Applying the future function to a function causes the function to become a task parallel to the spawning task. Additionally, Backpac and Datapac have been run on several disparate parallel processors. The machines are an Encore Multimax with 10 processors, the Concert Multiprocessor with 64 processors, and a 32 processor BBN GP1000. Both the Concert and the GP1000 are switch-based machines. The Multimax has all its processors hung off a common bus. All are shared memory machines, but have different schemes for sharing the memory and different locales for the shared memory. The main results of the investigations come from experiments on the 10 processor Encore and the Concert with partitions of 32 or fewer processors. Additionally, experiments have been run with a stripped down version of EMYCIN.
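
    The future construct described above has a loose analogue in Python's standard `concurrent.futures` module, sketched here. This is not Backpac or Datapac code: `prove_goal` is a hypothetical stand-in for evaluating one subgoal of a rule, and the worker count is an assumed value.

```python
from concurrent.futures import ThreadPoolExecutor

def prove_goal(goal):
    """Hypothetical stand-in for proving one subgoal; returns (goal, result)."""
    return goal, len(goal) % 2 == 0

def prove_all(goals, max_workers=4):
    """Spawn each subgoal as a task running alongside the spawning task."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately with a future, as in Multilisp
        futures = [pool.submit(prove_goal, g) for g in goals]
        # result() blocks only when the value is actually needed
        return [f.result() for f in futures]

results = prove_all(["ab", "abc"])
```

    As in Multilisp, the spawning task continues past each `submit` call and synchronizes only when it asks for a result.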

  15. Efficiency of parallel direct optimization.

    PubMed

    Janies, D A; Wheeler, W C

    2001-03-01

    Tremendous progress has been made at the level of sequential computation in phylogenetics. However, little attention has been paid to parallel computation. Parallel computing is particularly suited to phylogenetics because of the many ways large computational problems can be broken into parts that can be analyzed concurrently. In this paper, we investigate the scaling factors and efficiency of random addition and tree refinement strategies using the direct optimization software, POY, on a small (10 slave processors) and a large (256 slave processors) cluster of networked PCs running LINUX. These algorithms were tested on several data sets composed of DNA and morphology ranging from 40 to 500 taxa. Various algorithms in POY show fundamentally different properties within and between clusters. All algorithms are efficient on the small cluster for the 40-taxon data set. On the large cluster, multibuilding exhibits excellent parallel efficiency, whereas parallel building is inefficient. These results are independent of data set size. Branch swapping in parallel shows excellent speed-up for 16 slave processors on the large cluster. However, there is no appreciable speed-up for branch swapping with the further addition of slave processors (>16). This result is independent of data set size. Ratcheting in parallel is efficient with the addition of up to 32 processors in the large cluster. This result is independent of data set size.

  16. The science of computing - The evolution of parallel processing

    NASA Technical Reports Server (NTRS)

    Denning, P. J.

    1985-01-01

    The present paper is concerned with the approaches to be employed to overcome the set of limitations in software technology which impedes currently an effective use of parallel hardware technology. The process required to solve the arising problems is found to involve four different stages. At the present time, Stage One is nearly finished, while Stage Two is under way. Tentative explorations are beginning on Stage Three, and Stage Four is more distant. In Stage One, parallelism is introduced into the hardware of a single computer, which consists of one or more processors, a main storage system, a secondary storage system, and various peripheral devices. In Stage Two, parallel execution of cooperating programs on different machines becomes explicit, while in Stage Three, new languages will make parallelism implicit. In Stage Four, there will be very high level user interfaces capable of interacting with scientists at the same level of abstraction as scientists do with each other.

  17. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J; Blocksome, Michael E; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  18. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-08-12

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  19. Parallel stochastic systems biology in the cloud.

    PubMed

    Aldinucci, Marco; Torquati, Massimo; Spampinato, Concetto; Drocco, Maurizio; Misale, Claudia; Calcagno, Cristina; Coppo, Mario

    2014-09-01

    The stochastic modelling of biological systems, coupled with Monte Carlo simulation of models, is an increasingly popular technique in bioinformatics. The simulation-analysis workflow can be computationally expensive, reducing the interactivity required for model tuning. In this work, we advocate high-level software design as a vehicle for building efficient and portable parallel simulators for the cloud. In particular, the Calculus of Wrapped Components (CWC) simulator for systems biology, which is designed according to the FastFlow pattern-based approach, is presented and discussed. Thanks to the FastFlow framework, the CWC simulator is designed as a high-level workflow that can simulate CWC models, merge simulation results and statistically analyse them in a single parallel workflow in the cloud. To improve interactivity, successive phases are pipelined in such a way that the workflow begins to output a stream of analysis results immediately after simulation is started. Performance and effectiveness of the CWC simulator are validated on the Amazon Elastic Compute Cloud.

  20. The economics of parallel trade.

    PubMed

    Danzon, P M

    1998-03-01

    The potential for parallel trade in the European Union (EU) has grown with the accession of low price countries and the harmonisation of registration requirements. Parallel trade implies a conflict between the principle of autonomy of member states to set their own pharmaceutical prices, the principle of free trade and the industrial policy goal of promoting innovative research and development (R&D). Parallel trade in pharmaceuticals does not yield the normal efficiency gains from trade because countries achieve low pharmaceutical prices by aggressive regulation, not through superior efficiency. In fact, parallel trade reduces economic welfare by undermining price differentials between markets. Pharmaceutical R&D is a global joint cost of serving all consumers worldwide; it accounts for roughly 30% of total costs. Optimal (welfare maximising) pricing to cover joint costs (Ramsey pricing) requires setting different prices in different markets, based on inverse demand elasticities. By contrast, parallel trade and regulation based on international price comparisons tend to force price convergence across markets. In response, manufacturers attempt to set a uniform 'euro' price. The primary losers from 'euro' pricing will be consumers in low income countries who will face higher prices or loss of access to new drugs. In the long run, even higher income countries are likely to be worse off with uniform prices, because fewer drugs will be developed. One policy option to preserve price differentials is to exempt on-patent products from parallel trade. An alternative is confidential contracting between individual manufacturers and governments to provide country-specific ex post discounts from the single 'euro' wholesale price, similar to rebates used by managed care in the US. This would preserve differentials in transactions prices even if parallel trade forces convergence of wholesale prices.
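
    The inverse-elasticity rule behind Ramsey pricing, mentioned above, can be written in its textbook form. This is a sketch that assumes independent demands across markets; the symbols (price p_i, marginal cost c_i, demand elasticity in absolute value \varepsilon_i, and the multiplier \lambda on the break-even constraint) are introduced here for illustration and do not appear in the abstract.

```latex
\frac{p_i - c_i}{p_i} \;=\; \frac{\lambda}{1+\lambda}\cdot\frac{1}{\varepsilon_i},
\qquad i = 1, \dots, n
```

    The mark-up over marginal cost in market i is therefore larger where demand is less elastic. This is why forcing a single uniform price across markets departs from the welfare-maximising way of covering the joint R&D cost.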

  1. Parallel Implicit Algorithms for CFD

    NASA Technical Reports Server (NTRS)

    Keyes, David E.

    1998-01-01

    The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD). "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in the Message Passing Interface (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSc library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSc during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSc framework.

  2. Interacting faults

    NASA Astrophysics Data System (ADS)

    Peacock, D. C. P.; Nixon, C. W.; Rotevatn, A.; Sanderson, D. J.; Zuluaga, L. F.

    2017-04-01

    The way that faults interact with each other controls fault geometries, displacements and strains. Faults rarely occur individually but as sets or networks, with the arrangement of these faults producing a variety of different fault interactions. Fault interactions are characterised in terms of the following: 1) Geometry - the spatial arrangement of the faults. Interacting faults may or may not be geometrically linked (i.e. physically connected), when fault planes share an intersection line. 2) Kinematics - the displacement distributions of the interacting faults and whether the displacement directions are parallel, perpendicular or oblique to the intersection line. Interacting faults may or may not be kinematically linked, where the displacements, stresses and strains of one fault influence those of the other. 3) Displacement and strain in the interaction zone - whether the faults have the same or opposite displacement directions, and if extension or contraction dominates in the acute bisector between the faults. 4) Chronology - the relative ages of the faults. This characterisation scheme is used to suggest a classification for interacting faults. Different types of interaction are illustrated using metre-scale faults from the Mesozoic rocks of Somerset and examples from the literature.

  3. A parallel Jacobson-Oksman optimization algorithm. [parallel processing (computers)]

    NASA Technical Reports Server (NTRS)

    Straeter, T. A.; Markos, A. T.

    1975-01-01

    A gradient-dependent optimization technique which exploits the vector-streaming or parallel-computing capabilities of some modern computers is presented. The algorithm, derived by assuming that the function to be minimized is homogeneous, is a modification of the Jacobson-Oksman serial minimization method. In addition to describing the algorithm, conditions ensuring the convergence of the iterates of the algorithm and the results of numerical experiments on a group of sample test functions are presented. The results of these experiments indicate that this algorithm will solve optimization problems in less computing time than conventional serial methods on machines having vector-streaming or parallel-computing capabilities.

  4. Final Report of the Center of Excellence in Rotary Technology at Rensselaer Polytechnic Institute

    DTIC Science & Technology

    1988-04-15

    ...Correlation of Theory and Experiment...Unsteady Rotor Aerodynamic Coefficients in Forward Flight...Coefficient for a Close Encounter and Comparisons with Experiments...Blade-Vortex Interaction Regions

  5. Parallelizing Timed Petri Net simulations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1993-01-01

    The possibility of using parallel processing to accelerate the simulation of Timed Petri Nets (TPN's) was studied. It was recognized that complex system development tools often transform system descriptions into TPN's or TPN-like models, which are then simulated to obtain information about system behavior. Viewed this way, it was important that the parallelization of TPN's be as automatic as possible, to admit the possibility of the parallelization being embedded in the system design tool. Later years of the grant were devoted to examining the problem of joint performance and reliability analysis, to explore whether both types of analysis could be accomplished within a single framework. In this final report, the results of our studies are summarized. We believe that the problem of parallelizing TPN's automatically for MIMD architectures has been almost completely solved for a large and important class of problems. Our initial investigations into joint performance/reliability analysis are two-fold; it was shown that Monte Carlo simulation, with importance sampling, offers promise of joint analysis in the context of a single tool, and methods for the parallel simulation of general Continuous Time Markov Chains, a model framework within which joint performance/reliability models can be cast, were developed. However, very much more work is needed to determine the scope and generality of these approaches. The results obtained in our two studies, future directions for this type of work, and a list of publications are included.

  6. Simultaneous Glycan-Peptide Characterization Using Hydrophilic Interaction Chromatography and Parallel Fragmentation by CID, Higher Energy Collisional Dissociation, and Electron Transfer Dissociation MS Applied to the N-Linked Glycoproteome of Campylobacter jejuni

    PubMed Central

    Scott, Nichollas E.; Parker, Benjamin L.; Connolly, Angela M.; Paulech, Jana; Edwards, Alistair V. G.; Crossett, Ben; Falconer, Linda; Kolarich, Daniel; Djordjevic, Steven P.; Højrup, Peter; Packer, Nicolle H.; Larsen, Martin R.; Cordwell, Stuart J.

    2011-01-01

    Campylobacter jejuni is a gastrointestinal pathogen that is able to modify membrane and periplasmic proteins by the N-linked addition of a 7-residue glycan at the strict attachment motif (D/E)XNX(S/T). Strategies for a comprehensive analysis of the targets of glycosylation, however, are hampered by the resistance of the glycan-peptide bond to enzymatic digestion or β-elimination and have previously concentrated on soluble glycoproteins compatible with lectin affinity and gel-based approaches. We developed strategies for enriching C. jejuni HB93-13 glycopeptides using zwitterionic hydrophilic interaction chromatography and examined novel fragmentation, including collision-induced dissociation (CID) and higher energy collisional (C-trap) dissociation (HCD) as well as CID/electron transfer dissociation (ETD) mass spectrometry. CID/HCD enabled the identification of glycan structure and peptide backbone, allowing glycopeptide identification, whereas CID/ETD enabled the elucidation of glycosylation sites by maintaining the glycan-peptide linkage. A total of 130 glycopeptides, representing 75 glycosylation sites, were identified from LC-MS/MS using zwitterionic hydrophilic interaction chromatography coupled to CID/HCD and CID/ETD. CID/HCD provided the majority of the identifications (73 sites) compared with ETD (26 sites). We also examined soluble glycoproteins by soybean agglutinin affinity and two-dimensional electrophoresis and identified a further six glycosylation sites. This study more than doubles the number of confirmed N-linked glycosylation sites in C. jejuni and is the first to utilize HCD fragmentation for glycopeptide identification with intact glycan. We also show that hydrophobic integral membrane proteins are significant targets of glycosylation in this organism. Our data demonstrate that peptide-centric approaches coupled to novel mass spectrometric fragmentation techniques may be suitable for application to eukaryotic glycoproteins for simultaneous

  7. Simultaneous glycan-peptide characterization using hydrophilic interaction chromatography and parallel fragmentation by CID, higher energy collisional dissociation, and electron transfer dissociation MS applied to the N-linked glycoproteome of Campylobacter jejuni.

    PubMed

    Scott, Nichollas E; Parker, Benjamin L; Connolly, Angela M; Paulech, Jana; Edwards, Alistair V G; Crossett, Ben; Falconer, Linda; Kolarich, Daniel; Djordjevic, Steven P; Højrup, Peter; Packer, Nicolle H; Larsen, Martin R; Cordwell, Stuart J

    2011-02-01

    Campylobacter jejuni is a gastrointestinal pathogen that is able to modify membrane and periplasmic proteins by the N-linked addition of a 7-residue glycan at the strict attachment motif (D/E)XNX(S/T). Strategies for a comprehensive analysis of the targets of glycosylation, however, are hampered by the resistance of the glycan-peptide bond to enzymatic digestion or β-elimination and have previously concentrated on soluble glycoproteins compatible with lectin affinity and gel-based approaches. We developed strategies for enriching C. jejuni HB93-13 glycopeptides using zwitterionic hydrophilic interaction chromatography and examined novel fragmentation, including collision-induced dissociation (CID) and higher energy collisional (C-trap) dissociation (HCD) as well as CID/electron transfer dissociation (ETD) mass spectrometry. CID/HCD enabled the identification of glycan structure and peptide backbone, allowing glycopeptide identification, whereas CID/ETD enabled the elucidation of glycosylation sites by maintaining the glycan-peptide linkage. A total of 130 glycopeptides, representing 75 glycosylation sites, were identified from LC-MS/MS using zwitterionic hydrophilic interaction chromatography coupled to CID/HCD and CID/ETD. CID/HCD provided the majority of the identifications (73 sites) compared with ETD (26 sites). We also examined soluble glycoproteins by soybean agglutinin affinity and two-dimensional electrophoresis and identified a further six glycosylation sites. This study more than doubles the number of confirmed N-linked glycosylation sites in C. jejuni and is the first to utilize HCD fragmentation for glycopeptide identification with intact glycan. We also show that hydrophobic integral membrane proteins are significant targets of glycosylation in this organism. Our data demonstrate that peptide-centric approaches coupled to novel mass spectrometric fragmentation techniques may be suitable for application to eukaryotic glycoproteins for simultaneous

  8. Visualizing Parallel Computer System Performance

    NASA Technical Reports Server (NTRS)

    Malony, Allen D.; Reed, Daniel A.

    1988-01-01

    Parallel computer systems are among the most complex of man's creations, making satisfactory performance characterization difficult. Despite this complexity, there are strong, indeed, almost irresistible, incentives to quantify parallel system performance using a single metric. The fallacy lies in succumbing to such temptations. A complete performance characterization requires not only an analysis of the system's constituent levels, it also requires both static and dynamic characterizations. Static or average behavior analysis may mask transients that dramatically alter system performance. Although the human visual system is remarkably adept at interpreting and identifying anomalies in false color data, the importance of dynamic, visual scientific data presentation has only recently been recognized. Large, complex parallel systems pose equally vexing performance interpretation problems. Data from hardware and software performance monitors must be presented in ways that emphasize important events while eliding irrelevant details. Design approaches and tools for performance visualization are the subject of this paper.

  9. Features in Continuous Parallel Coordinates.

    PubMed

    Lehmann, Dirk J; Theisel, Holger

    2011-12-01

    Continuous Parallel Coordinates (CPC) are a contemporary visualization technique for combining several scalar fields given over a common domain. They facilitate a continuous view for parallel coordinates by considering a smooth scalar field instead of a finite number of straight lines. We show that there are feature curves in CPC which appear to be the dominant structures of a CPC. We present methods to extract and classify them and demonstrate their usefulness to enhance the visualization of CPCs. In particular, we show that these feature curves are related to discontinuities in Continuous Scatterplots (CSP). We show this by exploiting a curve-curve duality between parallel and Cartesian coordinates, which is a generalization of the well-known point-line duality. Furthermore, we illustrate the theoretical considerations. Finally, we discuss relations and aspects of the CPC and CSP features with respect to data analysis.

  10. PARAVT: Parallel Voronoi tessellation code

    NASA Astrophysics Data System (ADS)

    González, R. E.

    2016-10-01

    In this study, we present a new open source code for massively parallel computation of Voronoi tessellations (VT hereafter) in large data sets. The code is aimed at astrophysical applications, where VT densities and neighbors are widely used. There are several serial Voronoi tessellation codes; however, no open source parallel implementations are available to handle the large number of particles/galaxies in current N-body simulations and sky surveys. Parallelization is implemented with MPI, and the VT is computed using the Qhull library. Domain decomposition takes into account consistent boundary computation between tasks, and includes periodic conditions. In addition, the code computes the neighbor list, Voronoi density, Voronoi cell volume, and density gradient for each particle, as well as densities on a regular grid. Code implementation and user guide are publicly available at https://github.com/regonzar/paravt.

  11. Parallel integrated frame synchronizer chip

    NASA Technical Reports Server (NTRS)

    Ghuman, Parminder Singh (Inventor); Solomon, Jeffrey Michael (Inventor); Bennett, Toby Dennis (Inventor)

    2000-01-01

    A parallel integrated frame synchronizer which implements a sequential pipeline process wherein serial data in the form of telemetry data or weather satellite data enters the synchronizer by means of a front-end subsystem and passes to a parallel correlator subsystem or a weather satellite data processing subsystem. When in a CCSDS mode, data from the parallel correlator subsystem passes through a window subsystem, then to a data alignment subsystem and then to a bit transition density (BTD)/cyclical redundancy check (CRC) decoding subsystem. Data from the BTD/CRC decoding subsystem or data from the weather satellite data processing subsystem is then fed to an output subsystem where it is output from a data output port.

  12. Fast data parallel polygon rendering

    SciTech Connect

    Ortega, F.A.; Hansen, C.D.

    1993-09-01

    This paper describes a parallel method for polygonal rendering on a massively parallel SIMD machine. This method, based on a simple shading model, is targeted at applications which require very fast polygon rendering for extremely large sets of polygons, such as are found in many scientific visualization applications. The algorithms described in this paper are incorporated into a library of 3D graphics routines written for the Connection Machine. The routines are implemented on both the CM-200 and the CM-5. This library enables scientists to display 3D shaded polygons directly from a parallel machine without the need to transmit huge amounts of data to a post-processing rendering system.

  13. Massively Parallel MRI Detector Arrays

    PubMed Central

    Keil, Boris; Wald, Lawrence L

    2013-01-01

    Originally proposed as a method to increase sensitivity by extending the locally high-sensitivity of small surface coil elements to larger areas, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called “ultimate” SNR and g-factor. We also review the methods for optimally combining array data and changes in RF methodology needed to construct massively parallel MRI detector arrays and show some examples of state-of-the-art for highly accelerated imaging with the resulting highly parallel arrays. PMID:23453758

  14. Parallel Adaptive Mesh Refinement Library

    NASA Technical Reports Server (NTRS)

    Mac-Neice, Peter; Olson, Kevin

    2005-01-01

    Parallel Adaptive Mesh Refinement Library (PARAMESH) is a package of Fortran 90 subroutines designed to provide a computer programmer with an easy route to extension of (1) a previously written serial code that uses a logically Cartesian structured mesh into (2) a parallel code with adaptive mesh refinement (AMR). Alternatively, in its simplest use, and with minimal effort, PARAMESH can operate as a domain-decomposition tool for users who want to parallelize their serial codes but who do not wish to utilize adaptivity. The package builds a hierarchy of sub-grids to cover the computational domain of a given application program, with spatial resolution varying to satisfy the demands of the application. The sub-grid blocks form the nodes of a tree data structure (a quad-tree in two or an oct-tree in three dimensions). Each grid block has a logically Cartesian mesh. The package supports one-, two- and three-dimensional models.
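
    The tree of sub-grid blocks described above can be illustrated with a small two-dimensional quad-tree. This is only a sketch in Python, not PARAMESH's Fortran 90 interface: the names `Block`, `refine_where`, and `touches_origin` are hypothetical, and the refinement criterion is an arbitrary stand-in for an application's resolution demands.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One square sub-grid block; a node of the quad-tree."""
    x0: float
    y0: float
    size: float
    level: int
    children: list = field(default_factory=list)

    def refine(self):
        """Split this block into four half-size children."""
        half = self.size / 2
        self.children = [
            Block(self.x0 + i * half, self.y0 + j * half, half, self.level + 1)
            for j in (0, 1)
            for i in (0, 1)
        ]

    def leaves(self):
        """Return the leaf blocks, which together cover the domain."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def refine_where(block, needs_refinement, max_level):
    """Recursively refine blocks flagged by the criterion, up to max_level."""
    if block.level < max_level and needs_refinement(block):
        block.refine()
        for child in block.children:
            refine_where(child, needs_refinement, max_level)


def touches_origin(block):
    # Hypothetical criterion: ask for more resolution near the corner (0, 0).
    return block.x0 == 0.0 and block.y0 == 0.0


root = Block(0.0, 0.0, 1.0, 0)
refine_where(root, touches_origin, max_level=3)
leaves = root.leaves()
print(len(leaves), "leaf blocks; finest level", max(b.level for b in leaves))
```

    Each leaf would carry its own logically Cartesian mesh, so resolution varies across the domain while every block keeps the same simple structure. In three dimensions the same idea yields an oct-tree with eight children per refined block.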

  15. Hybrid parallel programming with MPI and Unified Parallel C.

    SciTech Connect

    Dinan, J.; Balaji, P.; Lusk, E.; Sadayappan, P.; Thakur, R.; Mathematics and Computer Science; The Ohio State Univ.

    2010-01-01

    The Message Passing Interface (MPI) is one of the most widely used programming models for parallel computing. However, the amount of memory available to an MPI process is limited by the amount of local memory within a compute node. Partitioned Global Address Space (PGAS) models such as Unified Parallel C (UPC) are growing in popularity because of their ability to provide a shared global address space that spans the memories of multiple compute nodes. However, taking advantage of UPC can require a large recoding effort for existing parallel applications. In this paper, we explore a new hybrid parallel programming model that combines MPI and UPC. This model allows MPI programmers incremental access to a greater amount of memory, enabling memory-constrained MPI codes to process larger data sets. In addition, the hybrid model offers UPC programmers an opportunity to create static UPC groups that are connected over MPI. As we demonstrate, the use of such groups can significantly improve the scalability of locality-constrained UPC codes. This paper presents a detailed description of the hybrid model and demonstrates its effectiveness in two applications: a random access benchmark and the Barnes-Hut cosmological simulation. Experimental results indicate that the hybrid model can greatly enhance performance; using hybrid UPC groups that span two cluster nodes, RA performance increases by a factor of 1.33 and using groups that span four cluster nodes, Barnes-Hut experiences a twofold speedup at the expense of a 2% increase in code size.

  16. Medipix2 parallel readout system

    NASA Astrophysics Data System (ADS)

    Fanti, V.; Marzeddu, R.; Randaccio, P.

    2003-08-01

    A fast parallel readout system based on a PCI board has been developed in the framework of the Medipix collaboration. The readout electronics consists of two boards: the motherboard directly interfacing the Medipix2 chip, and the PCI board with digital I/O ports 32 bits wide. The device driver and readout software have been developed at low level in Assembler to allow fast data transfer and image reconstruction. The parallel readout permits a transfer rate up to 64 Mbytes/s. http://medipix.web.cern.ch/MEDIPIX/

  17. Gang scheduling a parallel machine

    SciTech Connect

    Gorda, B.C.; Brooks, E.D. III.

    1991-03-01

    Program development on parallel machines can be a nightmare of scheduling headaches. We have developed a portable time sharing mechanism to handle the problem of scheduling gangs of processors. User programs and their gangs of processors are put to sleep and awakened by the gang scheduler to provide a time sharing environment. Time quanta are adjusted according to priority queues and a system of fair share accounting. The initial platform for this software is the 128 processor BBN TC2000 in use in the Massively Parallel Computing Initiative at the Lawrence Livermore National Laboratory. 2 refs., 1 fig.

  18. Gang scheduling a parallel machine

    SciTech Connect

    Gorda, B.C.; Brooks, E.D. III.

    1991-12-01

    Program development on parallel machines can be a nightmare of scheduling headaches. We have developed a portable time sharing mechanism to handle the problem of scheduling gangs of processes. User programs and their gangs of processes are put to sleep and awakened by the gang scheduler to provide a time sharing environment. Time quanta are adjusted according to priority queues and a system of fair share accounting. The initial platform for this software is the 128 processor BBN TC2000 in use in the Massively Parallel Computing Initiative at the Lawrence Livermore National Laboratory.

  19. The Complexity of Parallel Algorithms,

    DTIC Science & Technology

    1985-11-01

    Much of this work was done in collaboration with my advisor, Ernst Mayr. He was also supported in part by ONR contract N00014-85-C-0731. Table...Helmbold and Mayr in their algorithm to compute an optimal two processor schedule [HM2]. One of the promising developments in parallel algorithms is that...can be solved by a fast parallel algorithm if the numbers are small. Helmbold and Mayr [HM1] have shown that, if the job times are

  20. A Comparison of Automatic Parallelization Tools/Compilers on the SGI Origin 2000 Using the NAS Benchmarks

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Frumkin, Michael; Hribar, Michelle; Jin, Hao-Qiang; Waheed, Abdul; Yan, Jerry

    1998-01-01

    Porting applications to new high performance parallel and distributed computing platforms is a challenging task. Since writing parallel code by hand is extremely time consuming and costly, porting codes would ideally be automated by using some parallelization tools and compilers. In this paper, we compare the performance of the hand-written NAS Parallel Benchmarks against three parallel versions generated with the help of tools and compilers: 1) CAPTools: an interactive computer aided parallelization tool that generates message passing code, 2) the Portland Group's HPF compiler and 3) using compiler directives with the native FORTRAN77 compiler on the SGI Origin2000.

  1. File concepts for parallel I/O

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1989-01-01

    The subject of input/output (I/O) was often neglected in the design of parallel computer systems, although for many problems I/O rates will limit the speedup attainable. The I/O problem is addressed by considering the role of files in parallel systems. The notion of parallel files is introduced. Parallel files provide for concurrent access by multiple processes, and utilize parallelism in the I/O system to improve performance. Parallel files can also be used conventionally by sequential programs. A set of standard parallel file organizations is proposed, and organizations using multiple storage devices are suggested. Problem areas are also identified and discussed.

  2. Parallel Performance Optimization of the Direct Simulation Monte Carlo Method

    NASA Astrophysics Data System (ADS)

    Gao, Da; Zhang, Chonglin; Schwartzentruber, Thomas

    2009-11-01

    Although the direct simulation Monte Carlo (DSMC) particle method is more computationally intensive compared to continuum methods, it is accurate for conditions ranging from continuum to free-molecular, accurate in highly non-equilibrium flow regions, and holds potential for incorporating advanced molecular-based models for gas-phase and gas-surface interactions. As available computer resources continue their rapid growth, the DSMC method is continually being applied to increasingly complex flow problems. Although processor clock speed continues to increase, a trend of increasing multi-core-per-node parallel architectures is emerging. To effectively utilize such current and future parallel computing systems, a combined shared/distributed memory parallel implementation (using both Open Multi-Processing (OpenMP) and Message Passing Interface (MPI)) of the DSMC method is under development. The parallel implementation of a new state-of-the-art 3D DSMC code employing an embedded 3-level Cartesian mesh will be outlined. The presentation will focus on performance optimization strategies for DSMC, which includes, but is not limited to, modified algorithm designs, practical code-tuning techniques, and parallel performance optimization. Specifically, key issues important to the DSMC shared memory (OpenMP) parallel performance are identified as (1) granularity (2) load balancing (3) locality and (4) synchronization. Challenges and solutions associated with these issues as they pertain to the DSMC method will be discussed.

  3. Matpar: Parallel Extensions for MATLAB

    NASA Technical Reports Server (NTRS)

    Springer, P. L.

    1998-01-01

    Matpar is a set of client/server software that allows a MATLAB user to take advantage of a parallel computer for very large problems. The user can replace calls to certain built-in MATLAB functions with calls to Matpar functions.

  4. Parallel, Distributed Scripting with Python

    SciTech Connect

    Miller, P J

    2002-05-24

    Parallel computers used to be, for the most part, one-of-a-kind systems which were extremely difficult to program portably. With SMP architectures, the advent of the POSIX thread API and OpenMP gave developers ways to portably exploit on-the-box shared memory parallelism. Since these architectures didn't scale cost-effectively, distributed memory clusters were developed. The associated MPI message passing libraries gave these systems a portable paradigm too. Having programmers effectively use this paradigm is a somewhat different question. Distributed data has to be explicitly transported via the messaging system in order for it to be useful. In high level languages, the MPI library gives access to data distribution routines in C, C++, and FORTRAN. But we need more than that. Many reasonable and common tasks are best done in (or as extensions to) scripting languages. Consider sysadm tools such as password crackers, file purgers, etc. These are simple to write in a scripting language such as Python (an open source, portable, and freely available interpreter). But these tasks beg to be done in parallel. Consider a password checker that checks an encrypted password against a 25,000 word dictionary. This can take around 10 seconds in Python (6 seconds in C). It is trivial to parallelize if you can distribute the information and co-ordinate the work.
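
    The dictionary check described in this abstract can be sketched as follows. This is only an illustration of the distribute-and-coordinate structure, not the author's tool: the hash function (unsalted SHA-256), the generated word list, and the names `hash_word`, `check_chunk`, `split`, and `find_password` are all assumptions made for the example, and a thread pool stands in for the MPI-style distribution the abstract has in mind.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_word(word):
    """Stand-in for the real password hash (assumption: unsalted SHA-256)."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def check_chunk(target_hash, chunk):
    """Return the first word in chunk whose hash matches, else None."""
    for word in chunk:
        if hash_word(word) == target_hash:
            return word
    return None


def split(words, parts):
    """Split words into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(words), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(words[start:end])
        start = end
    return chunks


def find_password(target_hash, words, workers=4):
    """Distribute the dictionary over workers and collect the first match."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda c: check_chunk(target_hash, c),
                           split(words, workers))
        for match in results:
            if match is not None:
                return match
    return None


# Hypothetical 25,000-word dictionary, generated for illustration only.
dictionary = ["word%05d" % i for i in range(25000)]
target = hash_word("word12345")
found = find_password(target, dictionary)
```

    Threads keep the sketch self-contained; for real speedup on CPU-bound hashing, the same split/check/collect structure would be mapped onto separate processes or MPI ranks.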

  5. Fast, Massively Parallel Data Processors

    NASA Technical Reports Server (NTRS)

    Heaton, Robert A.; Blevins, Donald W.; Davis, ED

    1994-01-01

    Proposed fast, massively parallel data processor contains 8x16 array of processing elements with efficient interconnection scheme and options for flexible local control. Processing elements communicate with each other on "X" interconnection grid with external memory via high-capacity input/output bus. This approach to conditional operation nearly doubles speed of various arithmetic operations.

  6. Optical Interferometric Parallel Data Processor

    NASA Technical Reports Server (NTRS)

    Breckinridge, J. B.

    1987-01-01

    Image data processed faster than in present electronic systems. Optical parallel-processing system effectively calculates two-dimensional Fourier transforms in time required by light to travel from plane 1 to plane 8. Coherence interferometer at plane 4 splits light into parts that form double image at plane 6 if projection screen placed there.

  7. Tutorial: Parallel Simulation on Supercomputers

    SciTech Connect

    Perumalla, Kalyan S

    2012-01-01

    This tutorial introduces typical hardware and software characteristics of extant and emerging supercomputing platforms, and presents issues and solutions in executing large-scale parallel discrete event simulation scenarios on such high performance computing systems. Covered topics include synchronization, model organization, example applications, and observed performance from illustrative large-scale runs.

  8. The physics of parallel machines

    NASA Technical Reports Server (NTRS)

    Chan, Tony F.

    1988-01-01

    The idea is considered that architectures for massively parallel computers must be designed to go beyond supporting a particular class of algorithms to supporting the underlying physical processes being modelled. Physical processes modelled by partial differential equations (PDEs) are discussed. Also discussed is the idea that an efficient architecture must go beyond nearest neighbor mesh interconnections and support global and hierarchical communications.

  9. PALM: a Parallel Dynamic Coupler

    NASA Astrophysics Data System (ADS)

    Thevenin, A.; Morel, T.

    2008-12-01

    In order to efficiently represent complex systems, numerical modeling has to rely on many physical models at a time: an ocean model coupled with an atmospheric model is at the basis of climate modeling. The continuity of the solution is granted only if these models can constantly exchange information. PALM is a coupler allowing the concurrent execution and the intercommunication of programs not having been especially designed for that. With PALM, the dynamic coupling approach is introduced: a coupled component can be launched and can release computers' resources upon termination at any moment during the simulation. In order to exploit as much as possible computers' possibilities, the PALM coupler handles two levels of parallelism. The first level concerns the components themselves. While managing the resources, PALM allocates the number of processes which are necessary to any coupled component. These models can be parallel programs based on domain decomposition with MPI or applications multithreaded with OpenMP. The second level of parallelism is a task parallelism: one can define a coupling algorithm allowing two or more programs to be executed in parallel. PALM applications are implemented via a Graphical User Interface called PrePALM. In this GUI, the programmer initially defines the coupling algorithm then he describes the actual communications between the models. PALM offers a very high flexibility for testing different coupling techniques and for reaching the best load balance in a high performance computer. The transformation of computational independent code is almost straightforward. The other qualities of PALM are its easy set-up, its flexibility, its performances, the simple updates and evolutions of the coupled application and the many side services and functions that it offers.

  10. Fast parallel Markov clustering in bioinformatics using massively parallel computing on GPU with CUDA and ELLPACK-R sparse format.

    PubMed

    Bustamam, Alhadi; Burrage, Kevin; Hamilton, Nicholas A

    2012-01-01

    Markov clustering (MCL) is becoming a key algorithm within bioinformatics for determining clusters in networks. However, with the increasingly vast amount of data on biological networks, performance and scalability issues are becoming a critical limiting factor in applications. Meanwhile, GPU computing, which uses the CUDA tool for implementing a massively parallel computing environment in the GPU card, is becoming a very powerful, efficient, and low-cost option to achieve substantial performance gains over CPU approaches. The use of on-chip memory on the GPU efficiently lowers the latency time, thus circumventing a major issue in other parallel computing environments, such as MPI. We introduce a very fast Markov clustering algorithm using CUDA (CUDA-MCL) to perform parallel sparse matrix-matrix computations and parallel sparse Markov matrix normalizations, which are at the heart of MCL. We utilized the ELLPACK-R sparse format to allow effective and fine-grain massively parallel processing to cope with the sparse nature of interaction network data sets in bioinformatics applications. As the results show, CUDA-MCL is significantly faster than the original MCL running on CPU. Thus, large-scale parallel computation on off-the-shelf desktop machines, which was previously only possible on supercomputing architectures, can significantly change the way bioinformaticians and biologists deal with their data.
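
    As a rough illustration of the ELLPACK-R format mentioned in this abstract, the sketch below stores each row's nonzeros in a padded array together with a per-row length, then uses it for a sparse matrix-vector product. It is plain sequential Python, not the CUDA-MCL code; the function names `to_ellpack_r` and `spmv` and the small test matrix are assumptions made for the example.

```python
def to_ellpack_r(dense):
    """Convert a dense row-major matrix to ELLPACK-R arrays.

    Returns (values, cols, row_len): values and cols are padded to the
    widest row, and row_len[i] is the number of real entries in row i.
    """
    row_len = [sum(1 for v in row if v != 0) for row in dense]
    width = max(row_len) if row_len else 0
    values, cols = [], []
    for row in dense:
        nz = [(v, j) for j, v in enumerate(row) if v != 0]
        pad = width - len(nz)
        values.append([v for v, _ in nz] + [0.0] * pad)
        cols.append([j for _, j in nz] + [0] * pad)
    return values, cols, row_len


def spmv(values, cols, row_len, x):
    """Compute y = A @ x for a matrix A held in ELLPACK-R form."""
    y = []
    # Rows are independent, which is what lets a GPU give each row a thread.
    for i in range(len(row_len)):
        acc = 0.0
        # row_len[i] bounds the loop, so padding entries are never touched.
        for k in range(row_len[i]):
            acc += values[i][k] * x[cols[i][k]]
        y.append(acc)
    return y


# Small illustrative matrix (not data from the paper).
dense = [
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
    [0.25, 0.25, 0.5],
]
values, cols, row_len = to_ellpack_r(dense)
y = spmv(values, cols, row_len, [1.0, 2.0, 3.0])
```

    The per-row length array is what distinguishes ELLPACK-R from plain ELLPACK: short rows stop early instead of multiplying through their padding.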

  11. Parallelism in computational chemistry: Applications in quantum and statistical mechanics

    NASA Astrophysics Data System (ADS)

    Clementi, E.; Corongiu, G.; Detrich, J. H.; Kahnmohammadbaigi, H.; Chin, S.; Domingo, L.; Laaksonen, A.; Nguyen, N. L.

    1985-08-01

    Often very fundamental biochemical and biophysical problems defy simulations because of limitations in today's computers. We present and discuss a distributed system composed of two IBM-4341 and one IBM-4381, as front-end processors, and ten FPS-164 attached array processors. This parallel system, called LCAP, presently has a peak performance of about 120 MFlops; extensions to higher performance are discussed. Presently, the system applications use a modified version of VM/SP as the operating system: description of the modifications is given. Three applications programs have migrated from sequential to parallel: a molecular quantum mechanical, a Metropolis-Monte Carlo and a Molecular Dynamics program. Descriptions of the parallel codes are briefly outlined. As examples and tests of these applications we report on a study for proton tunneling in DNA base-pairs, very relevant to spontaneous mutations in genetics. As a second example, we present a Monte Carlo study of liquid water at room temperature where not only two- and three-body interactions are considered but, for the first time, also four-body interactions are included. Finally we briefly summarize a molecular dynamics study where two- and three-body interactions have been considered. These examples, and very positive performance comparison with today's supercomputers allow us to conclude that parallel computers and programming of the type we have considered, represent a pragmatic answer to many computer intensive problems.

  12. Web based parallel/distributed medical data mining using software agents

    SciTech Connect

    Kargupta, H.; Stafford, B.; Hamzaoglu, I.

    1997-12-31

    This paper describes an experimental parallel/distributed data mining system PADMA (PArallel Data Mining Agents) that uses software agents for local data accessing and analysis and a web based interface for interactive data visualization. It also presents the results of applying PADMA for detecting patterns in unstructured texts of postmortem reports and laboratory test data for Hepatitis C patients.

  13. A brief parallel I/O tutorial.

    SciTech Connect

    Ward, H. Lee

    2010-03-01

    This document provides common best practices for the efficient utilization of parallel file systems for analysts and application developers. A multi-program, parallel supercomputer is able to provide effective compute power by aggregating a host of lower-power processors using a network. The idea, in general, is that one either constructs the application to distribute parts to the different nodes and processors available and then collects the result (a parallel application), or one launches a large number of small jobs, each doing similar work on different subsets (a campaign). The I/O system on these machines is usually implemented as a tightly-coupled, parallel application itself. It is providing the concept of a 'file' to the host applications. The 'file' is an addressable store of bytes and that address space is global in nature. In essence, it is providing a global address space. Beyond the simple reality that the I/O system is normally composed of a small, less capable, collection of hardware, that concept of a global address space will cause problems if not very carefully utilized. How much of a problem and the ways in which those problems manifest will be different, but that it is problem prone has been well established. Worse, the file system is a shared resource on the machine - a system service. What an application does when it uses the file system impacts all users. It is not the case that some portion of the available resource is reserved. Instead, the I/O system responds to requests by scheduling and queuing based on instantaneous demand. Using the system well contributes to the overall throughput on the machine. From a solely self-centered perspective, using it well reduces the time that the application or campaign is subject to impact by others. 
The developer's goal should be to accomplish I/O in a way that minimizes interaction with the I/O system, maximizes the amount of data moved per call, and provides the I/O system the most information about

  14. Task parallelism and high-performance languages

    SciTech Connect

    Foster, I.

    1996-03-01

    The definition of High Performance Fortran (HPF) is a significant event in the maturation of parallel computing: it represents the first parallel language that has gained widespread support from vendors and users. The subject of this paper is to incorporate support for task parallelism. The term task parallelism refers to the explicit creation of multiple threads of control, or tasks, which synchronize and communicate under programmer control. Task and data parallelism are complementary rather than competing programming models. While task parallelism is more general and can be used to implement algorithms that are not amenable to data-parallel solutions, many problems can benefit from a mixed approach, with for example a task-parallel coordination layer integrating multiple data-parallel computations. Other problems admit to both data- and task-parallel solutions, with the better solution depending on machine characteristics, compiler performance, or personal taste. For these reasons, we believe that a general-purpose high-performance language should integrate both task- and data-parallel constructs. The challenge is to do so in a way that provides the expressivity needed for applications, while preserving the flexibility and portability of a high-level language. In this paper, we examine and illustrate the considerations that motivate the use of task parallelism. We also describe one particular approach to task parallelism in Fortran, namely the Fortran M extensions. Finally, we contrast Fortran M with other proposed approaches and discuss the implications of this work for task parallelism and high-performance languages.

  15. A generalized parallel replica dynamics

    NASA Astrophysics Data System (ADS)

    Binder, Andrew; Lelièvre, Tony; Simpson, Gideon

    2015-03-01

    Metastability is a common obstacle to performing long molecular dynamics simulations. Many numerical methods have been proposed to overcome it. One method is parallel replica dynamics, which relies on the rapid convergence of the underlying stochastic process to a quasi-stationary distribution. Two requirements for applying parallel replica dynamics are knowledge of the time scale on which the process converges to the quasi-stationary distribution and a mechanism for generating samples from this distribution. By combining a Fleming-Viot particle system with convergence diagnostics to simultaneously identify when the process converges while also generating samples, we can address both points. This variation on the algorithm is illustrated with various numerical examples, including those with entropic barriers and the 2D Lennard-Jones cluster of seven atoms.

  16. Merlin - Massively parallel heterogeneous computing

    NASA Technical Reports Server (NTRS)

    Wittie, Larry; Maples, Creve

    1989-01-01

    Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.

  17. Parallel supercomputing with commodity components

    SciTech Connect

    Warren, M.S.; Goda, M.P.; Becker, D.J.

    1997-09-01

    We have implemented a parallel computer architecture based entirely upon commodity personal computer components. Using 16 Intel Pentium Pro microprocessors and switched fast ethernet as a communication fabric, we have obtained sustained performance on scientific applications in excess of one Gigaflop. During one production astrophysics treecode simulation, we performed 1.2 x 10^15 floating point operations (1.2 Petaflops) over a three week period, with one phase of that simulation running continuously for two weeks without interruption. We report on a variety of disk, memory and network benchmarks. We also present results from the NAS parallel benchmark suite, which indicate that this architecture is competitive with current commercial architectures. In addition, we describe some software written to support efficient message passing, as well as a Linux device driver interface to the Pentium hardware performance monitoring registers.

  18. ASP: a parallel computing technology

    NASA Astrophysics Data System (ADS)

    Lea, R. M.

    1990-09-01

    ASP modules constitute the basis of a parallel computing technology platform for the rapid development of a broad range of numeric and symbolic information processing systems. Based on off-the-shelf general-purpose hardware and software modules, ASP technology is intended to increase productivity in the development (and competitiveness in the marketing) of cost-effective low-MIMD/high-SIMD Massively Parallel Processors (MPPs). The paper discusses ASP module philosophy and demonstrates how ASP modules can satisfy the market, algorithmic, architectural, and engineering requirements of such MPPs. In particular, two specific ASP modules based on VLSI and WSI technologies are studied as case examples of ASP technology, the latter reporting 1 TOPS/ft^3, 1 GOPS/W, and 1 MOPS/$ as ball-park figures-of-merit of cost-effectiveness.

  19. Parallel processing spacecraft communication system

    NASA Technical Reports Server (NTRS)

    Bolotin, Gary S. (Inventor); Donaldson, James A. (Inventor); Luong, Huy H. (Inventor); Wood, Steven H. (Inventor)

    1998-01-01

    An uplink controlling assembly speeds data processing using a special parallel codeblock technique. A correct start sequence initiates processing of a frame. Two possible start sequences can be used; and the one which is used determines whether data polarity is inverted or non-inverted. Processing continues until uncorrectable errors are found. The frame ends by intentionally sending a block with an uncorrectable error. Each of the codeblocks in the frame has a channel ID. Each channel ID can be separately processed in parallel. This obviates the problem of waiting for error correction processing. If that channel number is zero, however, it indicates that the frame of data represents a critical command only. That data is handled in a special way, independent of the software. Otherwise, the processed data is further handled using special double buffering techniques to avoid problems from overrun. When overrun does occur, the system takes action to lose only the oldest data.

  20. A generalized parallel replica dynamics

    SciTech Connect

    Binder, Andrew; Lelièvre, Tony; Simpson, Gideon

    2015-03-01

    Metastability is a common obstacle to performing long molecular dynamics simulations. Many numerical methods have been proposed to overcome it. One method is parallel replica dynamics, which relies on the rapid convergence of the underlying stochastic process to a quasi-stationary distribution. Two requirements for applying parallel replica dynamics are knowledge of the time scale on which the process converges to the quasi-stationary distribution and a mechanism for generating samples from this distribution. By combining a Fleming–Viot particle system with convergence diagnostics to simultaneously identify when the process converges while also generating samples, we can address both points. This variation on the algorithm is illustrated with various numerical examples, including those with entropic barriers and the 2D Lennard-Jones cluster of seven atoms.

  1. Parallel supercomputing with commodity components

    NASA Technical Reports Server (NTRS)

    Warren, M. S.; Goda, M. P.; Becker, D. J.

    1997-01-01

    We have implemented a parallel computer architecture based entirely upon commodity personal computer components. Using 16 Intel Pentium Pro microprocessors and switched fast ethernet as a communication fabric, we have obtained sustained performance on scientific applications in excess of one Gigaflop. During one production astrophysics treecode simulation, we performed 1.2 x 10^15 floating point operations (1.2 Petaflops) over a three week period, with one phase of that simulation running continuously for two weeks without interruption. We report on a variety of disk, memory and network benchmarks. We also present results from the NAS parallel benchmark suite, which indicate that this architecture is competitive with current commercial architectures. In addition, we describe some software written to support efficient message passing, as well as a Linux device driver interface to the Pentium hardware performance monitoring registers.

  2. Parallel multiplex laser feedback interferometry

    SciTech Connect

    Zhang, Song; Tan, Yidong; Zhang, Shulian

    2013-12-15

    We present a parallel multiplex laser feedback interferometer based on spatial multiplexing which avoids the signal crosstalk in the former feedback interferometer. The interferometer outputs two close parallel laser beams, whose frequencies are shifted by two acousto-optic modulators by 2Ω simultaneously. A static reference mirror is inserted into one of the optical paths as the reference optical path. The other beam impinges on the target as the measurement optical path. Phase variations of the two feedback laser beams are simultaneously measured through heterodyne demodulation with two different detectors. Their subtraction accurately reflects the target displacement. Under typical room conditions, experimental results show a resolution of 1.6 nm and accuracy of 7.8 nm within the range of 100 μm.

  3. Parallelism in Manipulator Dynamics. Revision.

    DTIC Science & Technology

    1983-12-01

    excessive, and a VLSI implementation architecture is suggested. We indicate possible applications to incorporating dynamical considerations into...Inverse Dynamics problem. It investigates the high degree of parallelism inherent in the computations, and presents two "mathematically exact" formulations...0. OVERVIEW The Inverse Dynamics problem consists (loosely) of computing the motor torques necessary to

  4. Parallel Symmetric Eigenvalue Problem Solvers

    DTIC Science & Technology

    2015-05-01

    graduate school. Debugging somebody else’s MPI code is an immensely frustrating experience, but he would regularly stay late at the office to assist me...cessfully. In addition, I will describe the parallel kernels required by my code. The next sections will describe my Fortran-based implementations of...Sandia’s publicly available TraceMin code. Each of the methods has its own unique advantages and disadvantages, summarized in table 3.1. In short, I

  5. Parallel Algorithms for Computer Vision.

    DTIC Science & Technology

    1987-01-01

    PARALLEL ALGORITHMS FOR COMPUTER VISION (U) MASSACHUSETTS INST OF TECH CAMBRIDGE T POGGIO ET AL. JAN 87 ETL-0456 DACA7-05-C-8IIO...regularization principles, such as edge detection, stereo, motion, surface interpolation and shape from shading. The basic members of class I are convolution...them in collaboration with Thinking Machines Corporation): * Parallel convolution * Zero-crossing detection * Stereo-matching * Surface reconstruction

  6. Lightweight Specifications for Parallel Correctness

    DTIC Science & Technology

    2012-12-05

    ...George Necula Professor David Wessel Fall 2012 1 Abstract Lightweight Specifications for Parallel Correctness by Jacob Samuels Burnim Doctor of Philosophy...enthusiasm and endless flow of ideas, and for his keen research sense. I would also like to thank George Necula for chairing my qualifying exam committee and

  7. National Combustion Code: Parallel Performance

    NASA Technical Reports Server (NTRS)

    Babrauckas, Theresa

    2001-01-01

    This report discusses the National Combustion Code (NCC). The NCC is an integrated system of codes for the design and analysis of combustion systems. The advanced features of the NCC meet designers' requirements for model accuracy and turn-around time. The fundamental features at the inception of the NCC were parallel processing and unstructured mesh. The design and performance of the NCC are discussed.

  8. Parallel Algorithms for Computer Vision.

    DTIC Science & Technology

    1989-01-01

    demonstrated the Vision Machine system processing images and recognizing objects through the integration of several visual cues. The first version of the...achievements. 2.1 The Vision Machine The overall organization of the Vision Machine system is based on parallel processing of the images by independent...smoothed and made dense by exploiting known constraints within each process (for example, that disparity is smooth). This is the stage of approximation

  9. Parallel strategies for SAR processing

    NASA Astrophysics Data System (ADS)

    Segoviano, Jesus A.

    2004-12-01

    This article proposes a series of strategies for improving the computer processing of Synthetic Aperture Radar (SAR) signals, following the three usual lines of action for speeding up the execution of any computer program. On the one hand, the optimization of both the data structures and the application architecture is studied. On the other hand, a hardware improvement is considered. For the former, the article studies both the data structures usually employed in SAR processing, proposing the use of parallel ones, and the way the parallelization of the algorithms employed in the process is implemented. In addition, the parallel application architecture classifies processes as fine or coarse grain. These are assigned to individual processors or divided among processors, each in its corresponding architecture. For the latter, the hardware employed for parallel SAR processing is studied. The improvement here refers to the several kinds of platforms on which the SAR process is implemented: shared memory multicomputers and distributed memory multiprocessors. A comparison between them gives some guidelines for obtaining maximum throughput with minimum latency and maximum effectiveness with minimum cost, all with limited complexity. It is concluded that processing the algorithms in a GNU/Linux environment on a Beowulf cluster platform offers, under certain conditions, the best compromise between performance and cost, and promises the greatest future development for computing-power-hungry SAR applications in the coming years.

  10. Parallel Power Grid Simulation Toolkit

    SciTech Connect

    Smith, Steve; Kelley, Brian; Banks, Lawrence; Top, Philip; Woodward, Carol

    2015-09-14

    ParGrid is a 'wrapper' that integrates a coupled Power Grid Simulation toolkit consisting of a library to manage the synchronization and communication of independent simulations. The included library code in ParGrid, named FSKIT, is intended to support the coupling of multiple continuous and discrete event parallel simulations. The code is designed using modern object oriented C++ methods utilizing C++11 and current Boost libraries to ensure compatibility with multiple operating systems and environments.

  11. Parallel processing of genomics data

    NASA Astrophysics Data System (ADS)

    Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-10-01

    The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, has made it possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per single experiment, thus the analysis of this enormous flow of data poses several challenges in terms of data storage, preprocessing, and analysis. To face those issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the parallel preprocessing and statistical analysis of genomics data, able to face high dimension of data and resulting in good response time. The proposed system is able to find statistically significant biological markers able to discriminate classes of patients that respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.

  12. Parallelism in integrated fluidic circuits

    NASA Astrophysics Data System (ADS)

    Bousse, Luc J.; Kopf-Sill, Anne R.; Parce, J. W.

    1998-04-01

    Many research groups around the world are working on integrated microfluidics. The goal of these projects is to automate and integrate the handling of liquid samples and reagents for measurement and assay procedures in chemistry and biology. Ultimately, it is hoped that this will lead to a revolution in chemical and biological procedures similar to that caused in electronics by the invention of the integrated circuit. The optimal size scale of channels for liquid flow is determined by basic constraints to be somewhere between 10 and 100 micrometers. In larger channels, mixing by diffusion takes too long; in smaller channels, the number of molecules present is so low it makes detection difficult. At Caliper, we are making fluidic systems in glass chips with channels in this size range, based on electroosmotic flow, and fluorescence detection. One application of this technology is rapid assays for drug screening, such as enzyme assays and binding assays. A further challenge in this area is to perform multiple functions on a chip in parallel, without a large increase in the number of inputs and outputs. A first step in this direction is a fluidic serial-to-parallel converter. Fluidic circuits will be shown with the ability to distribute an incoming serial sample stream to multiple parallel channels.

  13. Highly parallel sparse Cholesky factorization

    NASA Technical Reports Server (NTRS)

    Gilbert, John R.; Schreiber, Robert

    1990-01-01

    Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.

  14. Parallel Environment for Quantum Computing

    NASA Astrophysics Data System (ADS)

    Tabakin, Frank; Diaz, Bruno Julia

    2009-03-01

    To facilitate numerical study of noise and decoherence in QC algorithms, and of the efficacy of error correction schemes, we have developed a Fortran 90 quantum computer simulator with parallel processing capabilities. It permits rapid evaluation of quantum algorithms for a large number of qubits and for various "noise" scenarios. State vectors are distributed over many processors, to employ a large number of qubits. Parallel processing is implemented by the Message-Passing Interface protocol. A description of how to spread the wave function components over many processors, along with how to efficiently describe the action of general one- and two-qubit operators on these state vectors, will be delineated. Grover's search and Shor's factoring algorithms with noise will be discussed as examples. A major feature of this work is that concurrent versions of the algorithms can be evaluated with each version subject to diverse noise effects, corresponding to solving a stochastic Schrodinger equation. The density matrix for the ensemble of such noise cases is constructed using parallel distribution methods to evaluate its associated entropy. Applications of this powerful tool are made to delineate the stability and correction of QC processes using Hamiltonian based dynamics.
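
    The action of a one-qubit operator on a state vector can be sketched as below. This is a serial Python illustration, not the authors' Fortran 90/MPI code; the names `apply_one_qubit_gate` and `uniform_superposition` and the convention that bit `target` of the amplitude index holds that qubit's value are assumptions of this example. Each amplitude is paired with the one whose index differs only in the target bit. If the vector were split into contiguous blocks across processors, pairs for a high-order target bit would fall in different blocks, so those amplitudes would have to be exchanged.

```python
import math

def apply_one_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` of a state vector of length 2**n."""
    new_state = list(state)
    stride = 1 << target
    for i in range(len(state)):
        if (i & stride) == 0:        # i has target bit 0; j is its partner with bit 1
            j = i | stride
            a0, a1 = state[i], state[j]
            new_state[i] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[j] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state

R = 1 / math.sqrt(2)
H = [[R, R], [R, -R]]                # Hadamard gate

def uniform_superposition(n_qubits):
    """Start in |0...0> and apply a Hadamard gate to every qubit."""
    state = [0j] * (1 << n_qubits)
    state[0] = 1 + 0j
    for q in range(n_qubits):
        state = apply_one_qubit_gate(state, H, q)
    return state
```

    Calling `uniform_superposition(3)` yields eight amplitudes of equal magnitude, which is the usual starting state for Grover's search.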

  15. Parallel Markov chain Monte Carlo simulations.

    PubMed

    Ren, Ruichao; Orkoulas, G

    2007-06-07

    With strict detailed balance, parallel Monte Carlo simulation through domain decomposition cannot be validated with conventional Markov chain theory, which describes an intrinsically serial stochastic process. In this work, the parallel version of Markov chain theory and its role in accelerating Monte Carlo simulations via cluster computing is explored. It is shown that sequential updating is the key to improving efficiency in parallel simulations through domain decomposition. A parallel scheme is proposed to reduce interprocessor communication or synchronization, which slows down parallel simulation with increasing number of processors. Parallel simulation results for the two-dimensional lattice gas model show substantial reduction of simulation time for systems of moderate and large size.

  16. Parallel Markov chain Monte Carlo simulations

    NASA Astrophysics Data System (ADS)

    Ren, Ruichao; Orkoulas, G.

    2007-06-01

    With strict detailed balance, parallel Monte Carlo simulation through domain decomposition cannot be validated with conventional Markov chain theory, which describes an intrinsically serial stochastic process. In this work, the parallel version of Markov chain theory and its role in accelerating Monte Carlo simulations via cluster computing is explored. It is shown that sequential updating is the key to improving efficiency in parallel simulations through domain decomposition. A parallel scheme is proposed to reduce interprocessor communication or synchronization, which slows down parallel simulation with increasing number of processors. Parallel simulation results for the two-dimensional lattice gas model show substantial reduction of simulation time for systems of moderate and large size.

  17. Resolutions of the Coulomb operator: VIII. Parallel implementation using the modern programming language X10.

    PubMed

    Limpanuparb, Taweetham; Milthorpe, Josh; Rendell, Alistair P

    2014-10-30

    Use of the modern parallel programming language X10 for computing long-range Coulomb and exchange interactions is presented. By using X10, a partitioned global address space language with support for task parallelism and the explicit representation of data locality, the resolution of the Ewald operator can be parallelized in a straightforward manner including use of both intranode and internode parallelism. We evaluate four different schemes for dynamic load balancing of integral calculation using X10's work stealing runtime, and report performance results for long-range HF energy calculation of large molecule/high quality basis running on up to 1024 cores of a high performance cluster machine.

  18. Method for resource control in parallel environments using program organization and run-time support

    NASA Technical Reports Server (NTRS)

    Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)

    2001-01-01

    A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.

  19. Method for resource control in parallel environments using program organization and run-time support

    NASA Technical Reports Server (NTRS)

    Ekanadham, Kattamuri (Inventor); Moreira, Jose Eduardo (Inventor); Naik, Vijay Krishnarao (Inventor)

    1999-01-01

    A system and method for dynamic scheduling and allocation of resources to parallel applications during the course of their execution. By establishing well-defined interactions between an executing job and the parallel system, the system and method support dynamic reconfiguration of processor partitions, dynamic distribution and redistribution of data, communication among cooperating applications, and various other monitoring actions. The interactions occur only at specific points in the execution of the program where the aforementioned operations can be performed efficiently.

  20. Parallel processing of atmospheric chemistry calculations: Preliminary considerations

    SciTech Connect

    Elliott, S.; Jones, P.

    1995-01-01

    Global climate calculations are already saturating the class modern vector supercomputers with only a few central processing units. Increased resolution and inclusion of routines to deal with biogeochemical portions of the terrestrial climate system will soon demand massively parallel approaches. The atmospheric photochemistry ensemble is intimately linked to climate through the trace greenhouse gases ozone and methane and modules for representing it are being attached to global three dimensional transport and GCM frameworks. Atmospheric kinetics involve dozens of highly interactive tracers and so will accentuate the need for parallel processing of earth system simulations. In the present text we lay some of the groundwork for addition of atmospheric kinetics packages to GCM and global scale atmospheric models on multiply parallel computers. The discussion is tailored for consumption by the photochemical modelling community. After a review of numerical atmospheric chemistry methods, we examine how kinetics can be implemented on a parallel computer. We concentrate especially on data layout and flexibility and how these can be implemented in various programming models. We conclude that chemistry can be implemented rather easily within existing frameworks of several parallel atmospheric models. However, memory limitations may preclude high resolution studies of global chemistry.

  1. Parallelizing alternating direction implicit solver on GPUs

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We present a parallel Alternating Direction Implicit (ADI) solver on GPUs. Our implementation significantly improves existing implementations in two aspects. First, we address the scalability issue of existing Parallel Cyclic Reduction (PCR) implementations by eliminating their hardware resource con...

  2. Implementing clips on a parallel computer

    NASA Technical Reports Server (NTRS)

    Riley, Gary

    1987-01-01

    The C language integrated production system (CLIPS) is a forward chaining rule based language to provide training and delivery for expert systems. Conceptually, rule based languages have great potential for benefiting from the inherent parallelism of the algorithms that they employ. During each cycle of execution, a knowledge base of information is compared against a set of rules to determine if any rules are applicable. Parallelism also can be employed for use with multiple cooperating expert systems. To investigate the potential benefits of using a parallel computer to speed up the comparison of facts to rules in expert systems, a parallel version of CLIPS was developed for the FLEX/32, a large grain parallel computer. The FLEX implementation takes a macroscopic approach in achieving parallelism by splitting whole sets of rules among several processors rather than by splitting the components of an individual rule among processors. The parallel CLIPS prototype demonstrates the potential advantages of integrating expert system tools with parallel computers.

  3. Force user's manual: A portable, parallel FORTRAN

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Arenstorf, Norbert S.; Ramanan, Aruna V.

    1990-01-01

    The use of Force, a parallel, portable FORTRAN on shared memory parallel computers is described. Force simplifies writing code for parallel computers and, once the parallel code is written, it is easily ported to computers on which Force is installed. Although Force is nearly the same for all computers, specific details are included for the Cray-2, Cray-YMP, Convex 220, Flex/32, Encore, Sequent, Alliant computers on which it is installed.

  4. Automatic Multilevel Parallelization Using OpenMP

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

    2002-01-01

    In this paper we describe the extension of the CAPO (CAPtools (Computer Aided Parallelization Toolkit) OpenMP) parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report some results for several benchmark codes and one full application that have been parallelized using our system.

  5. The electron signature of parallel electric fields

    NASA Astrophysics Data System (ADS)

    Burch, J. L.; Gurgiolo, C.; Menietti, J. D.

    1990-12-01

    Dynamics Explorer I High-Altitude Plasma Instrument electron data are presented. The electron distribution functions have characteristics expected of a region of parallel electric fields. The data are consistent with previous test-particle simulations for observations within parallel electric field regions which indicate that typical hole, bump, and loss-cone electron distributions, which contain evidence for parallel potential differences both above and below the point of observation, are not expected to occur in regions containing actual parallel electric fields.

  6. Debugging Parallel Programs with Instant Replay.

    DTIC Science & Technology

    1986-09-01

    produce the same results. In this paper we present a general solution for reproducing the execution behavior of parallel programs, termed Instant Replay...Instant Replay on the BBN Butterfly Parallel Processor, and discuss how it can be incorporated into the debugging cycle for parallel programs. This...program often do not produce the same results. In this paper we present a general solution for reproducing the execution behavior of parallel

  7. Parallel machine architecture and compiler design facilities

    NASA Technical Reports Server (NTRS)

    Kuck, David J.; Yew, Pen-Chung; Padua, David; Sameh, Ahmed; Veidenbaum, Alex

    1990-01-01

    The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of the Delta project (whose objective is to provide a facility to allow rapid prototyping of parallelized compilers that can target different machine architectures) is summarized. Included are surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.

  8. Collective Interaction of a Compressible Periodic Parallel Jet Flow

    NASA Technical Reports Server (NTRS)

    Miles, Jeffrey Hilton

    1997-01-01

    A linear instability model for multiple spatially periodic supersonic rectangular jets is solved using Floquet-Bloch theory. The disturbance environment is investigated using a two dimensional perturbation of a mean flow. For all cases large temporal growth rates are found. This work is motivated by an increase in mixing found in experimental measurements of spatially periodic supersonic rectangular jets with phase-locked screech. The results obtained in this paper suggest that phase-locked screech or edge tones may produce correlated spatially periodic jet flow downstream of the nozzles which creates a large spanwise multi-nozzle region where a disturbance can propagate. The large temporal growth rates for eddies obtained by model calculation herein are related to the increased mixing since eddies are the primary mechanism that transfer energy from the mean flow to the large turbulent structures. Calculations of growth rates are presented for a range of Mach numbers and nozzle spacings corresponding to experimental test conditions where screech synchronized phase locking was observed. The model may be of significant scientific and engineering value in the quest to understand and construct supersonic mixer-ejector nozzles which provide increased mixing and reduced noise.

  9. Parallel ecological networks in ecosystems.

    PubMed

    Olff, Han; Alonso, David; Berg, Matty P; Eriksson, B Klemens; Loreau, Michel; Piersma, Theunis; Rooney, Neil

    2009-06-27

    In ecosystems, species interact with other species directly and through abiotic factors in multiple ways, often forming complex networks of various types of ecological interaction. Out of this suite of interactions, predator-prey interactions have received most attention. The resulting food webs, however, will always operate simultaneously with networks based on other types of ecological interaction, such as through the activities of ecosystem engineers or mutualistic interactions. Little is known about how to classify, organize and quantify these other ecological networks and their mutual interplay. The aim of this paper is to provide new and testable ideas on how to understand and model ecosystems in which many different types of ecological interaction operate simultaneously. We approach this problem by first identifying six main types of interaction that operate within ecosystems, of which food web interactions are one. Then, we propose that food webs are structured among two main axes of organization: a vertical (classic) axis representing trophic position and a new horizontal 'ecological stoichiometry' axis representing decreasing palatability of plant parts and detritus for herbivores and detritivores and slower turnover times. The usefulness of these new ideas is then explored with three very different ecosystems as test cases: temperate intertidal mudflats; temperate short grass prairie; and tropical savannah.

  10. Global Arrays Parallel Programming Toolkit

    SciTech Connect

    Nieplocha, Jaroslaw; Krishnan, Manoj Kumar; Palmer, Bruce J.; Tipparaju, Vinod; Harrison, Robert J.; Chavarría-Miranda, Daniel

    2011-01-01

    The two predominant classes of programming models for parallel computing are distributed memory and shared memory. Both shared memory and distributed memory models have advantages and shortcomings. Shared memory model is much easier to use but it ignores data locality/placement. Given the hierarchical nature of the memory subsystems in modern computers this characteristic can have a negative impact on performance and scalability. Careful code restructuring to increase data reuse and replacing fine grain load/stores with block access to shared data can address the problem and yield performance for shared memory that is competitive with message-passing. However, this performance comes at the cost of compromising the ease of use that the shared memory model advertises. Distributed memory models, such as message-passing or one-sided communication, offer performance and scalability but they are difficult to program. The Global Arrays toolkit attempts to offer the best features of both models. It implements a shared-memory programming model in which data locality is managed by the programmer. This management is achieved by calls to functions that transfer data between a global address space (a distributed array) and local storage. In this respect, the GA model has similarities to the distributed shared-memory models that provide an explicit acquire/release protocol. However, the GA model acknowledges that remote data is slower to access than local data and allows data locality to be specified by the programmer and hence managed. GA is related to the global address space languages such as UPC, Titanium, and, to a lesser extent, Co-Array Fortran. In addition, by providing a set of data-parallel operations, GA is also related to data-parallel languages such as HPF, ZPL, and Data Parallel C. 
    However, the Global Array programming model is implemented as a library that works with most languages used for technical computing and does not rely on compiler technology for achieving

  11. High Performance Parallel Computational Nanotechnology

    NASA Technical Reports Server (NTRS)

    Saini, Subhash; Craw, James M. (Technical Monitor)

    1995-01-01

    At a recent press conference, NASA Administrator Dan Goldin encouraged NASA Ames Research Center to take a lead role in promoting research and development of advanced, high-performance computer technology, including nanotechnology. Manufacturers of leading-edge microprocessors currently perform large-scale simulations in the design and verification of semiconductor devices and microprocessors. Recently, the need for this intensive simulation and modeling analysis has greatly increased, due in part to the ever-increasing complexity of these devices, as well as the lessons of experiences such as the Pentium fiasco. Simulation, modeling, testing, and validation will be even more important for designing molecular computers because of the complex specification of millions of atoms, thousands of assembly steps, as well as the simulation and modeling needed to ensure reliable, robust and efficient fabrication of the molecular devices. The software for this capacity does not exist today, but it can be extrapolated from the software currently used in molecular modeling for other applications: semi-empirical methods, ab initio methods, self-consistent field methods, Hartree-Fock methods, molecular mechanics; and simulation methods for diamondoid structures. Inasmuch as it seems clear that the application of such methods in nanotechnology will require powerful, highly powerful systems, this talk will discuss techniques and issues for performing these types of computations on parallel systems. We will describe system design issues (memory, I/O, mass storage, operating system requirements, special user interface issues, interconnects, bandwidths, and programming languages) involved in parallel methods for scalable classical, semiclassical, quantum, molecular mechanics, and continuum models; molecular nanotechnology computer-aided designs (NanoCAD) techniques; visualization using virtual reality techniques of structural models and assembly sequences; software required to

  12. Parallel Computing Using Web Servers and "Servlets".

    ERIC Educational Resources Information Center

    Lo, Alfred; Bloor, Chris; Choi, Y. K.

    2000-01-01

    Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common…

  13. Identifying, Quantifying, Extracting and Enhancing Implicit Parallelism

    ERIC Educational Resources Information Center

    Agarwal, Mayank

    2009-01-01

    The shift of the microprocessor industry towards multicore architectures has placed a huge burden on the programmers by requiring explicit parallelization for performance. Implicit Parallelization is an alternative that could ease the burden on programmers by parallelizing applications "under the covers" while maintaining sequential semantics…

  14. Parallel Processing at the High School Level.

    ERIC Educational Resources Information Center

    Sheary, Kathryn Anne

    This study investigated the ability of high school students to cognitively understand and implement parallel processing. Data indicates that most parallel processing is being taught at the university level. Instructional modules on C, Linux, and the parallel processing language, P4, were designed to show that high school students are highly…

  15. Coordination in serial-parallel image processing

    NASA Astrophysics Data System (ADS)

    Wójcik, Waldemar; Dubovoi, Vladymyr M.; Duda, Marina E.; Romaniuk, Ryszard S.; Yesmakhanova, Laura; Kozbakova, Ainur

    2015-12-01

    Serial-parallel systems are used to convert images. Controlling their operation requires solving a coordination problem. The paper summarizes a model of coordination of resource allocation in relation to the task of synchronizing parallel processes; a genetic algorithm for coordination is developed, and its adequacy is verified for the process of parallel image processing.

  16. Scalable Parallel Algebraic Multigrid Solvers

    SciTech Connect

    Bank, R; Lu, S; Tong, C; Vassilevski, P

    2005-03-23

    The authors propose a parallel algebraic multilevel algorithm (AMG), which has the novel feature that the subproblem residing in each processor is defined over the entire partition domain, although the vast majority of unknowns for each subproblem are associated with the partition owned by the corresponding processor. This feature ensures that a global coarse description of the problem is contained within each of the subproblems. The advantages of this approach are that interprocessor communication is minimized in the solution process while an optimal order of convergence rate is preserved; and the speed of local subproblem solvers can be maximized using the best existing sequential algebraic solvers.

  17. Parallel Assembly of LIGA Components

    SciTech Connect

    Christenson, T.R.; Feddema, J.T.

    1999-03-04

    In this paper, a prototype robotic workcell for the parallel assembly of LIGA components is described. A Cartesian robot is used to press 386 and 485 micron diameter pins into a LIGA substrate and then place a 3-inch diameter wafer with LIGA gears onto the pins. Upward and downward looking microscopes are used to locate holes in the LIGA substrate, pins to be pressed in the holes, and gears to be placed on the pins. This vision system can locate parts within 3 microns, while the Cartesian manipulator can place the parts within 0.4 microns.

  18. True Shear Parallel Plate Viscometer

    NASA Technical Reports Server (NTRS)

    Ethridge, Edwin; Kaukler, William

    2010-01-01

    This viscometer (which can also be used as a rheometer) is designed for use with liquids over a large temperature range. The device consists of horizontally disposed, similarly sized, parallel plates with a precisely known gap. The lower plate is driven laterally with a motor to apply shear to the liquid in the gap. The upper plate is freely suspended from a double-arm pendulum with a sufficiently long radius to reduce height variations during the swing to negligible levels. A sensitive load cell measures the shear force applied by the liquid to the upper plate. Viscosity is measured by taking the ratio of shear stress to shear rate.
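
    The final step of the measurement, viscosity as the ratio of shear stress to shear rate, can be written out as a short calculation. This is a sketch that assumes uniform shear between the plates, so that shear stress is the load-cell force divided by the wetted plate area and shear rate is the lower-plate speed divided by the gap; the function name, parameter names, and SI units are choices made for this example and do not come from the record.

```python
def viscosity(shear_force_n, plate_area_m2, plate_speed_m_s, gap_m):
    """Dynamic viscosity in Pa*s from parallel-plate measurements (SI units)."""
    shear_stress = shear_force_n / plate_area_m2   # Pa
    shear_rate = plate_speed_m_s / gap_m           # 1/s
    return shear_stress / shear_rate

# Hypothetical readings: 2 mN on a 10 cm^2 plate, 1 cm/s drive speed, 1 mm gap.
eta = viscosity(0.002, 0.001, 0.01, 0.001)
```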

  19. Scheduling Tasks In Parallel Processing

    NASA Technical Reports Server (NTRS)

    Price, Camille C.; Salama, Moktar A.

    1989-01-01

    Algorithms sought to minimize time and cost of computation. Report describes research on scheduling of computational tasks in system of multiple identical data processors operating in parallel. Computational intractability requires use of suboptimal heuristic algorithms. First algorithm called "list heuristic", variation of classical list scheduling. Second algorithm called "cluster heuristic" applied to tightly coupled tasks and consists of four phases. Third algorithm called "exchange heuristic", iterative-improvement algorithm beginning with initial feasible assignment of tasks to processors and periods of time. Fourth algorithm is iterative one for optimal assignment of tasks and based on concept called "simulated annealing" because of mathematical resemblance to aspects of physical annealing processes.
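
    For orientation, classical list scheduling on identical processors can be sketched as follows. This is the textbook form for independent tasks, not the report's variation, whose details the abstract does not give; `list_schedule` and its arguments are names assumed for this example.

```python
import heapq

def list_schedule(durations, n_processors):
    """Assign tasks, in list order, to whichever processor becomes free first.

    Returns (assignment, makespan), where assignment holds
    (task index, processor index, start time) tuples.
    """
    free = [(0, p) for p in range(n_processors)]   # (time processor is free, id)
    heapq.heapify(free)
    assignment = []
    for task, duration in enumerate(durations):
        start, proc = heapq.heappop(free)
        assignment.append((task, proc, start))
        heapq.heappush(free, (start + duration, proc))
    makespan = max(t for t, _ in free)
    return assignment, makespan

_, in_given_order = list_schedule([3, 2, 2, 1, 4], 2)        # makespan 8
_, longest_first = list_schedule([4, 3, 2, 2, 1], 2)         # makespan 6
```

    The two calls show why the order of the list matters: the same five tasks finish at time 8 in the given order and at time 6 when sorted longest first.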

  20. Heart Fibrillation and Parallel Supercomputers

    NASA Technical Reports Server (NTRS)

    Kogan, B. Y.; Karplus, W. J.; Chudin, E. E.

    1997-01-01

    The Luo and Rudy 3 cardiac cell mathematical model is implemented on the parallel supercomputer CRAY T3D. The splitting algorithm combined with variable time step and an explicit method of integration provide reasonable solution times and almost perfect scaling for rectilinear wave propagation. The computer simulation makes it possible to observe new phenomena: the break-up of spiral waves caused by intracellular calcium dynamics and the non-uniformity of the calcium distribution in space during the onset of the spiral wave.

  1. Aggregation and Gelation of Aromatic Polyamides with Parallel and Anti-parallel Alignment of Molecular Dipole Along the Backbone

    NASA Astrophysics Data System (ADS)

    Zhu, Dan; Shang, Jing; Ye, Xiaodong; Shen, Jian

    2016-12-01

    The understanding of macromolecular structures and interactions is important but difficult, because macromolecules adopt versatile conformations and aggregate states that vary with environmental conditions and histories. In this work, two polyamides with parallel or anti-parallel dipoles along the linear backbone, named ABAB (parallel) and AABB (anti-parallel), have been studied. By using a combination of methods, the phase behaviors of the polymers during aggregation and gelation, i.e., the formation or dissociation of nuclei and fibrils, clusters of fibrils, and cluster-cluster aggregation, have been revealed. Such abundant phase behaviors are dominated by the inter-chain interactions, including dispersion, polarity and hydrogen bonding, and are correlated with the solubility parameters of solvents, the temperature, and the polymer concentration. The results of X-ray diffraction and fast-mode dielectric relaxation indicate that AABB possesses a more rigid conformation than ABAB; because AABB aggregates are long fibers while ABAB aggregates are hairy fibril clusters, the gelation concentration in toluene is 1 w/v% for AABB, lower than the 3 w/v% for ABAB.

  2. Aggregation and Gelation of Aromatic Polyamides with Parallel and Anti-parallel Alignment of Molecular Dipole Along the Backbone

    PubMed Central

    Zhu, Dan; Shang, Jing; Ye, Xiaodong; Shen, Jian

    2016-01-01

    The understanding of macromolecular structures and interactions is important but difficult, because macromolecules adopt versatile conformations and aggregate states that vary with environmental conditions and histories. In this work, two polyamides with parallel or anti-parallel dipoles along the linear backbone, named ABAB (parallel) and AABB (anti-parallel), have been studied. By using a combination of methods, the phase behaviors of the polymers during aggregation and gelation, i.e., the formation or dissociation of nuclei and fibrils, clusters of fibrils, and cluster-cluster aggregation, have been revealed. Such abundant phase behaviors are dominated by the inter-chain interactions, including dispersion, polarity and hydrogen bonding, and are correlated with the solubility parameters of solvents, the temperature, and the polymer concentration. The results of X-ray diffraction and fast-mode dielectric relaxation indicate that AABB possesses a more rigid conformation than ABAB; because AABB aggregates are long fibers while ABAB aggregates are hairy fibril clusters, the gelation concentration in toluene is 1 w/v% for AABB, lower than the 3 w/v% for ABAB. PMID:27958362

  3. Kinetic theory of turbulence for parallel propagation revisited: Formal results

    SciTech Connect

    Yoon, Peter H.

    2015-08-15

    In a recent paper, Gaelzer et al. [Phys. Plasmas 22, 032310 (2015)] revisited the second-order nonlinear kinetic theory for turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field. The original work was according to Yoon and Fang [Phys. Plasmas 15, 122312 (2008)], but Gaelzer et al. noted that the terms pertaining to discrete-particle effects in Yoon and Fang's theory did not enjoy proper dimensionality. The purpose of Gaelzer et al. was to restore the dimensional consistency associated with such terms. However, Gaelzer et al. was concerned only with linear wave-particle interaction terms. The present paper completes the analysis by considering the dimensional correction to nonlinear wave-particle interaction terms in the wave kinetic equation.

  4. Xyce parallel electronic simulator design.

    SciTech Connect

    Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.

    2010-09-01

    This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians and computer scientists. In addition to diversity of background, it is to be expected on long term projects for there to be a certain amount of staff turnover, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.

  5. Parallel job-scheduling algorithms

    SciTech Connect

    Rodger, S.H.

    1989-01-01

    In this thesis, we consider solving job scheduling problems on the CREW PRAM model. We show how to adapt Cole's pipeline merge technique to yield several efficient parallel algorithms for a number of job scheduling problems and one optimal parallel algorithm for the following job scheduling problem: Given a set of n jobs defined by release times, deadlines and processing times, find a schedule that minimizes the maximum lateness of the jobs and allows preemption when the jobs are scheduled to run on one machine. In addition, we present the first NC algorithm for the following job scheduling problem: Given a set of n jobs defined by release times, deadlines and unit processing times, determine if there is a schedule of jobs on one machine, and calculate the schedule if it exists. We identify the notion of a canonical schedule, which is the type of schedule our algorithm computes if there is a schedule. Our algorithm runs in O((log n)^2) time and uses O(n^2 k^2) processors, where k is the minimum number of distinct offsets of release times or deadlines.
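
    To make the first problem concrete, the sketch below solves it sequentially with the standard preemptive earliest-deadline-first rule: at every moment, run the released job with the earliest deadline, and re-evaluate whenever a new job is released. This is the well-known serial method, not the thesis's parallel CREW PRAM algorithm; the function name `preemptive_edf`, the tuple layout, and the use of integer times are assumptions of this example.

```python
import heapq

def preemptive_edf(jobs):
    """jobs: list of (release, deadline, processing) with integer times.

    Returns (max_lateness, schedule); schedule holds (job, start, end) pieces.
    """
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][0])  # by release time
    remaining = [p for _, _, p in jobs]
    ready = []                     # heap of (deadline, job) for released jobs
    schedule = []
    max_lateness = float("-inf")
    t, k, n = 0, 0, len(jobs)
    while k < n or ready:
        if not ready:              # machine idle: jump to the next release
            t = max(t, jobs[order[k]][0])
        while k < n and jobs[order[k]][0] <= t:
            j = order[k]
            heapq.heappush(ready, (jobs[j][1], j))
            k += 1
        deadline, j = ready[0]
        next_release = jobs[order[k]][0] if k < n else float("inf")
        run = min(remaining[j], next_release - t)  # run until done or next release
        schedule.append((j, t, t + run))
        t += run
        remaining[j] -= run
        if remaining[j] == 0:
            heapq.heappop(ready)
            max_lateness = max(max_lateness, t - deadline)
    return max_lateness, schedule

# Job 0 is preempted at time 1 when job 1, with an earlier deadline, is released.
lateness, pieces = preemptive_edf([(0, 10, 4), (1, 3, 2), (2, 12, 3)])
```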

  6. Parallel multiscale simulations of a brain aneurysm

    NASA Astrophysics Data System (ADS)

    Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em

    2013-07-01

    Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multiscale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NɛκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NɛκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300 K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future

  7. Parallel multiscale simulations of a brain aneurysm

    SciTech Connect

    Grinberg, Leopold; Fedosov, Dmitry A.; Karniadakis, George Em

    2013-07-01

    Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multiscale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier–Stokes solver NεκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NεκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300 K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in

  8. Parallel multiscale simulations of a brain aneurysm.

    PubMed

    Grinberg, Leopold; Fedosov, Dmitry A; Karniadakis, George Em

    2013-07-01

    Cardiovascular pathologies, such as a brain aneurysm, are affected by the global blood circulation as well as by the local microrheology. Hence, developing computational models for such cases requires the coupling of disparate spatial and temporal scales often governed by diverse mathematical descriptions, e.g., by partial differential equations (continuum) and ordinary differential equations for discrete particles (atomistic). However, interfacing atomistic-based with continuum-based domain discretizations is a challenging problem that requires both mathematical and computational advances. We present here a hybrid methodology that enabled us to perform the first multi-scale simulations of platelet depositions on the wall of a brain aneurysm. The large scale flow features in the intracranial network are accurately resolved by using the high-order spectral element Navier-Stokes solver NεκTαr. The blood rheology inside the aneurysm is modeled using a coarse-grained stochastic molecular dynamics approach (the dissipative particle dynamics method) implemented in the parallel code LAMMPS. The continuum and atomistic domains overlap with interface conditions provided by effective forces computed adaptively to ensure continuity of states across the interface boundary. A two-way interaction is allowed with the time-evolving boundary of the (deposited) platelet clusters tracked by an immersed boundary method. The corresponding heterogeneous solvers (NεκTαr and LAMMPS) are linked together by a computational multilevel message passing interface that facilitates modularity and high parallel efficiency. Results of multiscale simulations of clot formation inside the aneurysm in a patient-specific arterial tree are presented. We also discuss the computational challenges involved and present scalability results of our coupled solver on up to 300K computer processors. Validation of such coupled atomistic-continuum models is a main open issue that has to be addressed in future

  9. A CS1 pedagogical approach to parallel thinking

    NASA Astrophysics Data System (ADS)

    Rague, Brian William

    Almost all collegiate programs in Computer Science offer an introductory course in programming primarily devoted to communicating the foundational principles of software design and development. The ACM designates this introduction to computer programming course for first-year students as CS1, during which methodologies for solving problems within a discrete computational context are presented. Logical thinking is highlighted, guided primarily by a sequential approach to algorithm development and made manifest by typically using the latest, commercially successful programming language. In response to the most recent developments in accessible multicore computers, instructors of these introductory classes may wish to include training on how to design workable parallel code. Novel issues arise when programming concurrent applications which can make teaching these concepts to beginning programmers a seemingly formidable task. Student comprehension of design strategies related to parallel systems should be monitored to ensure an effective classroom experience. This research investigated the feasibility of integrating parallel computing concepts into the first-year CS classroom. To quantitatively assess student comprehension of parallel computing, an experimental educational study using a two-factor mixed group design was conducted to evaluate two instructional interventions in addition to a control group: (1) topic lecture only, and (2) topic lecture with laboratory work using a software visualization Parallel Analysis Tool (PAT) specifically designed for this project. A new evaluation instrument developed for this study, the Perceptions of Parallelism Survey (PoPS), was used to measure student learning regarding parallel systems. The results from this educational study show a statistically significant main effect among the repeated measures, implying that student comprehension levels of parallel concepts as measured by the PoPS improve immediately after the delivery of

  10. STALK : an interactive virtual molecular docking system.

    SciTech Connect

    Levine, D.; Facello, M.; Hallstrom, P.; Reeder, G.; Walenz, B.; Stevens, F.; Univ. of Illinois

    1997-04-01

    Several recent technologies (genetic algorithms, parallel and distributed computing, virtual reality, and high-speed networking) underlie a new approach to the computational study of how biomolecules interact or 'dock' together. With the Stalk system, a user in a virtual reality environment can interact with a genetic algorithm running on a parallel computer to help in the search for likely geometric configurations.

  11. Partitioning in parallel processing of production systems

    SciTech Connect

    Oflazer, K.

    1987-01-01

    This thesis presents research on certain issues related to parallel processing of production systems. It first presents a parallel production system interpreter that has been implemented on a four-processor multiprocessor. This parallel interpreter is based on Forgy's OPS5 interpreter and exploits production-level parallelism in production systems. Runs on the multiprocessor system indicate that it is possible to obtain speed-up of around 1.7 in the match computation for certain production systems when productions are split into three sets that are processed in parallel. The next issue addressed is that of partitioning a set of rules to processors in a parallel interpreter with production-level parallelism, and the extent of additional improvement in performance. The partitioning problem is formulated and an algorithm for approximate solutions is presented. The thesis next presents a parallel processing scheme for OPS5 production systems that allows some redundancy in the match computation. This redundancy enables the processing of a production to be divided into units of medium granularity each of which can be processed in parallel. Subsequently, a parallel processor architecture for implementing the parallel processing algorithm is presented.

  12. Parallel Rendering of Large Time-Varying Volume Data

    NASA Technical Reports Server (NTRS)

    Garbutt, Alexander E.

    2005-01-01

    Interactive visualization of large time-varying 3D volume datasets has been and still is a great challenge to the modern computational world. It stretches the limits of the memory capacity, the disk space, the network bandwidth and the CPU speed of a conventional computer. In this SURF project, we propose to develop a parallel volume rendering program on SGI's Prism, a cluster computer equipped with state-of-the-art graphic hardware. The proposed program combines both parallel computing and hardware rendering in order to achieve an interactive rendering rate. We use 3D texture mapping and a hardware shader to implement 3D volume rendering on each workstation. We use SGI's VisServer to enable remote rendering using Prism's graphic hardware. Finally, we will integrate this new program with ParVox, a parallel distributed visualization system developed at JPL. At the end of the project, we will demonstrate remote interactive visualization using this new hardware volume renderer on JPL's Prism System using a time-varying dataset from selected JPL applications.

  13. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1991-01-01

    The main contribution of the effort in the last two years is the introduction of the MOPPS system. After doing extensive literature search, we introduced the system which is described next. MOPPS employs a new solution to the problem of managing programs which solve scientific and engineering applications on a distributed processing environment. Autonomous computers cooperate efficiently in solving large scientific problems with this solution. MOPPS has the advantage of not assuming the presence of any particular network topology or configuration, computer architecture, or operating system. It imposes little overhead on network and processor resources while efficiently managing programs concurrently. The core of MOPPS is an intelligent program manager that builds a knowledge base of the execution performance of the parallel programs it is managing under various conditions. The manager applies this knowledge to improve the performance of future runs. The program manager learns from experience.

  14. Hybrid Optimization Parallel Search PACKage

    SciTech Connect

    2009-11-10

    HOPSPACK is open source software for solving optimization problems without derivatives. Application problems may have a fully nonlinear objective function, bound constraints, and linear and nonlinear constraints. Problem variables may be continuous, integer-valued, or a mixture of both. The software provides a framework that supports any derivative-free type of solver algorithm. Through the framework, solvers request parallel function evaluation, which may use MPI (multiple machines) or multithreading (multiple processors/cores on one machine). The framework provides a Cache and Pending Cache of saved evaluations that reduces execution time and facilitates restarts. Solvers can dynamically create other algorithms to solve subproblems, a useful technique for handling multiple start points and integer-valued variables. HOPSPACK ships with the Generating Set Search (GSS) algorithm, developed at Sandia as part of the APPSPACK open source software project.
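
    The combination of derivative-free search, parallel function evaluation, and a cache of saved evaluations can be sketched as follows. This is not HOPSPACK's API or its GSS implementation; it is a minimal compass search (a simple member of the generating set search family) written under stated assumptions: the function name `compass_search`, the dictionary used as a cache, and the thread pool standing in for MPI or multithreaded evaluation are all hypothetical choices for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=1000, workers=4):
    """Minimize f without derivatives (illustrative sketch only).

    Trial points along +/- each coordinate direction are evaluated in
    parallel; a dictionary caches every evaluation so that no point is
    evaluated twice. Returns (best_point, best_value, evaluations_made).
    """
    cache = {}

    def evaluate(points, pool):
        # Only points not already cached are sent to the worker pool.
        new = [p for p in dict.fromkeys(points) if p not in cache]
        for p, v in zip(new, pool.map(f, new)):
            cache[p] = v
        return [cache[p] for p in points]

    x = tuple(float(v) for v in x0)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        fx = evaluate([x], pool)[0]
        for _ in range(max_iter):
            if step < tol:
                break
            trials = []
            for i in range(len(x)):
                for sign in (1.0, -1.0):
                    t = list(x)
                    t[i] += sign * step
                    trials.append(tuple(t))
            values = evaluate(trials, pool)
            best_value, best_point = min(zip(values, trials))
            if best_value < fx:
                x, fx = best_point, best_value   # accept the improvement
            else:
                step *= 0.5                      # contract the pattern
    return x, fx, len(cache)
```

    A call such as `compass_search(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2, (0, 0))` illustrates the idea: revisited trial points are served from the cache, so the number of function evaluations is smaller than the number of trial points generated.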

  15. Parallel Performance Characterization of Columbia

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak

    2004-01-01

    Using a collection of benchmark problems of increasing levels of realism and computational effort, we will characterize the strengths and limitations of the 10,240 processor Columbia system to deliver supercomputing value to application scientists. Scientists need to be able to determine if and how they can utilize Columbia to carry extreme workloads, either in terms of ultra-large applications that cannot be run otherwise (capability), or in terms of very large ensembles of medium-scale applications to populate response matrices (capacity). We select existing application benchmarks that scale from a small number of processors to the entire machine, and that highlight different issues in running supercomputing-class applications, such as the various types of memory access, file I/O, inter- and intra-node communications and parallelization paradigms. http://www.nas.nasa.gov/Software/NPB/

  16. Information hiding in parallel programs

    SciTech Connect

    Foster, I.

    1992-01-30

    A fundamental principle in program design is to isolate difficult or changeable design decisions. Application of this principle to parallel programs requires identification of decisions that are difficult or subject to change, and the development of techniques for hiding these decisions. We experiment with three complex applications, and identify mapping, communication, and scheduling as areas in which decisions are particularly problematic. We develop computational abstractions that hide such decisions, and show that these abstractions can be used to develop elegant solutions to programming problems. In particular, they allow us to encode common structures, such as transforms, reductions, and meshes, as software cells and templates that can be reused in different applications. An important characteristic of these structures is that they do not incorporate mapping, communication, or scheduling decisions: these aspects of the design are specified separately, when composing existing structures to form applications. This separation of concerns allows the same cells and templates to be reused in different contexts.

  17. Embodied and Distributed Parallel DJing.

    PubMed

    Cappelen, Birgitta; Andersson, Anders-Petter

    2016-01-01

    Everyone has a right to take part in cultural events and activities, such as music performances and music making. Enforcing that right, within Universal Design, is often limited to a focus on physical access to public areas, hearing aids etc., or groups of persons with special needs performing in traditional ways. The latter might be people with disabilities, being musicians playing traditional instruments, or actors playing theatre. In this paper we focus on the innovative potential of including people with special needs, when creating new cultural activities. In our project RHYME our goal was to create health promoting activities for children with severe disabilities, by developing new musical and multimedia technologies. Because of the users' extreme demands and rich contribution, we ended up creating both a new genre of musical instruments and a new art form. We call this new art form Embodied and Distributed Parallel DJing, and the new genre of instruments for Empowering Multi-Sensorial Things.

  18. Parallel spinors on flat manifolds

    NASA Astrophysics Data System (ADS)

    Sadowski, Michał

    2006-05-01

    Let p(M) be the dimension of the vector space of parallel spinors on a closed spin manifold M. We prove that every finite group G is the holonomy group of a closed flat spin manifold M(G) such that p(M(G))>0. If the holonomy group Hol(M) of M is cyclic, then we give an explicit formula for p(M) different from that given in [R.J. Miatello, R.A. Podesta, The spectrum of twisted Dirac operators on compact flat manifolds, Trans. Am. Math. Soc., in press]. We answer the question of when p(M)>0 if Hol(M) is a cyclic group of prime order or dim M ≤ 4.

  19. Device for balancing parallel strings

    DOEpatents

    Mashikian, Matthew S.

    1985-01-01

    A battery plant is described which features magnetic circuit means in association with each of the battery strings in the battery plant for balancing the electrical current flow through the battery strings by equalizing the voltage across each of the battery strings. Each of the magnetic circuit means generally comprises means for sensing the electrical current flow through one of the battery strings, and a saturable reactor having a main winding connected electrically in series with the battery string, a bias winding connected to a source of alternating current and a control winding connected to a variable source of direct current controlled by the sensing means. Each of the battery strings is formed by a plurality of batteries connected electrically in series, and these battery strings are connected electrically in parallel across common bus conductors.

  20. Parallel network simulations with NEURON.

    PubMed

    Migliore, M; Cannia, C; Lytton, W W; Markram, Henry; Hines, M L

    2006-10-01

    The NEURON simulation environment has been extended to support parallel network simulations. Each processor integrates the equations for its subnet over an interval equal to the minimum (interprocessor) presynaptic spike generation to postsynaptic spike delivery connection delay. The performance of three published network models with very different spike patterns exhibits superlinear speedup on Beowulf clusters and demonstrates that spike communication overhead is often less than the benefit of an increased fraction of the entire problem fitting into high speed cache. On the EPFL IBM Blue Gene, almost linear speedup was obtained up to 100 processors. Increasing one model from 500 to 40,000 realistic cells exhibited almost linear speedup on 2,000 processors, with an integration time of 9.8 seconds and communication time of 1.3 seconds. The potential for speed-ups of several orders of magnitude makes practical the running of large network simulations that could otherwise not be explored.

  1. Parallel computing in enterprise modeling.

    SciTech Connect

    Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.

    2008-08-01

    This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.

  2. Integrated Task and Data Parallel Programming

    NASA Technical Reports Server (NTRS)

    Grimshaw, A. S.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments In February I presented a paper at Frontiers 1995 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. 
Additional 1995 Activities During the fall I collaborated

  3. Parallel processing considerations for image recognition tasks

    NASA Astrophysics Data System (ADS)

    Simske, Steven J.

    2011-01-01

    Many image recognition tasks are well-suited to parallel processing. The most obvious example is that many imaging tasks require the analysis of multiple images. From this standpoint, then, parallel processing need be no more complicated than assigning individual images to individual processors. However, there are three less trivial categories of parallel processing that will be considered in this paper: parallel processing (1) by task; (2) by image region; and (3) by meta-algorithm. Parallel processing by task allows the assignment of multiple workflows (as diverse as optical character recognition [OCR], document classification and barcode reading) to parallel pipelines. This can substantially decrease time to completion for the document tasks. For this approach, each parallel pipeline is generally performing a different task. Parallel processing by image region allows a larger imaging task to be sub-divided into a set of parallel pipelines, each performing the same task but on a different data set. This type of image analysis is readily addressed by a map-reduce approach. Examples include document skew detection and multiple face detection and tracking. Finally, parallel processing by meta-algorithm allows different algorithms to be deployed on the same image simultaneously. This approach may result in improved accuracy.
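
    Parallel processing by image region, as described above, can be sketched as a small map-reduce. The example below is only an illustration under stated assumptions: the image is a plain list of pixel rows, the per-region task is a simple count of dark pixels rather than skew or face detection, and the helper names `split_rows`, `count_dark` and `dark_pixel_count` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_rows(image, parts):
    """Split a 2-D image (list of rows) into at most `parts` horizontal bands."""
    size = max(1, -(-len(image) // parts))  # ceiling division
    return [image[i:i + size] for i in range(0, len(image), size)]

def count_dark(band, threshold=128):
    """Map step: the same task applied to one region of the image."""
    return sum(1 for row in band for pixel in row if pixel < threshold)

def dark_pixel_count(image, parts=4):
    """Run the map step on every band in parallel, then reduce by summing."""
    bands = split_rows(image, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partial_counts = list(pool.map(count_dark, bands))
    return sum(partial_counts)
```

    Each band is processed by the same function on a different data set, and the reduce step is a plain sum. A thread pool keeps the sketch self-contained; for CPU-bound work in CPython a process pool would be the more usual choice.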

  4. “Serial” effects in parallel models of reading

    PubMed Central

    Chang, Ya-Ning; Furber, Steve; Welbourne, Stephen

    2012-01-01

    There is now considerable evidence showing that the time to read a word out loud is influenced by an interaction between orthographic length and lexicality. Given that length effects are interpreted by advocates of dual-route models as evidence of serial processing, this would seem to pose a serious challenge to models of single word reading which postulate a common parallel processing mechanism for reading both words and nonwords (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Rastle, Havelka, Wydell, Coltheart, & Besner, 2009). However, an alternative explanation of these data is that visual processes outside the scope of existing parallel models are responsible for generating the word-length related phenomena (Seidenberg & Plaut, 1998). Here we demonstrate that a parallel model of single word reading can account for the differential word-length effects found in the naming latencies of words and nonwords, provided that it includes a mapping from visual to orthographic representations, and that the nature of those orthographic representations is not preconstrained. The model can also simulate other supposedly “serial” effects. The overall findings were consistent with the view that visual processing contributes substantially to the word-length effects in normal reading and provided evidence to support the single-route theory which assumes words and nonwords are processed in parallel by a common mechanism. PMID:22343366

  5. Instant well-log inversion with a parallel computer

    SciTech Connect

    Kimminau, S.J.; Trivedi, H.

    1993-08-01

    Well-log analysis requires several vectors of input data to be inverted with a physical model that produces more vectors of output data. The problem is inherently suited to either vectorization or parallelization. PLATO (parallel log analysis, timely output) is a research prototype system that uses a parallel architecture computer with memory-mapped graphics to invert vector data and display the result rapidly. By combining this high-performance computing and display system with a graphical user interface, the analyst can interact with the system in "real time" and can visualize the result of changing parameters on up to 1,000 levels of computed volumes and reconstructed logs. It is expected that such "instant" inversion will remove the main disadvantages frequently cited for simultaneous analysis methods, namely difficulty in assessing sensitivity to different parameters and slow output response. Although the prototype system uses highly specific features of a parallel processor, a subsequent version has been implemented on a conventional (serial) workstation with less performance but adequate functionality to preserve the apparently instant response. PLATO demonstrates the feasibility of petroleum computing applications combining an intuitive graphical interface, high-performance computing of physical models, and real-time output graphics.

  6. Toward an automated parallel computing environment for geosciences

    NASA Astrophysics Data System (ADS)

    Zhang, Huai; Liu, Mian; Shi, Yaolin; Yuen, David A.; Yan, Zhenzhen; Liang, Guoping

    2007-08-01

    Software for geodynamic modeling has not kept up with the fast growing computing hardware and network resources. In the past decade supercomputing power has become available to most researchers in the form of affordable Beowulf clusters and other parallel computer platforms. However, to take full advantage of such computing power requires developing parallel algorithms and associated software, a task that is often too daunting for geoscience modelers whose main expertise is in geosciences. We introduce here an automated parallel computing environment built on open-source algorithms and libraries. Users interact with this computing environment by specifying the partial differential equations, solvers, and model-specific properties using an English-like modeling language in the input files. The system then automatically generates the finite element codes that can be run on distributed or shared memory parallel machines. This system is dynamic and flexible, allowing users to address different problems in geosciences. It is capable of providing web-based services, enabling users to generate source codes online. This unique feature will facilitate high-performance computing to be integrated with distributed data grids in the emerging cyber-infrastructures for geosciences. In this paper we discuss the principles of this automated modeling environment and provide examples to demonstrate its versatility.

  7. Towards Distributed Memory Parallel Program Analysis

    SciTech Connect

    Quinlan, D; Barany, G; Panas, T

    2008-06-17

    This paper presents a parallel attribute evaluation for distributed memory parallel computer architectures where previously only shared memory parallel support for this technique has been developed. Attribute evaluation is a part of how attribute grammars are used for program analysis within modern compilers. Within this work, we have extended ROSE, an open compiler infrastructure, with a distributed memory parallel attribute evaluation mechanism to support user-defined global program analysis required for some forms of security analysis which cannot be addressed by a file-by-file view of large scale applications. As a result, user-defined security analyses may now run in parallel without the user having to specify the way data is communicated between processors. The automation of communication enables an extensible open-source parallel program analysis infrastructure.

  8. Parallel reactor systems for bioprocess development.

    PubMed

    Weuster-Botz, Dirk

    2005-01-01

    Controlled parallel bioreactor systems allow fed-batch operation at early stages of process development. The characteristics of shaken bioreactors operated in parallel (shake flask, microtiter plate), sparged bioreactors (small-scale bubble column) and stirred bioreactors (stirred-tank, stirred column) are briefly summarized. Parallel fed-batch operation is achieved with an intermittent feeding and pH-control system for up to 16 bioreactors operated in parallel on a scale of 100 ml. Examples of the scale-up and scale-down of pH-controlled microbial fed-batch processes demonstrate that controlled parallel reactor systems can result in more effective bioprocess development. Future developments are also outlined, including units of 48 parallel stirred-tank reactors with individual pH- and pO2-controls and automation as well as liquid handling system, operated on a scale of ml.

  9. Linearly exact parallel closures for slab geometry

    NASA Astrophysics Data System (ADS)

    Ji, Jeong-Young; Held, Eric D.; Jhang, Hogun

    2013-08-01

    Parallel closures are obtained by solving a linearized kinetic equation with a model collision operator using the Fourier transform method. The closures expressed in wave number space are exact for time-dependent linear problems to within the limits of the model collision operator. In the adiabatic, collisionless limit, an inverse Fourier transform is performed to obtain integral (nonlocal) parallel closures in real space; parallel heat flow and viscosity closures for density, temperature, and flow velocity equations replace Braginskii's parallel closure relations, and parallel flow velocity and heat flow closures for density and temperature equations replace Spitzer's parallel transport relations. It is verified that the closures reproduce the exact linear response function of Hammett and Perkins [Phys. Rev. Lett. 64, 3019 (1990)] for Landau damping given a temperature gradient. In contrast to their approximate closures where the vanishing viscosity coefficient numerically gives an exact response, our closures relate the heat flow and nonvanishing viscosity to temperature and flow velocity (gradients).

  10. Design considerations for parallel graphics libraries

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1994-01-01

    Applications which run on parallel supercomputers are often characterized by massive datasets. Converting these vast collections of numbers to visual form has proven to be a powerful aid to comprehension. For a variety of reasons, it may be desirable to provide this visual feedback at runtime. One way to accomplish this is to exploit the available parallelism to perform graphics operations in place. In order to do this, we need appropriate parallel rendering algorithms and library interfaces. This paper provides a tutorial introduction to some of the issues which arise in designing parallel graphics libraries and their underlying rendering algorithms. The focus is on polygon rendering for distributed memory message-passing systems. We illustrate our discussion with examples from PGL, a parallel graphics library which has been developed on the Intel family of parallel systems.

  11. Automatic Multilevel Parallelization Using OpenMP

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

    2002-01-01

    In this paper we describe the extension of the CAPO parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report first results for several benchmark codes and one full application that have been parallelized using our system.

  12. Inverse Kinematics for a Parallel Myoelectric Elbow

    DTIC Science & Technology

    2001-10-25

    Inverse Kinematics for a Parallel Myoelectric Elbow A. Z. Escudero, Ja. Álvarez, L. Leija. Center of Research and Advanced Studies of the IPN...replacement above elbow are serial mechanisms driven by a DC motor and they include only one active articulation for the elbow [1]. Parallel mechanisms...are rather scarce [2]. The inverse kinematics model of a 3-degree of freedom parallel prosthetic elbow mechanism is reported. The mathematical

  13. Multipactor saturation in parallel-plate waveguides

    SciTech Connect

    Sorolla, E.; Mattes, M.

    2012-07-15

    The saturation stage of a multipactor discharge is of interest, since it can guide towards a criterion to assess the multipactor onset. The electron cloud under multipactor regime within a parallel-plate waveguide is modeled by a thin continuous distribution of charge and the equations of motion are calculated taking into account the space charge effects. The saturation is identified by the interaction of the electron cloud with its image charge. The stability of the electron population growth is analyzed and two mechanisms of saturation to explain the steady-state multipactor for voltages just above the onset threshold are identified. The impact energy in the collision against the metal plates decreases during the electron population growth due to the attraction of the electron sheet on the image through the initial plate. When this growth remains stable until the impact energy reaches the first cross-over point, the electron surface density tends to a constant value. When the stability is broken before reaching the first cross-over point, the surface charge density oscillates chaotically, bounded within a certain range. In this case, an expression to calculate the maximum electron surface charge density is found whose predictions agree with the simulations when the voltage is not too high.

  14. Use Computer-Aided Tools to Parallelize Large CFD Applications

    NASA Technical Reports Server (NTRS)

    Jin, H.; Frumkin, M.; Yan, J.

    2000-01-01

    Greenwich, to reduce potential errors made by users. Earlier tests on NAS Benchmarks and ARC3D have demonstrated good success of this tool. In this study, we have applied CAPO to parallelize three large applications in the area of computational fluid dynamics (CFD): OVERFLOW, TLNS3D and INS3D. These codes are widely used for solving Navier-Stokes equations with complicated boundary conditions and turbulence model in multiple zones. Each comprises from 50K to 100K lines of FORTRAN77. As an example, CAPO took 77 hours to complete the data dependence analysis of OVERFLOW on a workstation (SGI, 175MHz, R10K processor). A fair amount of effort was spent on correcting false dependencies due to lack of necessary knowledge during the analysis. Even so, CAPO provides an easy way for the user to interact with the parallelization process. The OpenMP version was generated within a day after the analysis was completed. Due to the sequential algorithms involved, code sections in TLNS3D and INS3D need to be restructured by hand to produce more efficient parallel codes. An included figure shows preliminary test results of the generated OVERFLOW with several test cases in single zone. The MPI data points for the small test case were taken from a handcoded MPI version. As we can see, CAPO's version has achieved an 18-fold speedup on 32 nodes of the SGI O2K. For the small test case, it outperformed the MPI version. These results are very encouraging, but further work is needed. For example, although CAPO attempts to place directives on the outermost parallel loops in an interprocedural framework, it does not insert directives based on the best manual strategy. In particular, it lacks support for parallelization at the multi-zone level. Future work will emphasize the development of methodology to work at the multi-zone level and with a hybrid approach. Development of tools to perform more complicated code transformation is also needed.
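
    The reported 18-fold speedup on 32 nodes can be turned into two standard summary numbers. The sketch below is an editorial illustration, not part of the CAPO study: it computes parallel efficiency and the Karp-Flatt experimentally determined serial fraction, a common metric that the abstract itself does not mention. The function names are hypothetical.

```python
def parallel_efficiency(speedup, n_procs):
    """Fraction of ideal (linear) speedup actually achieved."""
    return speedup / n_procs


def karp_flatt_serial_fraction(speedup, n_procs):
    """Karp-Flatt metric: apparent serial fraction implied by a measured speedup."""
    return (1.0 / speedup - 1.0 / n_procs) / (1.0 - 1.0 / n_procs)


# Values quoted in the abstract: 18-fold speedup on 32 nodes.
efficiency = parallel_efficiency(18.0, 32)
serial_fraction = karp_flatt_serial_fraction(18.0, 32)
print(f"efficiency = {efficiency:.4f}, serial fraction = {serial_fraction:.4f}")
```

    Under these definitions the run uses a little over half of the ideal speedup, and the implied serial fraction is roughly 2.5%, which is consistent with the abstract's remark that further work is needed.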

  15. Parallel auto-correlative statistics with VTK.

    SciTech Connect

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by the means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the autocorrelative statistics engine.

  16. Parallel programming in Split-C

    SciTech Connect

    Culler, D.E.; Dusseau, A.; Goldstein, S.C.; Krishnamurthy, A.; Lumetta, S.; Eicken, T. von; Yelick, K.

    1993-12-31

    The authors introduce the Split-C language, a parallel extension of C intended for high performance programming on distributed memory multiprocessors, and demonstrate the use of the language in optimizing parallel programs. Split-C provides a global address space with a clear concept of locality and unusual assignment operators. These are used as tools to reduce the frequency and cost of remote access. The language allows a mixture of shared memory, message passing, and data parallel programming styles while providing efficient access to the underlying machine. They demonstrate the basic language concepts using regular and irregular parallel programs and give performance results for various stages of program optimization.

  17. Shared-memory parallel programming in C++

    SciTech Connect

    Beck, B.

    1990-07-01

    This paper discusses how researchers have produced a set of portable parallel-programming constructs for C, implemented in M4 macros. These parallel-programming macros are available under the name Parmacs. The Parmacs macros let one write parallel C programs for shared-memory, distributed-memory, and mixed-memory (shared and distributed) systems. They have been implemented on several machines. Because Parmacs offers useful parallel-programming features, the author has considered how these problems might be overcome or avoided. The author thought that using C++, rather than C, would address these problems adequately, and describes the C++ features exploited. The work described addresses shared-memory constructs.

  18. Parallel Algorithms for the Exascale Era

    SciTech Connect

    Robey, Robert W.

    2016-10-19

    New parallel algorithms are needed to reach the Exascale level of parallelism with millions of cores. We look at some of the research developed by students in projects at LANL. The research blends ideas from the early days of computing while weaving in the fresh approach brought by students new to the field of high performance computing. We look at reproducibility of global sums and why it is important to parallel computing. Next we look at how the concept of hashing has led to the development of more scalable algorithms suitable for next-generation parallel computers. Nearly all of this work has been done by undergraduates and published in leading scientific journals.
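
    Why reproducibility of global sums matters can be seen in a few lines. The sketch below is an editorial illustration and not the LANL students' method: it shows that floating-point addition is not associative, so a serial sum and a chunked "parallel-style" reduction over the same values can disagree, while Python's `math.fsum` (used here as one example of an order-independent, correctly rounded sum) returns the same result for any ordering. The helper names `naive_sum` and `chunked_sum` are hypothetical.

```python
import math


def naive_sum(values):
    """Left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Mimic a parallel reduction: each 'rank' sums its chunk, then partials are combined."""
    size = math.ceil(len(values) / n_chunks)
    partials = [naive_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return naive_sum(partials)


# Large values cancel exactly; the small ones are lost or kept depending on order.
values = [1.0e16, 1.0, -1.0e16, 1.0]

print(naive_sum(values))             # serial order
print(chunked_sum(values, 2))        # two "ranks"
print(math.fsum(values))             # correctly rounded, order-independent
print(math.fsum(reversed(values)))   # same result for a different order
```

    The exact sum is 2, but the serial and two-chunk reductions each lose a different part of it. On a parallel machine the grouping typically changes with the number of processors, which is why the same code can report different global sums from run to run.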

  19. A parallel algorithm for global routing

    NASA Technical Reports Server (NTRS)

    Brouwer, Randall J.; Banerjee, Prithviraj

    1990-01-01

    A Parallel Hierarchical algorithm for Global Routing (PHIGURE) is presented. The router is based on the work of Burstein and Pelavin, but has many extensions for general global routing and parallel execution. Main features of the algorithm include structured hierarchical decomposition into separate independent tasks which are suitable for parallel execution and adaptive simplex solution for adding feedthroughs and adjusting channel heights for row-based layout. Alternative decomposition methods and the various levels of parallelism available in the algorithm are examined closely. The algorithm is described and results are presented for a shared-memory multiprocessor implementation.

  20. Conformal pure radiation with parallel rays

    NASA Astrophysics Data System (ADS)

    Leistner, Thomas; Nurowski, Paweł

    2012-03-01

    We define pure radiation metrics with parallel rays to be n-dimensional pseudo-Riemannian metrics that admit a parallel null line bundle K and whose Ricci tensor vanishes on vectors that are orthogonal to K. We give necessary conditions in terms of the Weyl, Cotton and Bach tensors for a pseudo-Riemannian metric to be conformal to a pure radiation metric with parallel rays. Then, we derive conditions in terms of the tractor calculus that are equivalent to the existence of a pure radiation metric with parallel rays in a conformal class. We also give analogous results for n-dimensional pseudo-Riemannian pp-waves.

  1. Parallel Genetic Algorithm for Alpha Spectra Fitting

    NASA Astrophysics Data System (ADS)

    García-Orellana, Carlos J.; Rubio-Montero, Pilar; González-Velasco, Horacio

    2005-01-01

    We present a performance study of alpha-particle spectra fitting using a parallel Genetic Algorithm (GA). The method uses a two-step approach. In the first step we run the parallel GA to find an initial solution for the second step, in which we use the Levenberg-Marquardt (LM) method for a precise final fit. GA is a resource-intensive method, so we use a Beowulf cluster for parallel simulation. The relationship between simulation time (and parallel efficiency) and the number of processors is studied using several alpha spectra, with the aim of obtaining a method to estimate the optimal number of processors to use in a simulation.

  2. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1995-01-01

    The scope of this project dealt with the investigation of the requirements to support distributed computing of scientific computations over a cluster of cooperative workstations. Various experiments on computations for the solution of simultaneous linear equations were performed in the early phase of the project to gain experience in the general nature and requirements of scientific applications. A specification of a distributed integrated computing environment, DICE, based on a distributed shared memory communication paradigm has been developed and evaluated. The distributed shared memory model facilitates porting existing parallel algorithms that have been designed for shared memory multiprocessor systems to the new environment. The potential of this new environment is to provide supercomputing capability through the utilization of the aggregate power of workstations cooperating in a cluster interconnected via a local area network. Workstations, generally, do not have the computing power to tackle complex scientific applications, making them primarily useful for visualization, data reduction, and filtering as far as complex scientific applications are concerned. There is a tremendous amount of computing power that is left unused in a network of workstations. Very often a workstation is simply sitting idle on a desk. A set of tools can be developed to take advantage of this potential computing power to create a platform suitable for large scientific computations. The integration of several workstations into a logical cluster of distributed, cooperative, computing stations presents an alternative to shared memory multiprocessor systems. In this project we designed and evaluated such a system.

  3. Parallel Computing for Brain Simulation.

    PubMed

    Pastur-Romay, L A; Porto-Pazos, A B; Cedrón, F; Pazos, A

    2016-11-04

    The human brain is the most complex system in the known universe, yet it is also the least understood. It gives human beings extraordinary capacities, but we do not yet understand how and why most of these capacities arise. For decades, researchers have tried to make computers reproduce these capacities, on the one hand to help understand the nervous system and on the other to process data more efficiently than before. The aim is to make computers process information as the brain does. Important technological developments and large multidisciplinary projects have made it possible to create the first simulation with a number of neurons similar to that of the human brain. This paper presents an updated review of the main research projects that are trying to simulate and/or emulate the human brain. They employ different types of computational models using parallel computing: digital models, analog models and hybrid models. This review includes the current applications of these works and also future trends. We have reviewed some works that seek a step forward in neuroscience and others that seek a breakthrough in computer science (neuromorphic hardware, machine learning techniques). We summarize their most outstanding characteristics and present the latest advances and future plans. In addition, this review stresses the importance of considering not only neurons: computational models of the brain should include glial cells, given the proven importance of astrocytes in information processing.

  4. Dynamic force spectroscopy of parallel individual mucin1-antibody bonds

    SciTech Connect

    Sulchek, T A; Friddle, R W; Langry, K; Lau, E; Albrecht, H; Ratto, T; DeNardo, S; Colvin, M E; Noy, A

    2005-05-02

    We used atomic force microscopy (AFM) to measure the binding forces between Mucin1 (MUC1) peptide and a single chain antibody fragment (scFv) selected from a scFv library screened against MUC1. This binding interaction is central to the design of the molecules for targeted delivery of radioimmunotherapeutic agents for prostate and breast cancer treatment. Our experiments separated the specific binding interaction from non-specific interactions by tethering the antibody and MUC1 molecules to the AFM tip and sample surface with flexible polymer spacers. Rupture force magnitude and elastic characteristics of the spacers allowed identification of the bond rupture events corresponding to different numbers of interacting proteins. We used dynamic force spectroscopy to estimate the intermolecular potential widths and equivalent thermodynamic off rates for mono-, bi-, and tri-valent interactions. Measured interaction potential parameters agree with the results of molecular docking simulation. Our results demonstrate that an increase of the interaction valency leads to a precipitous decline in the dissociation rate. Binding forces measured for mono- and multivalent interactions match the predictions of a Markovian model for the strength of multiple uncorrelated bonds in parallel configuration. Our approach is promising for comparison of the specific effects of molecular modifications as well as for determination of the best configuration of antibody-based multivalent targeting agents.

  5. Parallel methods for the flight simulation model

    SciTech Connect

    Xiong, Wei Zhong; Swietlik, C.

    1994-06-01

    The Advanced Computer Applications Center (ACAC) has been involved in evaluating advanced parallel architecture computers and the applicability of these machines to computer simulation models. The advanced systems investigated include parallel machines with shared memory and distributed architectures consisting of an eight processor Alliant FX/8, a twenty-four processor Sequent Symmetry, Cray XMP, IBM RISC 6000 model 550, and the Intel Touchstone eight processor Gamma and 512 processor Delta machines. Since parallelizing a truly efficient application program for the parallel machine is a difficult task, the implementation for these machines in a realistic setting has been largely overlooked. The ACAC has developed considerable expertise in optimizing and parallelizing application models on a collection of advanced multiprocessor systems. One such application model is the Flight Simulation Model, which uses a set of differential equations to describe the flight characteristics of a launched missile by means of a trajectory. The Flight Simulation Model was written in the FORTRAN language with approximately 29,000 lines of source code. Depending on the number of trajectories, the computation can require several hours to a full day of CPU time on a DEC/VAX 8650 system. There is an impetus to reduce the execution time and utilize the advanced parallel architecture computing environment available. ACAC researchers developed a parallel method that allows the Flight Simulation Model to run in parallel on the multiprocessor system. For the benchmark data tested, the parallel Flight Simulation Model implemented on the Alliant FX/8 has achieved nearly linear speedup. In this paper, we describe a parallel method for the Flight Simulation Model. We believe the method presented in this paper provides a general concept for the design of parallel applications. This concept, in most cases, can be adapted to many other sequential application programs.

  6. Recent Improvements to HST Parallel Scheduling

    NASA Astrophysics Data System (ADS)

    Henry, Ronald; Butschky, Mike

    The Hubble Space Telescope (HST) has several scientific instruments (SIs) that may be used at any given time. Most primary visits submitted by HST observers only use one SI, leaving the other SIs free to be requested by "pure parallel" observing programs. In order to accomplish this, separate scheduling units (SUs) for each parallel SI must be created and then scheduled by the Science Planning and Scheduling System (SPSS), taking into account numerous orbital and scientific constraints. The Parallel Observation Matching System (POMS) has the task of matching parallel visits to primary observations and "crafting" appropriate parallel SUs at each opportunity, taking scientific criteria and orbital constraints into account. The process for planning and scheduling parallel observations is thus quite different from the process for primary science. In the past, custom crafting rules for each parallel program were necessary, requiring full-time support from a software developer. In addition, because POMS ran as a standalone system, its ability to model how long parallel SUs would take was limited, especially with the flexible buffer-management schemes used for the second-generation SIs. A new version of POMS was developed in 1997. This version uses a formal proposal syntax (the same used for primary observations) for parallels, so that different proposals can be handled uniformly and without the need for customized "crafting rules." In addition, POMS is integrated with the Transformation (TRANS) planning system in order to give it full knowledge of overheads within an SU, eliminating the need for ad hoc modeling. The power and versatility of this approach has paid off in improved utilization of parallel opportunities, greatly reduced maintenance costs, and an ability to gracefully handle new parallel proposals and new SIs with minimal software effort. This paper discusses the requirements, design, and operational results of the new POMS.

  7. High-Performance Psychometrics: The Parallel-E Parallel-M Algorithm for Generalized Latent Variable Models. Research Report. ETS RR-16-34

    ERIC Educational Resources Information Center

    von Davier, Matthias

    2016-01-01

    This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…

  8. On the dimensionally correct kinetic theory of turbulence for parallel propagation

    SciTech Connect

    Gaelzer, R.; Ziebell, L. F.; Yoon, P. H.; Kim, Sunjung

    2015-03-15

    Yoon and Fang [Phys. Plasmas 15, 122312 (2008)] formulated a second-order nonlinear kinetic theory that describes the turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field. Their theory also includes discrete-particle effects, or the effects due to spontaneously emitted thermal fluctuations. However, terms associated with the spontaneous fluctuations in particle and wave kinetic equations in their theory contain proper dimensionality only for an artificial one-dimensional situation. The present paper extends the analysis and re-derives the dimensionally correct kinetic equations for the three-dimensional case. The new formalism properly describes the effects of spontaneous fluctuations emitted in three-dimensional space, while the collectively emitted turbulence propagates predominantly in directions parallel/anti-parallel to the ambient magnetic field. As a first step, the present investigation focuses on linear wave-particle interaction terms only. A subsequent paper will include the dimensionally correct nonlinear wave-particle interaction terms.

  9. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  10. Parallelized Vlasov-Fokker-Planck solver for desktop personal computers

    NASA Astrophysics Data System (ADS)

    Schönfeldt, Patrik; Brosi, Miriam; Schwarz, Markus; Steinmann, Johannes L.; Müller, Anke-Susanne

    2017-03-01

    The numerical solution of the Vlasov-Fokker-Planck equation is a well established method to simulate the dynamics, including the self-interaction with its own wake field, of an electron bunch in a storage ring. In this paper we present Inovesa, a modularly extensible program that uses OpenCL to massively parallelize the computation. It allows a standard desktop PC to work with appropriate accuracy and yield reliable results within minutes. We provide numerical stability studies over a wide parameter range and compare our numerical findings to known results. Simulation results for the case of coherent synchrotron radiation will be compared to measurements that probe the effects of the microbunching instability occurring in the short bunch operation at ANKA. It will be shown that the impedance model based on the shielding effect of two parallel plates can not only describe the instability threshold, but also the presence of multiple regimes that show differences in the emission of coherent synchrotron radiation.

  11. Parallel discrete event simulation: A shared memory approach

    NASA Technical Reports Server (NTRS)

    Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.

    1987-01-01

    With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.

  12. Parallelization of MRCI based on hole-particle symmetry.

    PubMed

    Suo, Bing; Zhai, Gaohong; Wang, Yubin; Wen, Zhenyi; Hu, Xiangqian; Li, Lemin

    2005-01-15

    The parallel implementation of a multireference configuration interaction program based on the hole-particle symmetry is described. The platform used for the parallelization is an Intel-architecture cluster of 12 nodes, each equipped with two 2.4-G XEON processors, 3-GB memory, and a 36-GB disk; the nodes are connected by a Gigabit Ethernet switch. The dependence of speedup on molecular symmetries and task granularities is discussed. Test calculations show that the scaling with the number of nodes is about 1.9 (for C1 and Cs), 1.65 (for C2v), and 1.55 (for D2h) when the number of nodes is doubled. The largest calculation performed on this cluster involves 5.6 × 10^8 CSFs.
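
    The per-doubling scaling factors quoted above can be read as a rough scaling law. The sketch below is an editorial illustration only: it assumes, beyond what the abstract states, that the same factor applies to every doubling of the node count, so that the speedup on n nodes is approximately n raised to the power log2(factor). The function name is hypothetical and the projections are not measured results.

```python
import math


def projected_speedup(doubling_factor, n_nodes):
    """Speedup on n_nodes if every doubling of nodes multiplies speed by doubling_factor.

    Assumption for illustration: the factor is constant across all doublings.
    """
    return n_nodes ** math.log2(doubling_factor)


# Doubling factors quoted in the abstract, keyed by point group.
factors = {"C1/Cs": 1.9, "C2v": 1.65, "D2h": 1.55}

for group, factor in factors.items():
    per_doubling_efficiency = factor / 2.0
    print(f"{group}: efficiency per doubling = {per_doubling_efficiency:.3f}, "
          f"projected speedup on 8 nodes = {projected_speedup(factor, 8):.2f}")
```

    Because the factor compounds, a small loss per doubling grows noticeably over three doublings, which illustrates why the higher-symmetry cases with smaller factors fall off faster as nodes are added.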

  13. Reducing neural network training time with parallel processing

    NASA Technical Reports Server (NTRS)

    Rogers, James L., Jr.; Lamarsh, William J., II

    1995-01-01

    Obtaining optimal solutions for engineering design problems is often expensive because the process typically requires numerous iterations involving analysis and optimization programs. Previous research has shown that a near optimum solution can be obtained in less time by simulating a slow, expensive analysis with a fast, inexpensive neural network. A new approach has been developed to further reduce this time. This approach decomposes a large neural network into many smaller neural networks that can be trained in parallel. Guidelines are developed to avoid some of the pitfalls when training smaller neural networks in parallel. These guidelines allow the engineer: to determine the number of nodes on the hidden layer of the smaller neural networks; to choose the initial training weights; and to select a network configuration that will capture the interactions among the smaller neural networks. This paper presents results describing how these guidelines are developed.

  14. An Approach to Performance Prediction for Parallel Applications

    SciTech Connect

    Ipek, E; de Supinski, B R; Schulz, M; McKee, S A

    2005-05-17

    Accurately modeling and predicting performance for large-scale applications becomes increasingly difficult as system complexity scales dramatically. Analytic predictive models are useful, but are difficult to construct, usually limited in scope, and often fail to capture subtle interactions between architecture and software. In contrast, we employ multilayer neural networks trained on input data from executions on the target platform. This approach is useful for predicting many aspects of performance, and it captures full system complexity. Our models are developed automatically from the training input set, avoiding the difficult and potentially error-prone process required to develop analytic models. This study focuses on the high-performance, parallel application SMG2000, a much studied code whose variations in execution times are still not well understood. Our model predicts performance on two large-scale parallel platforms within 5%-7% error across a large, multi-dimensional parameter space.

  15. Parallel line analysis: multifunctional software for the biomedical sciences

    NASA Technical Reports Server (NTRS)

    Swank, P. R.; Lewis, M. L.; Damron, K. L.; Morrison, D. R.

    1990-01-01

    An easy to use, interactive FORTRAN program for analyzing the results of parallel line assays is described. The program is menu driven and consists of five major components: data entry, data editing, manual analysis, manual plotting, and automatic analysis and plotting. Data can be entered from the terminal or from previously created data files. The data editing portion of the program is used to inspect and modify data and to statistically identify outliers. The manual analysis component is used to test the assumptions necessary for parallel line assays using analysis of covariance techniques and to determine potency ratios with confidence limits. The manual plotting component provides a graphic display of the data on the terminal screen or on a standard line printer. The automatic portion runs through multiple analyses without operator input. Data may be saved in a special file to expedite input at a future time.

  16. A Survey of Parallel Computing

    DTIC Science & Technology

    1988-07-01

    CENTERS 153 Neumann Center (JVNC) near Princeton, New Jersey. Each center is equipped with state-of-the-art supercomputing equipment and a staff to...offers state-of-the-art, networked workstations for interactive work on the Cray X-MP and for other research purposes such as analyzing results...corporations. Designated employees from participating corporations receive training tailored to their needs, access to state-of-the-art workstations and

  17. GOTPM: a parallel hybrid particle-mesh treecode

    NASA Astrophysics Data System (ADS)

    Dubinski, John; Kim, Juhan; Park, Changbom; Humble, Robin

    2004-02-01

    We describe a parallel, cosmological N-body code based on a hybrid scheme using the particle-mesh (PM) and Barnes-Hut (BH) oct-tree algorithm. We call the algorithm GOTPM for Grid-of-Oct-Trees-Particle-Mesh. The code is parallelized using the Message Passing Interface (MPI) library and is optimized to run on Beowulf clusters as well as symmetric multi-processors. The gravitational potential is determined on a mesh using a standard PM method with particle forces determined through interpolation. The softened PM force is corrected for short range interactions using a grid of localized BH trees throughout the entire simulation volume in a completely analogous way to P3M methods. This method makes no assumptions about the local density for short range force corrections and so is consistent with the results of the P3M method in the limit that the treecode opening angle parameter, θ→0. The PM method is parallelized using one-dimensional slice domain decomposition. Particles are distributed in slices of equal width to allow mass assignment onto mesh points. The Fourier transforms in the PM method are done in parallel using the MPI implementation of the FFTW package. Parallelization for the tree force corrections is achieved again using one-dimensional slices but the width of each slice is allowed to vary according to the amount of computational work required by the particles within each slice to achieve load balance. The tree force corrections dominate the computational load and so imbalances in the PM density assignment step do not impact the overall load balance and performance significantly. The code performance scales well to 128 processors and is significantly better than competing methods. We present preliminary results from simulations run on different platforms containing up to N=1 G particles to verify the code.
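
    The variable-width slice idea described above can be illustrated with a short sketch. This is not the GOTPM implementation; it only assumes that each particle has an x position and an estimated work weight, and the function name and cut convention are hypothetical. It picks cut positions along x so that each slice carries roughly the same total work.

```python
def balanced_slice_cuts(xs, work, n_slices):
    """Return n_slices - 1 cut positions along x.

    Particles with x <= cuts[k] fall in slices 0..k. Cuts are placed where
    the running sum of work first reaches k+1 equal shares of the total.
    (Hypothetical helper for illustration; at most one cut per particle.)
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    target = sum(work) / n_slices
    cuts = []
    running = 0.0
    for i in order:
        running += work[i]
        if len(cuts) < n_slices - 1 and running >= target * (len(cuts) + 1):
            cuts.append(xs[i])
    return cuts


# Illustrative data: eight particles with equal work, split into four slices.
positions = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
weights = [1, 1, 1, 1, 1, 1, 1, 1]
cuts = balanced_slice_cuts(positions, weights, 4)
```

    With unequal weights the cuts move toward the expensive particles, so a slice over a dense region becomes narrower than one over a sparse region.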

  18. Prototyping Parallel and Distributed Programs in Proteus

    DTIC Science & Technology

    1990-10-01

    Cole90, Gibb89]. " Highly-parallel processors - Applications for highly-parallel machines such as the CM-2 or the iPSC are programmed using data...Programming, (Prentice-Hall, Englewood Cliffs, NJ) 1990. [Gibb89] Gibbons, P.B., "A more practical PRAM model", in: Proceedings of the First ACM

  19. Parallel computation with the spectral element method

    SciTech Connect

    Ma, Hong

    1995-12-01

    Spectral element models for the shallow water equations and the Navier-Stokes equations have been successfully implemented on a data parallel supercomputer, the Connection Machine model CM-5. The nonstaggered grid formulations for both models are described and are shown to be especially efficient in a data parallel computing environment.

  20. Predicting Protein Structure Using Parallel Genetic Algorithms.

    DTIC Science & Technology

    1994-12-01

    Predicting Protein Structure Using Parallel Genetic Algorithms...THESIS George H...Approved for public release; distribution unlimited AFIT/GCS/ENG/94D-03 Predicting Protein Structure Using Parallel Genetic Algorithms...1-1 1.2 Genetic Algorithms...1-3 1.3 The Protein Folding Problem

  1. Parallel Activation in Bilingual Phonological Processing

    ERIC Educational Resources Information Center

    Lee, Su-Yeon

    2011-01-01

    In bilingual language processing, the parallel activation hypothesis suggests that bilinguals activate their two languages simultaneously during language processing. Support for the parallel activation mainly comes from studies of lexical (word-form) processing, with relatively less attention to phonological (sound) processing. According to…

  2. MULTIOBJECTIVE PARALLEL GENETIC ALGORITHM FOR WASTE MINIMIZATION

    EPA Science Inventory

    In this research we have developed an efficient multiobjective parallel genetic algorithm (MOPGA) for waste minimization problems. This MOPGA integrates PGAPack (Levine, 1996) and NSGA-II (Deb, 2000) with novel modifications. PGAPack is a master-slave parallel implementation of a...

  3. Parallel Narrative Structure in Paul Harding's "Tinkers"

    ERIC Educational Resources Information Center

    Çirakli, Mustafa Zeki

    2014-01-01

    The present paper explores the implications of parallel narrative structure in Paul Harding's "Tinkers" (2009). Besides primarily recounting the two sets of parallel narratives, "Tinkers" also comprises seemingly unrelated fragments such as excerpts from clock repair manuals and diaries. The main stories, however, told…

  4. Parallel Computing Strategies for Irregular Algorithms

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.

  5. Parallel unstructured grid generation for computational aerosciences

    NASA Technical Reports Server (NTRS)

    Shephard, Mark S.

    1993-01-01

    The objective of this research project is to develop efficient parallel automatic grid generation procedures for use in computational aerosciences. This effort is focused on a parallel version of the Finite Octree grid generator. Progress made during the first six months is reported.

  6. Differences Between Distributed and Parallel Systems

    SciTech Connect

    Brightwell, R.; Maccabe, A.B.; Riesen, R.

    1998-10-01

    Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.

  7. Parallel-In-Time For Moving Meshes

    SciTech Connect

    Falgout, R. D.; Manteuffel, T. A.; Southworth, B.; Schroder, J. B.

    2016-02-04

    With steadily growing computational resources available, scientists must develop effective ways to utilize the increased resources. High performance, highly parallel software has become a standard. However until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial differential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing sequential codes with only minor modifications. In this work, a rezoning-type moving mesh is applied to a diffusion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.

  8. Configuration space representation in parallel coordinates

    NASA Technical Reports Server (NTRS)

    Fiorini, Paolo; Inselberg, Alfred

    1989-01-01

    By means of a system of parallel coordinates, a nonprojective mapping from R exp N to R squared is obtained for any positive integer N. In this way multivariate data and relations can be represented in the Euclidean plane (embedded in the projective plane). Basically, R squared with Cartesian coordinates is augmented by N parallel axes, one for each variable. The N joint variables of a robotic device can be represented graphically by using parallel coordinates. It is pointed out that some properties of the relation are better perceived visually from the parallel coordinate representation, and that new algorithms and data structures can be obtained from this representation. The main features of parallel coordinates are described, and an example is presented of their use for configuration space representation of a mechanical arm (where Cartesian coordinates cannot be used).
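
    As a rough illustration of the mapping described above, the sketch below turns a point in N dimensions into the vertices of a polyline in the plane, one vertex per parallel axis. The function name, the unit axis spacing, and the sample joint values are assumptions made for this example and are not taken from the report.

```python
def to_polyline(point, axis_spacing=1.0):
    """Map an N-dimensional point to polyline vertices in the plane.

    Axis i is the vertical line x = i * axis_spacing; coordinate i of the
    point becomes the height of the vertex on that axis.
    """
    return [(i * axis_spacing, value) for i, value in enumerate(point)]


# Hypothetical joint configuration of a four-joint arm (illustrative values).
joints = [0.5, -1.2, 2.0, 0.0]
polyline = to_polyline(joints)
```

    Drawing line segments between consecutive vertices gives the usual parallel-coordinates picture, in which a set of configurations appears as a bundle of polylines.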

  9. Implementation and performance of parallel Prolog interpreter

    SciTech Connect

    Wei, S.; Kale, L.V.; Balkrishna, R. (Dept. of Computer Science)

    1988-01-01

    In this paper, the authors discuss the implementation of a parallel Prolog interpreter on different parallel machines. The implementation is based on the REDUCE-OR process model, which exploits both AND and OR parallelism in logic programs. It is machine independent, as it runs on top of the chare-kernel, a machine-independent parallel programming system. The authors also give the performance of the interpreter running a diverse set of benchmark programs on parallel machines, including shared memory systems (an Alliant FX/8, a Sequent, and a MultiMax) and a non-shared memory system (an Intel iPSC/32 hypercube), in addition to its performance on a multiprocessor simulation system.

  10. Parallel Algebraic Multigrid Methods - High Performance Preconditioners

    SciTech Connect

    Yang, U M

    2004-11-11

    The development of high performance, massively parallel computers and the increasing demands of computationally challenging applications have necessitated the development of scalable solvers and preconditioners. One of the most effective ways to achieve scalability is the use of multigrid or multilevel techniques. Algebraic multigrid (AMG) is a very efficient algorithm for solving large problems on unstructured grids. While much of it can be parallelized in a straightforward way, some components of the classical algorithm, particularly the coarsening process and some of the most efficient smoothers, are highly sequential, and require new parallel approaches. This chapter presents the basic principles of AMG and gives an overview of various parallel implementations of AMG, including descriptions of parallel coarsening schemes and smoothers, some numerical results as well as references to existing software packages.

  11. A parallel variable metric optimization algorithm

    NASA Technical Reports Server (NTRS)

    Straeter, T. A.

    1973-01-01

    An algorithm designed to exploit the parallel computing or vector streaming (pipeline) capabilities of computers is presented. When p is the degree of parallelism, then one cycle of the parallel variable metric algorithm is defined as follows: first, the function and its gradient are computed in parallel at p different values of the independent variable; then the metric is modified by p rank-one corrections; and finally, a single univariant minimization is carried out in the Newton-like direction. Several properties of this algorithm are established. The convergence of the iterates to the solution is proved for a quadratic functional on a real separable Hilbert space. For a finite-dimensional space the convergence is in one cycle when p equals the dimension of the space. Results of numerical experiments indicate that the new algorithm will exploit parallel or pipeline computing capabilities to effect faster convergence than serial techniques.

  12. National Combustion Code: Parallel Implementation and Performance

    NASA Technical Reports Server (NTRS)

    Quealy, A.; Ryder, R.; Norris, A.; Liu, N.-S.

    2000-01-01

    The National Combustion Code (NCC) is being developed by an industry-government team for the design and analysis of combustion systems. CORSAIR-CCD is the current baseline reacting flow solver for NCC. This is a parallel, unstructured grid code which uses a distributed memory, message passing model for its parallel implementation. The focus of the present effort has been to improve the performance of the NCC flow solver to meet combustor designer requirements for model accuracy and analysis turnaround time. Improving the performance of this code contributes significantly to the overall reduction in time and cost of the combustor design cycle. This paper describes the parallel implementation of the NCC flow solver and summarizes its current parallel performance on an SGI Origin 2000. Earlier parallel performance results on an IBM SP-2 are also included. The performance improvements which have enabled a turnaround of less than 15 hours for a 1.3 million element fully reacting combustion simulation are described.

  13. Parallelization of a Compositional Reservoir Simulator

    NASA Astrophysics Data System (ADS)

    Reme, Hilde; Åge Øye, Geir; Espedal, Magne S.; Fladmark, Gunnar E.

    A finite volume discretization has been used to solve compositional flow in porous media. Secondary migration in fractured rocks has been the main motivation for the work. Multipoint flux approximation has been implemented and adaptive local grid refinement, based on domain decomposition, is used at fractures and faults. The parallelization method, which is described in this paper, strongly promotes code reuse and gives a very high level of parallelization despite low implementation costs. The programming framework is also portable to other platforms or other applications. We have presented computer experiments to examine the parallel efficiency of the implemented parallel simulator with respect to scalability and speedup. Keywords: porous media, multipoint flux approximation, domain decomposition, parallelization

  14. Genetic Parallel Programming: design and implementation.

    PubMed

    Cheang, Sin Man; Leung, Kwong Sak; Lee, Kin Hong

    2006-01-01

    This paper presents a novel Genetic Parallel Programming (GPP) paradigm for evolving parallel programs running on a Multi-Arithmetic-Logic-Unit (Multi-ALU) Processor (MAP). The MAP is a Multiple Instruction-streams, Multiple Data-streams (MIMD), general-purpose register machine that can be implemented on modern Very Large-Scale Integrated Circuits (VLSIs) in order to evaluate genetic programs at high speed. For human programmers, writing parallel programs is more difficult than writing sequential programs. However, experimental results show that GPP evolves parallel programs with less computational effort than that of their sequential counterparts. It creates a new approach to evolving a feasible problem solution in parallel program form and then serializes it into a sequential program if required. The effectiveness and efficiency of GPP are investigated using a suite of 14 well-studied benchmark problems. Experimental results show that GPP speeds up evolution substantially.

  15. Parallel hypergraph partitioning for scientific computing.

    SciTech Connect

    Heaphy, Robert; Devine, Karen Dragon; Catalyurek, Umit; Bisseling, Robert; Hendrickson, Bruce Alan; Boman, Erik Gunnar

    2005-07-01

    Graph partitioning is often used for load balancing in parallel computing, but it is known that hypergraph partitioning has several advantages. First, hypergraphs more accurately model communication volume, and second, they are more expressive and can better represent nonsymmetric problems. Hypergraph partitioning is particularly suited to parallel sparse matrix-vector multiplication, a common kernel in scientific computing. We present a parallel software package for hypergraph (and sparse matrix) partitioning developed at Sandia National Labs. The algorithm is a variation on multilevel partitioning. Our parallel implementation is novel in that it uses a two-dimensional data distribution among processors. We present empirical results that show our parallel implementation achieves good speedup on several large problems (up to 33 million nonzeros) with up to 64 processors on a Linux cluster.
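
    To make the communication-volume claim concrete, the sketch below evaluates the standard connectivity-minus-one metric for a given partition: each hyperedge (net) contributes the number of parts it spans minus one. It only scores a partition and does not perform the multilevel partitioning described above. The function name and the sample nets are hypothetical.

```python
def communication_volume(nets, part_of):
    """Sum over non-empty nets of (number of parts the net spans - 1).

    nets    : iterable of vertex lists, one list per hyperedge
    part_of : mapping from vertex to the part (processor) that owns it
    """
    return sum(len({part_of[v] for v in net}) - 1 for net in nets if net)


# Illustrative hypergraph: three nets over four vertices, split into two parts.
nets = [[0, 1, 2], [2, 3], [0, 3]]
part_of = {0: 0, 1: 0, 2: 1, 3: 1}
volume = communication_volume(nets, part_of)
```

    A net that lies entirely inside one part contributes nothing, which is why this metric tracks communication more closely than counting cut edges in a graph.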

  16. Broadcasting a message in a parallel computer

    DOEpatents

    Berg, Jeremy E.; Faraj, Ahmad A.

    2011-08-02

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point to point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
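
    A minimal sketch of the idea follows, under stated assumptions: the plane is a rows-by-cols mesh without wraparound links, the logical root sits at corner (0, 0), and the Hamiltonian path is a simple snake (boustrophedon) ordering. The record does not say how the path is established, so these choices and the function names are illustrative only.

```python
def snake_path(rows, cols):
    """Hamiltonian path over a rows x cols mesh, alternating row direction."""
    path = []
    for r in range(rows):
        columns = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in columns)
    return path


def broadcast_along_path(rows, cols, message):
    """Forward the root's message hop by hop along the snake path."""
    path = snake_path(rows, cols)
    received = {path[0]: message}  # the root at (0, 0) starts with the message
    for sender, receiver in zip(path, path[1:]):
        # Consecutive path nodes are mesh neighbours (Manhattan distance 1),
        # so every hop is a point-to-point transfer between adjacent nodes.
        assert abs(sender[0] - receiver[0]) + abs(sender[1] - receiver[1]) == 1
        received[receiver] = received[sender]
    return received


delivered = broadcast_along_path(3, 4, "payload")
```

    Every node appears exactly once on the path, so each node receives the message once and forwards it at most once.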

  17. Sequential bioequivalence approaches for parallel designs.

    PubMed

    Fuglsang, Anders

    2014-05-01

    Regulators in EU, USA and Canada allow the use of two-stage approaches for evaluation of bioequivalence. The purpose of this paper is to evaluate such designs for parallel groups using trial simulations. The methods developed by Diane Potvin and co-workers were adapted to parallel designs. Trials were simulated and evaluated on basis of either equal or unequal variances between treatment groups. Methods B and C of Potvin et al., when adapted for parallel designs, protected well against type I error rate inflation under all of the simulated scenarios. Performance characteristics of the new parallel design methods showed little dependence on the assumption of equality of the test and reference variances. This is the first paper to describe the performance of two-stage approaches for parallel designs used to evaluate bioequivalence. The results may prove useful to sponsors developing formulations where crossover designs for bioequivalence evaluation are undesirable.

  18. Conservation of writhe helicity under anti-parallel reconnection

    NASA Astrophysics Data System (ADS)

    Laing, Christian E.; Ricca, Renzo L.; Sumners, De Witt L.

    2015-03-01

    Reconnection is a fundamental event in many areas of science, from the interaction of vortices in classical and quantum fluids, and magnetic flux tubes in magnetohydrodynamics and plasma physics, to the recombination in polymer physics and DNA biology. By using fundamental results in topological fluid mechanics, the helicity of a flux tube can be calculated in terms of writhe and twist contributions. Here we show that the writhe is conserved under anti-parallel reconnection. Hence, for a pair of interacting flux tubes of equal flux, if the twist of the reconnected tube is the sum of the original twists of the interacting tubes, then helicity is conserved during reconnection. Thus, any deviation from helicity conservation is entirely due to the intrinsic twist inserted or deleted locally at the reconnection site. This result has important implications for helicity and energy considerations in various physical contexts.

  19. Integrated Optoelectronics for Parallel Microbioanalysis

    NASA Technical Reports Server (NTRS)

    Stirbl, Robert; Moynihan, Philip; Bearman, Gregory; Lane, Arthur

    2003-01-01

    Miniature, relatively inexpensive microbioanalytical systems ("laboratory-on-a-chip" devices) have been proposed for the detection of hazardous microbes and toxic chemicals. Each system of this type would include optoelectronic sensors and sensor-output-processing circuitry that would simultaneously look for the optical change, fluorescence, delayed fluorescence, or phosphorescence signatures from multiple redundant sites that have interacted with the test biomolecules in order to detect which one(s) was present in a given situation. These systems could be used in a variety of settings that could include doctors' offices, hospitals, hazardous-material laboratories, biological-research laboratories, military operations, and chemical-processing plants.

  20. Blade-mounted trailing edge flap control for BVI noise reduction

    NASA Technical Reports Server (NTRS)

    Hassan, A. A.; Charles, B. D.; Tadghighi, H.; Sankar, L. N.

    1992-01-01

    Numerical procedures based on the 2-D and 3-D full potential equations and the 2-D Navier-Stokes equations were developed to study the effects of leading and trailing edge flap motions on the aerodynamics of parallel airfoil-vortex interactions and on the aerodynamics and acoustics of the more general self-generated rotor blade vortex interactions (BVI). For subcritical interactions, the 2-D results indicate that the trailing edge flap can be used to alleviate the impulsive loads experienced by the airfoil. For supercritical interactions, the results show the necessity of using a leading edge flap, rather than a trailing edge flap, to alleviate the interaction. Results for various time dependent flap motions and their effect on the predicted temporal sectional loads, differential pressures, and the free vortex trajectories are presented. For the OLS model rotor, contours of a BVI noise metric were used to quantify the effects of the trailing edge flap on the size and directivity of the high/low intensity noise region(s). Average reductions in the BVI noise levels on the order of 5 dB with moderate power penalties on the order of 18 pct. for a four bladed rotor and 58 pct. for a two bladed rotor were obtained.

  1. Parallel emergence of negative epistasis across experimental lineages.

    PubMed

    Zee, Peter C; Velicer, Gregory J

    2017-01-27

    Epistatic interactions can greatly impact evolutionary phenomena, particularly the process of adaptation. Here, we leverage four parallel experimentally evolved lineages to study the emergence and trajectories of epistatic interactions in the social bacterium Myxococcus xanthus. A social gene (pilA) necessary for effective group swarming on soft agar had been deleted from the common ancestor of these lineages. During selection for competitiveness at the leading edge of growing colonies, two lineages evolved qualitatively novel mechanisms for greatly increased swarming on soft agar, whereas the other two lineages evolved relatively small increases in swarming. By reintroducing pilA into different genetic backgrounds along the four lineages, we tested whether parallel lineages showed similar patterns of epistasis. In particular, we tested whether a pattern of negative epistasis between accumulating mutations and pilA previously found in the fastest lineage would be found only in the two evolved lineages with the fastest and most striking swarming phenotypes, or rather was due to common epistatic structure across all lineages arising from the generic fixation of adaptive mutations. Our analysis reveals the emergence of negative epistasis across all four independent lineages. Further, we present results showing that the observed negative epistasis is not due exclusively to evolving populations approaching a maximum phenotypic value that inherently limits positive effects of pilA reintroduction, but rather involves direct antagonistic interactions between accumulating mutations and the reintroduced social gene.

  2. Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

    2000-01-01

    The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.

  3. Xyce parallel electronic simulator : users' guide.

    SciTech Connect

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique

  4. Parallelization of the Implicit RPLUS Algorithm

    NASA Technical Reports Server (NTRS)

    Orkwis, Paul D.

    1997-01-01

    The multiblock reacting Navier-Stokes flow solver RPLUS2D was modified for parallel implementation. Results for non-reacting flow calculations of this code indicate parallelization efficiencies greater than 84% are possible for a typical test problem. Results tend to improve as the size of the problem increases. The convergence rate of the scheme is degraded slightly when additional artificial block boundaries are included for the purpose of parallelization. However, this degradation virtually disappears if the solution is converged near to machine zero. Recommendations are made for further code improvements to increase efficiency, correct bugs in the original version, and study decomposition effectiveness.

  5. Parallelization of the Implicit RPLUS Algorithm

    NASA Technical Reports Server (NTRS)

    Orkwis, Paul D.

    1994-01-01

    The multiblock reacting Navier-Stokes flow-solver RPLUS2D was modified for parallel implementation. Results for non-reacting flow calculations of this code indicate parallelization efficiencies greater than 84% are possible for a typical test problem. Results tend to improve as the size of the problem increases. The convergence rate of the scheme is degraded slightly when additional artificial block boundaries are included for the purpose of parallelization. However, this degradation virtually disappears if the solution is converged near to machine zero. Recommendations are made for further code improvements to increase efficiency, correct bugs in the original version, and study decomposition effectiveness.

  6. Knowledge representation into Ada parallel processing

    NASA Technical Reports Server (NTRS)

    Masotto, Tom; Babikyan, Carol; Harper, Richard

    1990-01-01

    The Knowledge Representation into Ada Parallel Processing project is a joint NASA and Air Force funded project to demonstrate the execution of intelligent systems in Ada on the Charles Stark Draper Laboratory fault-tolerant parallel processor (FTPP). Two applications were demonstrated - a portion of the adaptive tactical navigator and a real time controller. Both systems are implemented as Activation Framework Objects on the Activation Framework intelligent scheduling mechanism developed by Worcester Polytechnic Institute. The implementations, results of performance analyses showing speedup due to parallelism and initial efficiency improvements are detailed and further areas for performance improvements are suggested.

  7. Time-parallel multiscale/multiphysics framework

    SciTech Connect

    Frantziskonis, G.; Muralidharan, Krishna; Deymier, Pierre; Simunovic, Srdjan; Nukala, Phani K; Pannala, Sreekanth

    2009-01-01

    We introduce the time-parallel compound wavelet matrix method (tpCWM) for modeling the temporal evolution of multiscale and multiphysics systems. The method couples time parallel (TP) and CWM methods operating at different spatial and temporal scales. We demonstrate the efficiency of our approach on two examples: a chemical reaction kinetic system and a non-linear predator prey system. Our results indicate that the tpCWM technique is capable of accelerating time-to-solution by 2-3 orders of magnitude and is amenable to efficient parallel implementation.

  8. Language constructs for modular parallel programs

    SciTech Connect

    Foster, I.

    1996-03-01

    We describe programming language constructs that facilitate the application of modular design techniques in parallel programming. These constructs allow us to isolate resource management and processor scheduling decisions from the specification of individual modules, which can themselves encapsulate design decisions concerned with concurrency, communication, process mapping, and data distribution. This approach permits development of libraries of reusable parallel program components and the reuse of these components in different contexts. In particular, alternative mapping strategies can be explored without modifying other aspects of program logic. We describe how these constructs are incorporated in two practical parallel programming languages, PCN and Fortran M. Compilers have been developed for both languages, allowing experimentation in substantial applications.

  9. Distributed parallel messaging for multiprocessor systems

    SciTech Connect

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burrow, Burhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  10. Parallel path aspects of transmission modeling

    SciTech Connect

    Kavicky, J.A.; Shahidehpour, S.M.

    1996-11-01

    This paper examines the present methods and modeling techniques available to address the effects of parallel flows resulting from various firm and short-term energy transactions. A survey of significant methodologies is conducted to determine the present status of parallel flow transaction modeling. The strengths and weaknesses of these approaches are identified to suggest areas of further modeling improvements. The motivating force behind this research is to improve transfer capability assessment accuracy by suggesting a real-time modeling environment that adequately represents the influences of parallel flows while recognizing operational constraints and objectives.

  11. Fast combinatorial optimization with parallel digital computers.

    PubMed

    Kakeya, H; Okabe, Y

    2000-01-01

    This paper presents an algorithm that realizes fast search for solutions of combinatorial optimization problems on parallel digital computers. With the standard weight matrices designed for combinatorial optimization, many iterations are required before convergence to a quasi-optimal solution, even when many digital processors can be used in parallel. By removing the components of the eigenvectors with eminent negative eigenvalues of the weight matrix, the proposed algorithm avoids oscillation and realizes energy reduction under synchronous discrete dynamics, which enables parallel digital computers to obtain quasi-optimal solutions in much less time than the conventional algorithm.

  12. Heterogeneous parallel programming capability. Final report

    SciTech Connect

    Flower, J.W.; Kolawa, A.

    1990-11-30

    In creating a heterogeneous parallel processing capability we are really trying to approach three basic problems with current systems: (1) Supercomputer and parallel computer hardware architectures vary widely but need to support one or two fairly standard programming languages and programming models. A particularly important issue concerns the short life cycle of individual hardware designs; (2) Many algorithms require capabilities beyond the reach of single supercomputers but could be approached by several machines working together; and (3) Performing a given task requires integration of a system that may contain many components in addition to the super or parallel computer itself. Peripherals from many different manufacturers must be incorporated.

  13. A flight investigation of blade section aerodynamics for a helicopter main rotor having NLR-1T airfoil sections

    NASA Technical Reports Server (NTRS)

    Morris, C. E. K., Jr.; Stevens, D. D.; Tomaine, R. L.

    1980-01-01

    A flight investigation was conducted using a teetering-rotor AH-1G helicopter to obtain data on the aerodynamic behavior of main-rotor blades with the NLR-1T blade section. The data system recorded blade-section aerodynamic pressures at 90 percent rotor radius as well as vehicle flight state, performance, and loads. The test envelope included hover, forward flight, and collective-fixed maneuvers. Data were obtained on apparent blade-vortex interactions, negative lift on the advancing blade in high-speed flight and wake interactions in hover. In many cases, good agreement was achieved between chordwise pressure distributions predicted by airfoil theory and flight data with no apparent indications of blade-vortex interactions.

  14. Runtime system library for parallel finite difference models with nesting

    SciTech Connect

    Michalakes, J.

    1997-03-01

    RSL is a parallel run-time system library for implementing regular-grid models with nesting on distributed memory parallel computers. RSL provides support for automatically decomposing multiple model domains and for redistributing work between processors at run time for dynamic load balancing. A unique feature of RSL is that processor subdomains need not be rectangular patches; rather, grid points are independently allocated to processors, allowing more precisely balanced allocation of work to processors. Communication mechanisms are tailored to the application: RSL provides an efficient high-level stencil exchange operation for updating subdomain ghost areas and interdomain communication to support two-way interaction between nest levels. RSL also provides run-time support for local iteration over subdomains, global-local index translation, and distributed I/O from ordinary Fortran record-blocked data sets. The interface to RSL supports Fortran77 and Fortran90. RSL has been used to parallelize the NCAR/Penn State Mesoscale Model (MM5).
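
    The ghost-area update described above can be illustrated with a minimal single-process sketch. It is not RSL's interface: the names `decompose`, `exchange_ghosts`, `smooth_step`, and `gather` are hypothetical, the grid is 1-D with an assumed three-point averaging stencil, and subdomains are contiguous blocks rather than the independently allocated grid points that RSL supports.

```python
def decompose(values, nparts):
    """Split a 1-D grid into contiguous subdomains with one ghost cell per side."""
    if not 0 < nparts <= len(values):
        raise ValueError("need at least one interior point per subdomain")
    base, extra = divmod(len(values), nparts)
    parts, start = [], 0
    for p in range(nparts):
        size = base + (1 if p < extra else 0)
        # Layout of each subdomain: [left ghost, interior..., right ghost]
        parts.append([0.0] + list(values[start:start + size]) + [0.0])
        start += size
    return parts


def exchange_ghosts(parts, left_bc, right_bc):
    """Copy neighbouring interior edge values into each subdomain's ghost cells."""
    last = len(parts) - 1
    for p, sub in enumerate(parts):
        sub[0] = parts[p - 1][-2] if p > 0 else left_bc
        sub[-1] = parts[p + 1][1] if p < last else right_bc


def smooth_step(parts):
    """Apply a three-point average to the interior points of every subdomain."""
    new_parts = []
    for sub in parts:
        new = sub[:]
        for i in range(1, len(sub) - 1):
            new[i] = (sub[i - 1] + sub[i] + sub[i + 1]) / 3.0
        new_parts.append(new)
    return new_parts


def gather(parts):
    """Concatenate the interior points back into one global grid."""
    return [v for sub in parts for v in sub[1:-1]]
```

    Calling `exchange_ghosts` before each `smooth_step` makes the decomposed update match a serial sweep over the whole grid; in a distributed-memory setting the copies inside `exchange_ghosts` would become messages between processors.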

  15. Airbreathing Propulsion System Analysis Using Multithreaded Parallel Processing

    NASA Technical Reports Server (NTRS)

    Schunk, Richard Gregory; Chung, T. J.; Rodriguez, Pete (Technical Monitor)

    2000-01-01

    In this paper, parallel processing is used to analyze the mixing and combustion behavior of hypersonic flow. Preliminary work for a sonic transverse hydrogen jet injected from a slot into a Mach 4 airstream in a two-dimensional duct combustor has been completed [Moon and Chung, 1996]. Our aim is to extend this work to a three-dimensional domain using multithreaded domain decomposition parallel processing based on the flowfield-dependent variation theory. Numerical simulations of chemically reacting flows are difficult because of the strong interactions between the turbulent hydrodynamic and chemical processes. The algorithm must provide an accurate representation of the flowfield, since unphysical flowfield calculations will lead to the faulty loss or creation of species mass fraction, or even premature ignition, which in turn alters the flowfield information. Another difficulty arises from the disparity in time scales between the flowfield and chemical reactions, which may require the use of finite rate chemistry. The situation is more complex when there is a disparity in the length scales involved in turbulence. In order to cope with these complicated physical phenomena, it is our plan to utilize the flowfield-dependent variation theory mentioned above, facilitated by large eddy simulation. Undoubtedly, the proposed computation requires the most sophisticated computational strategies. The multithreaded domain decomposition parallel processing will be necessary in order to reduce both computational time and storage. Without special treatments involved in computer engineering, our attempt to analyze the airbreathing combustion appears to be difficult, if not impossible.

  16. Parallel-vector computation for CSI-design code

    NASA Technical Reports Server (NTRS)

    Nguyen, Duc T.

    1990-01-01

    Computational aspects of the Control-Structure Interaction (CSI) DESIGN code are reviewed. Numerically intensive portions of the CSI-DESIGN code were identified. Improvements in computational speed for the CSI-DESIGN code can be achieved by exploiting parallel and vector capabilities offered by modern computers, such as the Alliant, Convex, Cray-2, and Cray-YMP. Four options to generate the coefficient stiffness matrix and to solve the system of linear, simultaneous equations are currently available in the CSI-DESIGN code. A preprocessor that uses the RCM (Reverse Cuthill-McKee) algorithm for bandwidth minimization was also developed for the CSI-DESIGN code. Preliminary results obtained by solving a small-scale, 97-node CSI finite element model (for eigensolution) have indicated that the new CSI-DESIGN code is 5 to 6 times faster (using 1 Alliant processor) than the old version. This speed-up was achieved due to the RCM algorithm and the use of a new skyline solver. Efforts are underway to further improve the vector speed of the CSI-DESIGN code, to evaluate its performance on a larger-scale CSI model (such as the phase zero CSI model), to make the code run efficiently in a multiprocessor, parallel computer environment, and to make the code portable among the different parallel computers available at NASA LaRC, such as Alliant, Convex, and Cray computers.
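
    The bandwidth-minimization step can be sketched as follows. This is a textbook Reverse Cuthill-McKee ordering in plain Python, not the CSI-DESIGN preprocessor; the function names and the adjacency-dictionary input are assumptions, and the start node of each component is simply a minimum-degree node rather than a pseudo-peripheral one.

```python
from collections import deque


def reverse_cuthill_mckee(adjacency):
    """Return an RCM node ordering; adjacency maps node -> list of neighbours."""
    degree = {v: len(set(nbrs)) for v, nbrs in adjacency.items()}
    visited = set()
    order = []
    # Handle each connected component, starting from a minimum-degree node.
    for start in sorted(adjacency, key=lambda v: (degree[v], v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Breadth-first search, visiting new neighbours by increasing degree.
            new = sorted(set(adjacency[v]) - visited, key=lambda u: (degree[u], u))
            for w in new:
                visited.add(w)
                queue.append(w)
    order.reverse()  # the "reverse" in Reverse Cuthill-McKee
    return order


def bandwidth(adjacency, order):
    """Largest |i - j| over all edges when nodes are numbered by `order`."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[v] - pos[w]) for v in adjacency for w in adjacency[v]),
               default=0)


# A 7-node chain whose node labels are deliberately scrambled.
chain = {0: [4], 4: [0, 1], 1: [4, 5], 5: [1, 2], 2: [5, 6], 6: [2, 3], 3: [6]}
natural = sorted(chain)
rcm = reverse_cuthill_mckee(chain)
```

    For this chain the natural numbering has bandwidth 4 and the RCM ordering has bandwidth 1. A smaller bandwidth shortens the column heights that a skyline solver has to store and factor.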

  17. Fast electrostatic force calculation on parallel computer clusters

    SciTech Connect

    Kia, Amirali; Kim, Daejoong; Darve, Eric

    2008-10-01

    The fast multipole method (FMM) and smooth particle mesh Ewald (SPME) are well-known fast algorithms for evaluating long-range electrostatic interactions in molecular dynamics and other fields. FMM is a multi-scale method that reduces the computational cost by approximating the potential due to a group of particles at a large distance using a few multipole functions. This algorithm scales as O(N) for N particles. The SPME algorithm is an O(N ln N) method based on interpolating the Fourier-space part of the Ewald sum and evaluating the resulting convolutions using the fast Fourier transform (FFT). These algorithms suffer from relatively poor efficiency on large parallel machines, especially for mid-size problems of around hundreds of thousands of atoms. A variation of the FMM, called PWA, based on plane wave expansions is presented in this paper. A new parallelization strategy for PWA, which takes advantage of the specific form of this expansion, is described. Its parallel efficiency is compared with that of SPME through detailed timing measurements on two different computer clusters.

  18. Social Problems and Deviance: Some Parallel Issues

    ERIC Educational Resources Information Center

    Kitsuse, John I.; Spector, Malcolm

    1975-01-01

    Explores parallel developments in labeling theory and in the value conflict approach to social problems. Similarities in their critiques of functionalism and etiological theory as well as their emphasis on the definitional process are noted. (Author)

  19. Data parallel sorting for particle simulation

    NASA Technical Reports Server (NTRS)

    Dagum, Leonardo

    1992-01-01

    Sorting on a parallel architecture is a communication-intensive operation that can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
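
    The sequential O(N) integer sort mentioned above can be illustrated with a counting sort keyed on each particle's cell index. This sketch shows only that sequential baseline, not the paper's data parallel merging algorithm; the function name and the assumption that every particle carries an integer cell index in `range(num_cells)` are illustrative.

```python
def counting_sort_by_cell(cell_of_particle, num_cells):
    """Return particle indices ordered by cell index in O(N + num_cells) time."""
    # counts[c + 1] first holds the number of particles in cell c.
    counts = [0] * (num_cells + 1)
    for c in cell_of_particle:
        counts[c + 1] += 1
    # Prefix sum: counts[c] becomes the first output slot for cell c.
    for c in range(num_cells):
        counts[c + 1] += counts[c]
    next_slot = counts[:-1]  # slicing copies, so counts is left unchanged
    order = [0] * len(cell_of_particle)
    # Stable scatter: particles keep their relative order within a cell.
    for particle, c in enumerate(cell_of_particle):
        order[next_slot[c]] = particle
        next_slot[c] += 1
    return order


cells = [2, 0, 1, 2, 0, 1, 1]
order = counting_sort_by_cell(cells, 3)
# order == [1, 4, 2, 5, 6, 0, 3]
```

    No key comparisons are made, which is why the cost is linear in the number of particles when the number of cells does not exceed it.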

  20. Runtime support for parallelizing data mining algorithms

    NASA Astrophysics Data System (ADS)

    Jin, Ruoming; Agrawal, Gagan

    2002-03-01

    With recent technological advances, shared memory parallel machines have become more scalable, and offer large main memories and high bus bandwidths. They are emerging as good platforms for data warehousing and data mining. In this paper, we focus on shared memory parallelization of data mining algorithms. We have developed a series of techniques for parallelization of data mining algorithms, including full replication, full locking, fixed locking, optimized full locking, and cache-sensitive locking. Unlike previous work on shared memory parallelization of specific data mining algorithms, all of our techniques apply to a large number of common data mining algorithms. In addition, we propose a reduction-object based interface for specifying a data mining algorithm. We show how our runtime system can apply any of the techniques we have developed, starting from a common specification of the algorithm.
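
    Of the techniques listed, full replication is the simplest to sketch: each thread updates a private copy of the reduction object without locking, and the copies are merged at the end. The example below is an assumed illustration with item-frequency counting as the reduction; the function name and the strided work split are hypothetical and do not reproduce the authors' runtime interface.

```python
import threading
from collections import Counter


def count_items_full_replication(transactions, num_threads=4):
    """Count item frequencies with one private reduction object per thread."""
    replicas = [Counter() for _ in range(num_threads)]

    def worker(tid):
        # Thread tid handles every num_threads-th transaction and touches
        # only its own replica, so no locks are needed.
        for transaction in transactions[tid::num_threads]:
            replicas[tid].update(transaction)

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Merge phase: combine the private copies into the final result.
    total = Counter()
    for replica in replicas:
        total.update(replica)
    return total


baskets = [["a", "b"], ["b", "c"], ["a", "b", "c"], ["c"]]
totals = count_items_full_replication(baskets, num_threads=2)
```

    The sketch shows the structure of the technique rather than a speedup: in CPython the global interpreter lock serializes these threads. The trade-off it illustrates is extra memory for the replicas and a merge step in exchange for lock-free updates.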